Chapter 3: Transport Layer
home.cse.ust.hk/.../notes/chapter3_spr05_v4.pdf · Comp 361, Spring 2005

Post on 18-Aug-2020


Comp 361, Spring 2005

Chapter 3: Transport Layer (last revised 16/03/05)

Chapter goals: understand the principles behind transport-layer services
• multiplexing/demultiplexing
• reliable data transfer
• flow control
• congestion control
and their instantiation and implementation in the Internet.

Chapter overview:
• transport-layer services
• multiplexing/demultiplexing
• connectionless transport: UDP
• principles of reliable data transfer
• connection-oriented transport: TCP
    • reliable transfer
    • flow control
    • connection management
• principles of congestion control
• TCP congestion control

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    • segment structure
    • reliable data transfer
    • flow control
    • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Transport services and protocols

• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
    • send side: breaks app messages into segments, passes them to the network layer
    • rcv side: reassembles segments into messages, passes them to the app layer
• more than one transport protocol is available to apps
    • Internet: TCP and UDP

[Figure: full protocol stacks (application, transport, network, data link, physical) in the two end systems and partial stacks (network, data link, physical) in the routers between them, with logical end-end transport between the end systems]

Transport vs. network layer

• network layer: logical communication between hosts
• transport layer: logical communication between processes
    • relies on, and enhances, network-layer services

Household analogy: 12 kids sending letters to 12 kids
• processes = kids
• app messages = letters in envelopes
• hosts = houses
• transport protocol = Ann and Bill
• network-layer protocol = postal service

Transport-layer protocols

Internet transport services:
• reliable, in-order unicast delivery (TCP)
    • congestion control
    • flow control
    • connection setup
• unreliable ("best-effort"), unordered unicast or multicast delivery: UDP
• services not available:
    • real-time
    • bandwidth guarantees
    • reliable multicast


Multiplexing/demultiplexing

Multiplexing at send host: gathering data from multiple sockets, enveloping data with a header (later used for demultiplexing)

Demultiplexing at rcv host: delivering received segments to the correct socket

[Figure: hosts 1, 2 and 3, each with an application/transport/network/link/physical stack; processes P1 to P4 attach to the transport layer through sockets]

Multiplexing/demultiplexing

segment: unit of data exchanged between transport-layer entities
• aka TPDU: transport protocol data unit

Demultiplexing: delivering received segments to the correct app-layer processes

[Figure: application/transport/network stacks at the receiver and at two other hosts, with processes P1 to P4 exchanging messages M; a segment is a segment header (Ht) followed by application-layer data, carried behind a network-layer header (Hn)]

How demultiplexing works

• host receives IP datagrams
    • each datagram has a source IP address and a destination IP address
    • each datagram carries 1 transport-layer segment
    • each segment has a source and a destination port number (recall: well-known port numbers for specific applications)
• host uses IP addresses & port numbers to direct the segment to the appropriate socket

TCP/UDP segment format (each row is 32 bits wide):

    source port #        | dest port #
    other header fields
    application data (message)

Connectionless demultiplexing

Create sockets with port numbers (a port number is 16 bits, so it cannot exceed 65535; the values below are illustrative):

DatagramSocket mySocket1 = new DatagramSocket(9111);
DatagramSocket mySocket2 = new DatagramSocket(9222);

UDP socket identified by two-tuple:

(dest IP address, dest port number)

When host receives a UDP segment, it:
• checks the destination port number in the segment
• directs the UDP segment to the socket with that port number

IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket.

Connectionless demux (cont)

DatagramSocket serverSocket = new DatagramSocket(6428);

[Figure: two clients (IP A and IP B) and a server (IP C) whose socket is bound to port 6428. One client sends a segment with SP 9157, DP 6428 and receives a reply with SP 6428, DP 9157; the other sends SP 5775, DP 6428 and receives SP 6428, DP 5775.]

SP provides the "return address".

Connection-oriented demux

TCP socket identified by 4-tuple:
• source IP address
• source port number
• dest IP address
• dest port number

recv host uses all four values to direct the segment to the appropriate socket

Server host may support many simultaneous TCP sockets:
• each socket identified by its own 4-tuple

Web servers have different sockets for each connecting client
• non-persistent HTTP will have a different socket for each request
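The difference between the UDP two-tuple and the TCP 4-tuple can be sketched as two lookup tables. This is only an illustrative model, not how any particular operating system implements demultiplexing; the class name `Demultiplexer`, its methods, and the socket labels are hypothetical.

```python
class Demultiplexer:
    """Toy model of transport-layer demultiplexing (illustrative names only)."""

    def __init__(self):
        # UDP: a socket is identified by (dest IP, dest port) alone.
        self.udp_sockets = {}
        # TCP: a socket is identified by the full 4-tuple.
        self.tcp_sockets = {}

    def bind_udp(self, dest_ip, dest_port, socket_name):
        self.udp_sockets[(dest_ip, dest_port)] = socket_name

    def add_tcp_connection(self, src_ip, src_port, dest_ip, dest_port, socket_name):
        self.tcp_sockets[(src_ip, src_port, dest_ip, dest_port)] = socket_name

    def deliver_udp(self, src_ip, src_port, dest_ip, dest_port):
        # The source fields are ignored for the lookup; they only serve
        # as the "return address" for a reply.
        return self.udp_sockets.get((dest_ip, dest_port))

    def deliver_tcp(self, src_ip, src_port, dest_ip, dest_port):
        # All four values are needed to pick the socket.
        return self.tcp_sockets.get((src_ip, src_port, dest_ip, dest_port))
```

With one UDP socket bound at (C, 6428), segments from A:9157 and B:5775 reach the same socket. With TCP connections from A:9157 and B:9157 to C:80, the same destination port leads to two different sockets.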

Connection-oriented demux (cont)

[Figure: clients with IP A and IP B and a Web server with IP C. Segments toward the server carry SP 9157, DP 80 and SP 5775, DP 80; the server's replies carry SP 80, DP 9157 and SP 80, DP 5775.]

Connection-oriented demux: Threaded Web Server

[Figure: threaded Web server at IP C. Three segments arrive at port 80:
    • S-IP A, SP 9157, D-IP C, DP 80
    • S-IP B, SP 9157, D-IP C, DP 80
    • S-IP B, SP 5775, D-IP C, DP 80
Each belongs to a different socket because the 4-tuples differ.]


UDP: User Datagram Protocol [RFC 768]

• "no frills," "bare bones" Internet transport protocol
• "best effort" service: UDP segments may be
    • lost
    • delivered out of order to app
• connectionless:
    • no handshaking between UDP sender and receiver
    • each UDP segment handled independently of others

Why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small segment header (8 bytes)
• no congestion control: UDP can blast away as fast as desired

UDP: more

• often used for streaming multimedia apps
    • loss tolerant
    • rate sensitive
• other UDP uses (why?):
    • DNS: small delay
    • SNMP: stressful conditions
• reliable transfer over UDP: add reliability at application layer
    • application-specific error recovery

UDP segment format (each row is 32 bits wide):

    source port #        | dest port #
    length               | checksum
    application data (message)

length: length, in bytes, of the UDP segment, including the header

UDP checksum

Goal: detect "errors" (e.g., flipped bits) in the transmitted segment

Sender:
• treat segment contents as a sequence of 16-bit integers
• checksum: addition (1's complement sum) of segment contents
• sender puts checksum value into UDP checksum field

Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
    • NO: error detected
    • YES: no error detected. But maybe errors nonetheless? More later.

Receiver may choose to discard the segment or to send a warning to the app in case of error.
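The sender and receiver steps can be sketched in a few lines. This is a simplified illustration: it sums only the bytes it is given, whereas RFC 768 also covers a pseudo-header and stores the 1's complement of the 1's complement sum in the checksum field (the complement step is included here). The function names are illustrative.

```python
def ones_complement_sum(data: bytes) -> int:
    """Add the data as 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # wrap the carry back in
    return total


def checksum(data: bytes) -> int:
    """Value the sender places in the checksum field."""
    return ~ones_complement_sum(data) & 0xFFFF


def passes_check(data: bytes, checksum_field: int) -> bool:
    """Receiver side: recompute and compare with the received field."""
    return checksum(data) == checksum_field
```

For the bytes `00 01 f2 03 f4 f5 f6 f7` the sum is `0xddf2` and the checksum is `0x220d`. Flipping one bit changes the result, but swapping two 16-bit words does not, which is one way "no error detected" can still hide an error.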


Principles of reliable data transfer

• important in app., transport, link layers
• top-10 list of important networking topics!
• characteristics of the unreliable channel will determine the complexity of the reliable data transfer protocol (rdt)

Reliable data transfer: getting started

send side:
• rdt_send(): called from above (e.g., by app). Passed data to deliver to the receiver's upper layer
• udt_send(): called by rdt to transfer a packet over the unreliable channel to the receiver

receive side:
• rdt_rcv(): called when a packet arrives on the rcv-side of the channel
• deliver_data(): called by rdt to deliver data to the upper layer

Reliable data transfer: getting started

We'll:
• incrementally develop sender, receiver sides of the reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
    • but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

FSM notation: an arrow from state 1 to state 2 is labeled with the event causing the state transition and, below it, the actions taken on the state transition (event / actions).

state: when in this "state," the next state is uniquely determined by the next event

Incremental improvements

• rdt1.0: assumes every packet sent arrives, and no errors are introduced in transmission
• rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces the concept of ACK and NAK
• rdt2.1: deals with corrupted ACKs/NAKs
• rdt2.2: like rdt2.1 but does not need NAKs
• rdt3.0: allows packets to be lost

Rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
    • no bit errors
    • no loss of packets
• separate FSMs for sender, receiver:
    • sender sends data into underlying channel
    • receiver reads data from underlying channel

sender (single state "Wait for call from above"):
    rdt_send(data)
        packet = make_pkt(data)
        udt_send(packet)

receiver (single state "Wait for call from below"):
    rdt_rcv(packet)
        extract(packet, data)
        deliver_data(data)

Rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
    • recall: UDP checksum to detect bit errors
• the question: how to recover from errors?
    • acknowledgements (ACKs): receiver explicitly tells sender that pkt was received OK
    • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
    • sender retransmits pkt on receipt of NAK
    • human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
    • error detection
    • receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification

sender:
    state "Wait for call from above":
        rdt_send(data)
            sndpkt = make_pkt(data, checksum)
            udt_send(sndpkt)
            -> "Wait for ACK or NAK"
    state "Wait for ACK or NAK":
        rdt_rcv(rcvpkt) && isNAK(rcvpkt)
            udt_send(sndpkt)
            -> "Wait for ACK or NAK"
        rdt_rcv(rcvpkt) && isACK(rcvpkt)
            Λ (no action)
            -> "Wait for call from above"

receiver:
    state "Wait for call from below":
        rdt_rcv(rcvpkt) && corrupt(rcvpkt)
            udt_send(NAK)
        rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
            extract(rcvpkt, data)
            deliver_data(data)
            udt_send(ACK)

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs with the error-free path highlighted. The sender builds and sends sndpkt on rdt_send(data); the receiver sees rdt_rcv(rcvpkt) && notcorrupt(rcvpkt), delivers the data and sends ACK; the sender sees isACK(rcvpkt) and returns to "Wait for call from above".]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs with the error path highlighted. The receiver sees rdt_rcv(rcvpkt) && corrupt(rcvpkt) and sends NAK; the sender sees isNAK(rcvpkt), calls udt_send(sndpkt) again and stays in "Wait for ACK or NAK".]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK is corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver is waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if the sender's ACK/NAK is corrupted?
• retransmit, but this might cause retransmission of a correctly received pkt. Receiver won't know about the duplication!

Handling duplicates:
• sender adds sequence number (0, 1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt
• a duplicate packet is one with the same sequence # as the previous packet

stop and wait: sender sends one packet, then waits for receiver response

Sender: whenever the sender receives a control message, it sends a packet to the receiver
• a valid ACK: sends the next packet (if one exists) with a new sequence #
• a NAK or corrupt response: resends the old packet

Receiver: sends ACK/NAK to sender
• if the received packet is corrupt: send NAK
• if the received packet is valid and has a different sequence # from the previous packet: send ACK and deliver the new data up
• if the received packet is valid and has the same sequence # as the previous packet, i.e., it is a retransmission (a duplicate): send ACK

Note: ACK/NAK do not contain a sequence #

rdt2.1: sender, handles garbled ACK/NAKs

state "Wait for call 0 from above":
    rdt_send(data)
        sndpkt = make_pkt(0, data, checksum)
        udt_send(sndpkt)
        -> "Wait for ACK or NAK 0"
state "Wait for ACK or NAK 0":
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
        Λ
        -> "Wait for call 1 from above"
state "Wait for call 1 from above":
    rdt_send(data)
        sndpkt = make_pkt(1, data, checksum)
        udt_send(sndpkt)
        -> "Wait for ACK or NAK 1"
state "Wait for ACK or NAK 1":
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
        Λ
        -> "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

state "Wait for 0 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
        extract(rcvpkt, data)
        deliver_data(data)
        sndpkt = make_pkt(ACK, chksum)
        udt_send(sndpkt)
        -> "Wait for 1 from below"
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        sndpkt = make_pkt(NAK, chksum)
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
        sndpkt = make_pkt(ACK, chksum)
        udt_send(sndpkt)
state "Wait for 1 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
        extract(rcvpkt, data)
        deliver_data(data)
        sndpkt = make_pkt(ACK, chksum)
        udt_send(sndpkt)
        -> "Wait for 0 from below"
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        sndpkt = make_pkt(NAK, chksum)
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
        sndpkt = make_pkt(ACK, chksum)
        udt_send(sndpkt)

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0, 1) will suffice. Why?
• must check if received ACK/NAK is corrupted
• twice as many states
    • state must "remember" whether the "current" pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is a duplicate
    • state indicates whether 0 or 1 is the expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at the sender

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for the last pkt received OK
    • receiver must explicitly include the seq # of the pkt being ACKed (in 2.1, seq #s are included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in the same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment:
    state "Wait for call 0 from above":
        rdt_send(data)
            sndpkt = make_pkt(0, data, checksum)
            udt_send(sndpkt)
            -> "Wait for ACK 0"
    state "Wait for ACK 0":
        rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
            udt_send(sndpkt)
        rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
            Λ

receiver FSM fragment:
    transition into "Wait for 0 from below":
        rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
            extract(rcvpkt, data)
            deliver_data(data)
            sndpkt = make_pkt(ACK1, chksum)
            udt_send(sndpkt)
    state "Wait for 0 from below":
        rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
            udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: the underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until it is certain that data or ACK was lost, then retransmits
• yuck: drawbacks?

Approach: sender waits a "reasonable" amount of time for the ACK
• retransmits if no ACK is received in this time (retransmissions are only triggered by timeouts)
• if pkt (or ACK) is just delayed (not lost):
    • retransmission will be a duplicate, but use of seq. #'s already handles this
    • receiver must specify seq # of pkt being ACKed
• requires countdown timer

rdt3.0 sender

state "Wait for call 0 from above":
    rdt_send(data)
        sndpkt = make_pkt(0, data, checksum)
        udt_send(sndpkt)
        start_timer
        -> "Wait for ACK0"
    rdt_rcv(rcvpkt)
        Λ
state "Wait for ACK0":
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
        Λ
    timeout
        udt_send(sndpkt)
        start_timer
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
        stop_timer
        -> "Wait for call 1 from above"
state "Wait for call 1 from above":
    rdt_send(data)
        sndpkt = make_pkt(1, data, checksum)
        udt_send(sndpkt)
        start_timer
        -> "Wait for ACK1"
    rdt_rcv(rcvpkt)
        Λ
state "Wait for ACK1":
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
        Λ
    timeout
        udt_send(sndpkt)
        start_timer
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
        stop_timer
        -> "Wait for call 0 from above"
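The timeout-and-retransmit behaviour of rdt3.0 can be tried out with a small deterministic simulation. This is a sketch under simplifying assumptions, not the full FSM: packets are only lost (never corrupted or delayed), every loss leads to exactly one timeout, and the function name and the `drop_data` / `drop_ack` parameters are invented for the illustration.

```python
def run_stop_and_wait(messages, drop_data=(), drop_ack=()):
    """Simulate an rdt3.0-style alternating-bit transfer over a lossy channel.

    drop_data / drop_ack contain the 0-based indices of the data / ACK
    transmissions that the channel loses. Returns the list of messages
    delivered to the receiver's upper layer and the number of data
    transmissions the sender needed.
    """
    delivered = []
    expected_seq = 0      # receiver state: seq # it is waiting for
    seq = 0               # sender state: seq # of the current packet
    data_tx = 0           # data packets put on the channel so far
    ack_tx = 0            # ACKs put on the channel so far

    for msg in messages:
        while True:
            # Sender: udt_send(sndpkt), start_timer.
            data_lost = data_tx in drop_data
            data_tx += 1
            if data_lost:
                continue              # timeout: retransmit the same packet

            # Receiver: deliver only if this is the expected seq #;
            # a duplicate is not delivered but is still ACKed.
            if seq == expected_seq:
                delivered.append(msg)
                expected_seq ^= 1

            ack_lost = ack_tx in drop_ack
            ack_tx += 1
            if ack_lost:
                continue              # timeout: retransmit the same packet
            break                     # ACK for current seq #: stop_timer

        seq ^= 1                      # alternate 0, 1, 0, 1, ...
    return delivered, data_tx
```

Losing the first ACK and the second data transmission, for example, costs two extra transmissions of the first message, yet each message is still delivered exactly once and in order.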

rdt3.0 in action

[Figures: sender/receiver timing diagrams, not recoverable from the transcript]

Performance of rdt3.0

• rdt3.0 works, but performance stinks
• example: 1 Gbps link, 15 ms e-e prop. delay, 1 KB packet:

    T_transmit = L / R = (8 kb/pkt) / (10^9 b/sec) = 8 microsec

  where L = packet length in bits and R = transmission rate in bps

    U_sender = (L / R) / (RTT + L / R) = 0.008 / 30.008 = 0.00027

• U_sender: utilization, the fraction of time the sender is busy sending
• 1 KB pkt every 30 msec -> 33 kB/sec thruput over a 1 Gbps link
• network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

Timeline between sender and receiver:
• t = 0: first packet bit transmitted
• t = L / R: last packet bit transmitted
• first packet bit arrives at the receiver
• last packet bit arrives at the receiver, which sends the ACK
• t = RTT + L / R: ACK arrives, sender sends the next packet

    U_sender = (L / R) / (RTT + L / R) = 0.008 / 30.008 = 0.00027
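The utilization figure can be reproduced with a few lines of arithmetic. The helper below is only an illustration; the function name and the optional `window` parameter (number of packets sent back-to-back per round trip, 1 for stop-and-wait) are assumptions added for the example.

```python
def sender_utilization(packet_bits, rate_bps, rtt_seconds, window=1):
    """Fraction of time the sender is busy: window * (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps           # L / R
    utilization = window * t_transmit / (rtt_seconds + t_transmit)
    return min(1.0, utilization)                  # a link cannot be more than 100% busy
```

With L = 8000 bits, R = 10^9 bps and RTT = 30 ms, the stop-and-wait result rounds to 0.00027; keeping three packets in flight gives about 0.0008, three times as much.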

Pipelined protocols

Pipelining: sender allows multiple, "in-flight," yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

Timeline between sender and receiver (3 packets in flight):
• t = 0: first packet bit transmitted
• t = L / R: last bit of the first packet transmitted
• first packet bit arrives at the receiver
• last bit of the first packet arrives, receiver sends ACK
• last bit of the 2nd packet arrives, receiver sends ACK
• last bit of the 3rd packet arrives, receiver sends ACK
• t = RTT + L / R: first ACK arrives, sender sends the next packet

    U_sender = (3 * L / R) / (RTT + L / R) = 0.024 / 30.008 = 0.0008

Utilization increases by a factor of 3.

Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to, and including, seq # n ("cumulative ACK")
    • may receive duplicate ACKs (see receiver)
• only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• called a sliding-window protocol

GBN: Sender

• rdt_send() called: checks to see if the window is full
    • No: send out packet
    • Yes: return data to application level
• Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer
• Timeout: resends ALL packets that have been sent but not yet acknowledged
    • this is the only event that triggers a resend

GBN: sender extended FSM

initially:
    base = 1
    nextseqnum = 1

state "Wait":
    rdt_send(data)
        if (nextseqnum < base+N) {
            sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
            udt_send(sndpkt[nextseqnum])
            if (base == nextseqnum)
                start_timer
            nextseqnum++
        }
        else
            refuse_data(data)
    timeout
        start_timer
        udt_send(sndpkt[base])
        udt_send(sndpkt[base+1])
        ...
        udt_send(sndpkt[nextseqnum-1])
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
        base = getacknum(rcvpkt)+1
        if (base == nextseqnum)
            stop_timer
        else
            start_timer
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        Λ
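The extended FSM translates almost line for line into a small class. The sketch below is an illustration under stated assumptions: sequence numbers are unbounded integers rather than k-bit values, the timer is modelled as a boolean flag, and the `sent` log stands in for udt_send(). The guard against stale ACKs is an addition that the FSM itself does not show.

```python
class GBNSender:
    """Go-Back-N sender state, following the extended FSM (illustrative only)."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1              # oldest unacknowledged seq #
        self.nextseqnum = 1        # next seq # to be used
        self.sndpkt = {}           # seq # -> data, kept until acknowledged
        self.timer_running = False
        self.sent = []             # log of seq #s handed to the channel

    def rdt_send(self, data):
        """Return True if the data was sent, False if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False           # refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True          # start_timer
        self.nextseqnum += 1
        return True

    def timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True              # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)

    def ack_received(self, acknum):
        """Handle an uncorrupted cumulative ACK(acknum)."""
        if acknum < self.base:
            return                 # added guard: duplicate or stale ACK
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        # stop_timer if nothing is outstanding, otherwise (re)start it
        self.timer_running = self.base != self.nextseqnum
```

With N = 3, a fourth rdt_send() is refused until an ACK slides the window, and a timeout after ACK(1) resends packets 2, 3 and 4 in order.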

GBN: receiver extended FSM

initially:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

state "Wait":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
        extract(rcvpkt, data)
        deliver_data(data)
        sndpkt = make_pkt(expectedseqnum, ACK, chksum)
        udt_send(sndpkt)
        expectedseqnum++
    default
        udt_send(sndpkt)

• if the expected packet is received: send ACK and deliver the packet upstairs
• if an out-of-order packet is received:
    • discard (don't buffer) -> no receiver buffering!
    • re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

• the receiver always sends an ACK for the correctly received packet with the highest in-order seq #
• receiver only sends ACKs (no NAKs)
• can generate duplicate ACKs
• need only remember expectedseqnum

GBN in action

• GBN is easy to code but might have performance problems
• in particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data
• the Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets

Selective Repeat

• receiver individually acknowledges all correctly received pkts
    • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which an ACK was not received
    • sender timer for each unACKed pkt
    • compare to GBN, which only had a timer for the base packet
• sender window
    • N consecutive seq #'s
    • again limits seq #s of sent, unACKed pkts
    • important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: sender and receiver views of the sequence-number space, not recoverable from the transcript]

Selective repeat

sender:
• data from above:
    • if next available seq # is in window, send pkt
• timeout(n):
    • resend pkt n, restart timer
• ACK(n) in [sendbase, sendbase+N]:
    • mark pkt n as received
    • if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver:
• pkt n in [rcvbase, rcvbase+N-1]:
    • send ACK(n)
    • out-of-order: buffer
    • in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
• pkt n in [rcvbase-N, rcvbase-1]:
    • ACK(n) (note: this is a re-ACK)
• otherwise:
    • ignore
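The receiver rules can be sketched as a small class. This is an illustrative model only: sequence numbers are unbounded integers (no wraparound), the window starts at 0, and the class and method names are invented for the example.

```python
class SRReceiver:
    """Selective-repeat receiver: buffer out-of-order packets, deliver in order."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0           # smallest seq # not yet received
        self.buffer = {}           # seq # -> data, waiting for in-order delivery
        self.delivered = []        # data passed to the upper layer, in order

    def receive(self, n, data):
        """Process packet n; return the ACK number to send, or None to ignore."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver this packet and any buffered packets that are now in order,
            # advancing the window to the next not-yet-received packet.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n               # already delivered: re-ACK so the sender can move on
        return None                # outside both ranges: ignore
```

Receiving packets 0, 2, 3 and then 1 with N = 4 ACKs each one individually, delivers all four in order once packet 1 fills the gap, and a later duplicate of packet 1 is re-ACKed without being delivered again.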

Selective repeat in action

[Figure: sender/receiver timing diagram, not recoverable from the transcript]

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3
• receiver sees no difference in the two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
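One way to explore the question is to check when a retransmitted old packet can fall inside the receiver's new window. The sketch below assumes the worst case: the receiver has accepted a full window of packets and advanced, but every ACK was lost, so the sender retransmits the old window. The function name is illustrative.

```python
def receiver_can_confuse_old_with_new(seq_space, window):
    """True if an old packet's seq # also lies in the receiver's next window."""
    # Seq #s the sender may still retransmit (all ACKs lost).
    old_window = {n % seq_space for n in range(0, window)}
    # Seq #s the receiver now accepts as new data.
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)
```

For the slide's example (4 sequence numbers, window size 3) the two windows overlap in 0 and 1, so duplicates can be taken for new data. With window size 2 they do not overlap, which is consistent with the usual rule that the window may be at most half the sequence-number space.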


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point:
    • one sender, one receiver
• reliable, in-order byte stream:
    • no "message boundaries"
• pipelined:
    • TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
    • bi-directional data flow in same connection
    • MSS: maximum segment size
• connection-oriented:
    • handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled:
    • sender will not overwhelm receiver

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its socket door]

More TCP Details

Maximum Segment Size (MSS)
• depends upon implementation (can often be set)
• the max amount of application-layer data in a segment
    • Application Data + TCP Header = TCP Segment

Three-way handshake
• client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment
• server responds with a second special TCP segment (again no payload)
• client responds with a third special segment
    • this can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server,
(i) buffers,
(ii) variables, and
(iii) a socket connection to the process.

TCP only exists in the two end machines
• no buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

Fields (each row is 32 bits wide):

    source port #                          | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F      | Receive window
    checksum                               | Urg data pnter
    Options (variable length)
    application data (variable length)

Notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data

ACKs:
• seq # of next byte expected from other side
• cumulative ACK

Q: how does the receiver handle out-of-order segments?
• A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):

    1. User at Host A types 'C'.
       A -> B: Seq=42, ACK=79, data = 'C'
    2. Host B ACKs receipt of 'C', echoes back 'C'.
       B -> A: Seq=79, ACK=43, data = 'C'
    3. Host A ACKs receipt of echoed 'C'.
       A -> B: Seq=43, ACK=80
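The numbers in the telnet scenario follow from two counters per host: the sequence number of the next byte to send and the sequence number of the next byte expected. The sketch below reproduces them; the `Endpoint` class and its field names are hypothetical, and only in-order arrivals are modelled.

```python
class Endpoint:
    """Tracks the two byte counters one side of a TCP connection needs."""

    def __init__(self, next_seq, next_expected):
        self.next_seq = next_seq             # seq # of the next byte we will send
        self.next_expected = next_expected   # seq # of the next byte we expect

    def send(self, data=b""):
        """Build a segment; Seq is the first data byte, ACK is the next byte expected."""
        segment = {"seq": self.next_seq, "ack": self.next_expected, "data": data}
        self.next_seq += len(data)
        return segment

    def receive(self, segment):
        """Accept an in-order segment and advance the expected byte number."""
        if segment["seq"] == self.next_expected:
            self.next_expected += len(segment["data"])
```

Starting Host A at (42, 79) and Host B at (79, 42), the three segments carry Seq=42/ACK=79, Seq=79/ACK=43 and Seq=43/ACK=80, as in the scenario.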

TCP Round Trip Time and Timeout

Q: how to set the TCP timeout value?
• longer than RTT
    • but RTT varies
• too short: premature timeout
    • unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
    • ignore retransmissions
• SampleRTT will vary; want estimated RTT "smoother"
    • average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

    EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

• exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT and Estimated RTT]

TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
    • large variation in EstimatedRTT -> larger safety margin
• first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
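The three formulas combine into a small estimator. The sketch below makes two assumptions that the slides do not state: DevRTT starts at 0 after the first sample, and on each new sample DevRTT is updated before EstimatedRTT (so the deviation is measured against the previous estimate, the order used in RFC 6298). The class name is illustrative.

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout interval (illustrative sketch)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = 0.0         # assumption: no deviation known yet

    def update(self, sample_rtt):
        """Fold in a new SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation is computed against the previous EstimatedRTT.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from a first sample of 100 ms, a second sample of 180 ms gives DevRTT = 20 ms, EstimatedRTT = 110 ms and TimeoutInterval = 190 ms, so a single outlier moves the estimate only slightly while widening the safety margin.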


TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• pipelined segments
• cumulative acks
• TCP uses a single retransmission timer
• retransmissions are triggered by:
    • timeout events
    • duplicate acks
• initially consider simplified TCP sender:
    • ignore duplicate acks
    • ignore flow control, congestion control

TCP sender events:

data rcvd from app:
• create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

Ack rcvd:
• if it acknowledges previously unacked segments:
    • update what is known to be acked
    • start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch(event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with
                smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
    } /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked

TCP: retransmission scenarios

Lost ACK scenario:
    1. Host A sends Seq=92, 8 bytes data.
    2. Host B replies ACK=100; the ACK is lost (X).
    3. The timer for Seq=92 times out; Host A resends Seq=92, 8 bytes data.
    4. Host B replies ACK=100 again; SendBase = 100.

Premature timeout scenario:
    1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data.
    2. Host B replies ACK=100 and ACK=120.
    3. The timer for Seq=92 expires before the ACKs arrive; Host A resends Seq=92, 8 bytes data.
    4. The ACKs arrive: SendBase = 100, then SendBase = 120.
    5. Host B answers the duplicate segment with ACK=120; SendBase stays at 120.

TCP retransmission scenarios (more)

Cumulative ACK scenario:
    1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data.
    2. Host B replies ACK=100, which is lost (X), and then ACK=120.
    3. ACK=120 arrives before the timeout; SendBase = 120 and nothing is retransmitted.

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

• Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
  -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK
• Arrival of in-order segment with expected seq #. One other segment has ACK pending
  -> Immediately send single cumulative ACK, ACKing both in-order segments
• Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
  -> Immediately send duplicate ACK, indicating seq # of next expected byte
• Arrival of segment that partially or completely fills gap
  -> Immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
• used by most TCP implementations
• if a timeout occurs, then after the retransmission the Timeout Interval is doubled
• intervals grow exponentially with each consecutive timeout
• when the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• limited form of congestion control
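The doubling rule can be illustrated with a short helper that lists the interval in force after each consecutive timeout. The function name is invented for the example, and `base_interval` stands for the value computed from EstimatedRTT and DevRTT.

```python
def backoff_intervals(base_interval, consecutive_timeouts):
    """Timeout intervals used for the original send and each retransmission."""
    intervals = []
    interval = base_interval
    for _ in range(consecutive_timeouts + 1):
        intervals.append(interval)
        interval *= 2              # doubled after every timeout
    return intervals
```

With a base interval of 1 second and three consecutive timeouts the sender waits 1, 2, 4 and then 8 seconds; once new data is sent or an ACK arrives, the interval goes back to the computed base value.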

Fast Retransmit

• time-out period is often relatively long:
    • long delay before resending lost packet
• detect lost segments via duplicate ACKs
    • sender often sends many segments back-to-back
    • if a segment is lost, there will likely be many duplicate ACKs
• if the sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    • fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
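A minimal Python sketch of the ACK handler above. The class and attribute names are hypothetical, the timer is reduced to a boolean, and ACK values are assumed to fall on segment boundaries; it is meant to show the control flow, not a real TCP sender.

```python
class FastRetransmitSender:
    """Toy sender that tracks SendBase and duplicate-ACK counts."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.unacked = {}        # seq number -> payload still in flight
        self.dup_acks = {}       # ACK value y -> duplicate ACKs seen for y
        self.timer_running = False
        self.retransmitted = []  # seq numbers resent by fast retransmit

    def send(self, seq, payload):
        self.unacked[seq] = payload
        self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            # New data is ACKed: advance SendBase, drop covered segments.
            self.send_base = y
            self.unacked = {s: p for s, p in self.unacked.items() if s >= y}
            # Restart the timer only if something is still unacknowledged.
            self.timer_running = bool(self.unacked)
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_acks[y] = self.dup_acks.get(y, 0) + 1
            if self.dup_acks[y] == 3 and y in self.unacked:
                self.retransmitted.append(y)  # fast retransmit


sender = FastRetransmitSender(send_base=92)
sender.send(92, b"a" * 8)
sender.send(100, b"b" * 20)
sender.send(120, b"c" * 20)
sender.on_ack(100)          # new ACK, SendBase moves to 100
for _ in range(3):
    sender.on_ack(100)      # three duplicates trigger the resend
print(sender.retransmitted)
```

The resend happens on exactly the third duplicate, so further duplicates for the same y do not trigger it again.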

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

This looks a lot like Selective Repeat

TCP is a hybrid


TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down transmission rate to accommodate receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

Header layout (each row is 32 bits wide):
• source port # | dest port #
• sequence number
• acknowledgement number
• head len | not used | U A P R S F | Receive window
• checksum | Urg data pointer
• Options (variable length)
• application data (variable length)

Field notes:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection establishment (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
• sequence and acknowledgement numbers: counting by bytes of data (not segments)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  - guarantees receive buffer doesn't overflow
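As a worked example of the formula, here is a small Python sketch. The function names and the byte counts are illustrative assumptions; only the RcvWindow expression and the sender-side limit come from the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if nbytes more can be sent while keeping unACKed data <= window."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= window


# 4096-byte buffer, 2000 bytes received but not yet read by the app.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                   # 2096 bytes of spare room
print(sender_may_send(5000, 4000, window, 1000))  # 1000 in flight + 1000 fits
print(sender_may_send(5000, 4000, window, 1100))  # 2100 > 2096, must wait
```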

Technical Issue

• Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
• initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
• server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals that the client is synchronized

[Figure: client/server timeline]
• client to server: Connection request (SYN=1, seq=client_isn)
• server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
• client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
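The three segments of the handshake can be written out as plain data to check the sequence and acknowledgement arithmetic. This is a sketch only: the dictionary layout and the function name are invented for illustration, and no real sockets are involved.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as simple dictionaries."""
    # Step 1: client picks client_isn, no application data.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server picks server_isn and ACKs the SYN with client_isn + 1.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client ACKs with server_isn + 1; SYN=0 marks the connection
    # as established.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

With client_isn = 1000 and server_isn = 5000, the final ACK carries seq = 1001 and ack = 5001, matching the pattern in the figure.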

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline: client close, FIN to server; server ACK; server close, FIN to client; client ACK; client timed wait; closed]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
• Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
• Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline: client closing, FIN to server; server ACK; server closing, FIN to client; client ACK; client timed wait, then closed; server closed]

TCP Connection Management (cont.)

[Figures: example TCP client lifecycle; example TCP server lifecycle]


A few special cases

Have not discussed what happens if both client and server decide to close down connection at same time

It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
• a top-10 problem!

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2

Results:
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

(λ_in: original data; λ'_in: original data plus retransmitted data; λ_out: delivered data)

• (a), (b) & (c): always λ_in = λ_out (goodput)
• (a) "magic" transmission: only send when there's space in the buffer
• (b) "perfect" retransmission, only when loss: λ'_in > λ_out
• (c) retransmission of delayed (not lost) packet makes λ'_in larger (than the perfect case) for the same λ_out

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for a given "goodput"
• (c): unneeded retransmissions, link carries multiple copies of a packet

[Figure: three throughput plots, panels (a), (b) and (c)]

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λ_in and λ'_in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded":
  - sender should use available bandwidth
• if sender's path congested:
  - sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin is dynamically modified to reflect perceived congestion
• w segments, each with MSS bytes, sent in one RTT:

    throughput = (w × MSS) / RTT bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

    LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events

TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  - Example: MSS = 500 bytes & RTT = 200 msec
  - initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  - desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
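The 20 kbps figure follows from sending one MSS per RTT. A short Python check, with a hypothetical helper name, using the slide's numbers:

```python
def window_rate_bps(window_segments, mss_bytes, rtt_seconds):
    """Rate in bits/sec when window_segments of MSS bytes go out per RTT."""
    return window_segments * mss_bytes * 8 / rtt_seconds


# CongWin = 1 MSS, MSS = 500 bytes, RTT = 200 msec
rate = window_rate_bps(window_segments=1, mss_bytes=500, rtt_seconds=0.2)
print(rate / 1000, "kbps")
```

One segment of 500 bytes is 4000 bits, and 4000 bits every 0.2 s is 20,000 bits/sec.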

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  - double CongWin every RTT
  - done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline: one segment in the first RTT, two segments in the second, four segments in the third]
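To see why one increment per ACK doubles the window each RTT, the following Python sketch counts ACKs round by round. It assumes one ACK per segment (no delayed ACKs) and no loss; the function name is illustrative.

```python
def slow_start_windows(rounds, mss=1):
    """CongWin (in units of mss) at the start of each RTT during slow start."""
    cong_win = mss
    history = [cong_win]
    for _ in range(rounds):
        acks = cong_win // mss      # one ACK per segment sent in this RTT
        for _ in range(acks):
            cong_win += mss         # increment CongWin for every ACK
        history.append(cong_win)
    return history


print(slow_start_windows(3))  # window sizes over the first RTTs
```

Each round sends CongWin/MSS segments and receives that many ACKs, so the window grows by its own size: 1, 2, 4, 8 segments.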

So far:
• Slow-Start ramps up exponentially
• Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start, with CongWin = 1, after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


The Big Picture

TCP sender congestion control

Each entry lists: Event, State, TCP Sender Action, Commentary.

1. Event: ACK receipt for previously unACKed data
   State: Slow Start (SS)
   Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
   Commentary: resulting in a doubling of CongWin every RTT

2. Event: ACK receipt for previously unACKed data
   State: Congestion Avoidance (CA)
   Action: CongWin = CongWin + MSS × (MSS/CongWin)
   Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

3. Event: loss event detected by triple duplicate ACK
   State: SS or CA
   Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
   Commentary: fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

4. Event: timeout
   State: SS or CA
   Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
   Commentary: enter slow start

5. Event: duplicate ACK
   State: SS or CA
   Action: increment duplicate ACK count for segment being ACKed
   Commentary: CongWin and Threshold not changed
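The table maps naturally onto a small state machine. The Python sketch below follows the five rows under simplifying assumptions: the class and method names are invented, windows are tracked in bytes as floats, and retransmission itself is left out.

```python
class RenoSender:
    """Toy model of the congestion-control rows in the table above."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss          # start with 1 MSS
        self.threshold = threshold
        self.state = "SS"            # "SS" = slow start, "CA" = cong. avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unACKed data (rows 1 and 2)."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK (row 5); the third one is a loss event (row 3)."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            # CongWin will not drop below 1 MSS.
            self.cong_win = max(self.threshold, self.mss)
            self.state = "CA"

    def on_timeout(self):
        """Timeout (row 4): back to slow start with 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)   # window has crossed the threshold
```

Starting from 1000 bytes with Threshold = 4000, four ACKs take CongWin to 5000 and switch the sender into congestion avoidance; a triple duplicate ACK would then halve the window, while a timeout would drop it to 1 MSS.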

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2·RTT)
• Average throughput: 0.75 W/RTT
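The 0.75 factor is the mean of a window that climbs linearly from W/2 to W. A quick numerical check in Python, assuming the window grows by one segment per RTT and W is even (the function name is illustrative):

```python
def average_sawtooth_window(w):
    """Mean window over one sawtooth cycle: W/2, W/2 + 1, ..., W."""
    windows = list(range(w // 2, w + 1))  # one window size per RTT
    return sum(windows) / len(windows)


w = 100
print(average_sawtooth_window(w) / w)  # fraction of W
```

Because the ramp is linear, the mean is the midpoint of W/2 and W, which is 3W/4; dividing by RTT gives the average throughput of 0.75 W/RTT.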

TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This gives L = 2·10^-10. Wow!
• New versions of TCP for high-speed needed!
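Both numbers on this slide can be reproduced from the example values. The Python sketch below uses hypothetical function names; the second function simply inverts the throughput formula 1.22·MSS/(RTT·√L) to solve for L.

```python
def required_window_segments(target_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain target_bps."""
    return target_bps * rtt_s / (mss_bytes * 8)


def tolerable_loss_rate(target_bps, rtt_s, mss_bytes):
    """Loss rate L such that 1.22 * MSS / (RTT * sqrt(L)) equals target_bps."""
    return (1.22 * mss_bytes * 8 / (rtt_s * target_bps)) ** 2


w = required_window_segments(target_bps=10e9, rtt_s=0.1, mss_bytes=1500)
loss = tolerable_loss_rate(target_bps=10e9, rtt_s=0.1, mss_bytes=1500)
print(round(w))   # in-flight segments
print(loss)       # roughly 2e-10
```

A loss rate of about 2·10^-10 means roughly one lost segment in five billion, which is why the slide calls for new high-speed versions of TCP.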

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
• Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
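The two cases in the example are a one-line calculation, assuming every TCP connection on the link ends up with an equal share. A Python sketch with an illustrative function name:

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # one connection among ten
print(app_share(9, 11))   # eleven connections among twenty
```

With 9 connections already on the link, a single new connection gets 1/10 of R, while 11 parallel connections get 11/20 of R, slightly more than half.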

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Sender waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
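Both cases fit in one small Python function. This is a sketch under stated assumptions: the function name is invented, K is taken to be the number of windows that cover the object, computed here as O/(W·S) rounded up, and the boundary case WS/R = RTT + S/R is treated as the no-stall case.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments."""
    base = 2 * RTT + O / R
    window_time = W * S / R
    if window_time >= RTT + S / R:
        # Case 1: ACKs return before the window is exhausted, no stalls.
        return base
    # Case 2: the server stalls after each of the first K-1 windows.
    K = math.ceil(O / (W * S))  # assumption: windows that cover the object
    stall = S / R + RTT - window_time
    return base + (K - 1) * stall


# S/R = 1 s, RTT = 2 s, object of 8 segments
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))  # no stalls
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))  # 3 stalls
```

In the second call the window drains in 2 s but the first ACK takes 3 s to return, so the server idles 1 s after each of the first three windows.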

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2
Server idles P = 2 times

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}
$$

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.
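The quantities K, Q and P and the resulting latency can be computed directly. In the Python sketch below, the function name is illustrative, K is found by the defining inequality rather than the logarithm (to avoid rounding issues), and Q is counted as the number of windows after which the idle time S/R + RTT - 2^(k-1)·S/R is strictly positive, which is an assumption about how the "similar" calculation is carried out.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start latency model."""
    # K: smallest k with 2^k - 1 >= O/S (windows that cover the object).
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: how many times the server would idle for an infinite object.
    # The idle time after the kth window is S/R + RTT - 2^(k-1) * S/R.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# O/S = 15 segments, S/R = 1 s, RTT = 2 s
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2))
```

With O/S = 15 this gives K = 4, Q = 2 and P = 2, the values of the slide's example; the latency of 22 s is 4 s of handshake and request, 15 s of transmission, and idle times of 2 s and 1 s after the first two windows.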

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
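The three response-time formulas can be compared side by side. The Python sketch below is a simplification: the function name is invented, the slow-start idle times are passed in as a single placeholder value (zero by default) even though they differ between the three schemes, and 5 Kbytes is taken as 40,000 bits.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times in seconds for the three HTTP variants."""
    transmit = (M + 1) * O / R  # base page plus M images, O bits each
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }


# RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, on a 1 Mbps link
times = http_response_times(O=40000, R=1e6, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(name, round(seconds, 2))
```

Ignoring idle times, the transmission term is the same for all three variants, so the difference comes entirely from the number of round trips: 22, 3 and 6 RTTs respectively in this example.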

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time. Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay×bandwidth networks.

Chapter 3: Summary

• principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
• instantiation and implementation in the Internet
  - UDP
  - TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"


    3 Transport Layer 2Comp 361 Spring 2005

    Chapter 3 outline

    31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

    35 Connection-oriented transport TCP

    segment structurereliable data transferflow controlconnection management

    36 Principles of congestion control37 TCP congestion control

    3 Transport Layer 3Comp 361 Spring 2005

    Transport services and protocolsprovide logical communicationbetween app processes running on different hoststransport protocols run in end systems

    send side breaks app messages into segments passes to network layerrcv side reassembles segments into messages passes to app layer

    more than one transport protocol available to apps

    Internet TCP and UDP

    applicationtransportnetworkdata linkphysical

    applicationtransportnetworkdata linkphysical

    networkdata linkphysical

    networkdata linkphysical

    networkdata linkphysical

    networkdata linkphysicalnetwork

    data linkphysical

    logical end-end transport

    3 Transport Layer 4Comp 361 Spring 2005

    Transport vs network layerHousehold analogy12 kids sending letters

    to 12 kidsprocesses = kidsapp messages = letters in envelopeshosts = housestransport protocol = Ann and Billnetwork-layer protocol = postal service

    network layer logical communication between hoststransport layer logical communication between processes

    relies on enhances network layer services

    3 Transport Layer 5Comp 361 Spring 2005

    Transport-layer protocols

    Internet transport servicesreliable in-order unicastdelivery (TCP)

    congestion flow controlconnection setup

    unreliable (ldquobest-effortrdquo) unordered unicast or multicast delivery UDPservices not available

    real-timebandwidth guaranteesreliable multicast

    applicationtransportnetworkdata linkphysical

    applicationtransportnetworkdata linkphysical

    networkdata linkphysical

    networkdata linkphysical

    networkdata linkphysical

    networkdata linkphysicalnetwork

    data linkphysical

    logical end-end transport

    3 Transport Layer 6Comp 361 Spring 2005

    Chapter 3 outline

    31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

    35 Connection-oriented transport TCP

    segment structurereliable data transferflow controlconnection management

    36 Principles of congestion control37 TCP congestion control

    3 Transport Layer 7Comp 361 Spring 2005

    Multiplexingdemultiplexinggathering data from multiplesockets enveloping data with header (later used for demultiplexing)

    Multiplexing at send host

    delivering received segmentsto correct socket

    Demultiplexing at rcv host

    = socket = process

    application

    transport

    network

    link

    physical

    P1 application

    transport

    network

    link

    physical

    application

    transport

    network

    link

    physical

    P2P3 P4P1

    host 1 host 2 host 3

    3 Transport Layer 8Comp 361 Spring 2005

    Multiplexingdemultiplexingsegment - unit of data

    exchanged between transport layer entities

    aka TPDU transport protocol data unit

    Demultiplexing delivering received segments to correct app layer processes

    receiver

    applicationtransportnetwork

    M P2applicationtransportnetwork

    HtHn segment

    segment Mapplicationtransportnetwork

    P1M

    M MP3 P4

    segmentheader

    application-layerdata

    3 Transport Layer 9Comp 361 Spring 2005

    How demultiplexing workshost receives IP datagrams

    each datagram has source IP address destination IP addresseach datagram carries 1 transport-layer segmenteach segment has source destination port number (recall well-known port numbers for specific applications)

    host uses IP addresses amp port numbers to direct segment to appropriate socket

    source port dest port

    32 bits

    applicationdata

    (message)

    other header fields

    TCPUDP segment format

    3 Transport Layer 10Comp 361 Spring 2005

    Connectionless demultiplexingWhen host receives UDP segment

    checks destination port number in segmentdirects UDP segment to socket with that port number

    IP datagrams with different source IP addresses andor source port numbers directed to same socket

    Create sockets with port numbers

    DatagramSocket mySocket1 = new DatagramSocket(99111)

    DatagramSocket mySocket2 = new DatagramSocket(99222)

    UDP socket identified by two-tuple

    (dest IP address dest port number)

    3 Transport Layer 11Comp 361 Spring 2005

    Connectionless demux (cont)DatagramSocket serverSocket = new DatagramSocket(6428)

    ClientIPB

    P3

    clientIP A

    P1P1P3

    serverIP C

    SP 6428DP 9157

    SP 9157DP 6428

    SP 6428DP 5775

    SP 5775DP 6428

    SP provides ldquoreturn addressrdquo

    3 Transport Layer 12Comp 361 Spring 2005

    Connection-oriented demux

    TCP socket identified by 4-tuple

    source IP addresssource port numberdest IP addressdest port number

    recv host uses all four values to direct segment to appropriate socket

    Server host may support many simultaneous TCP sockets

    each socket identified by its own 4-tuple

    Web servers have different sockets for each connecting client

    non-persistent HTTP will have different socket for each request

    3 Transport Layer 13Comp 361 Spring 2005

    Connection-oriented demux(cont)

    ClientIPB

    P3

    clientIP A

    P1P1P3

    serverIP C

    SP 80DP 9157

    SP 9157DP 80

    SP 80DP 5775

    SP 5775DP 80

    P4

    3 Transport Layer 14Comp 361 Spring 2005

    Connection-oriented demux Threaded Web Server

    ClientIPB

    P1

    clientIP A

    P1P2

    serverIP C

    SP 9157DP 80

    SP 9157DP 80

    P4 P3

    D-IPCS-IP AD-IPC

    S-IP B

    SP 5775DP 80

    D-IPCS-IP B

    3 Transport Layer 15Comp 361 Spring 2005

    Chapter 3 outline

    31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

    35 Connection-oriented transport TCP

    segment structurereliable data transferflow controlconnection management

    36 Principles of congestion control37 TCP congestion control

    3 Transport Layer 16Comp 361 Spring 2005

    UDP User Datagram Protocol [RFC 768]

    ldquono frillsrdquo ldquobare bonesrdquoInternet transport protocolldquobest effortrdquo service UDP segments may be

    lostdelivered out of order to app

    connectionlessno handshaking between UDP sender receivereach UDP segment handled independently of others

    Why is there a UDPno connection establishment (which can add delay)simple no connection state at sender receiversmall segment header (8 Bytes)no congestion control UDP can blast away as fast as desired

    UDP: more

    often used for streaming multimedia apps
      loss tolerant
      rate sensitive

    other UDP uses (why?):
      DNS: small delay
      SNMP: stressful conditions

    reliable transfer over UDP: add reliability at the application layer
      application-specific error recovery

    UDP segment format (each row is 32 bits wide):

      source port # | dest port #
      length        | checksum
      application data (message)

    length: length, in bytes, of the UDP segment, including header

    UDP checksum

    Goal: detect "errors" (e.g., flipped bits) in the transmitted segment

    Sender:
      treat segment contents as a sequence of 16-bit integers
      checksum: 1's complement of the 1's complement sum of the segment contents
      sender puts the checksum value into the UDP checksum field

    Receiver:
      compute checksum of the received segment
      check if the computed checksum equals the checksum field value:
        NO: error detected
        YES: no error detected. But maybe errors nonetheless? More later.

    Receiver may choose to discard the segment or send a warning to the app in case of error
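    The sender-side computation can be sketched in a few lines. This is a minimal illustration that checksums an arbitrary byte string; a real UDP implementation also covers a pseudo-header, which is left out here. The function names are chosen for this example only.

```python
def ones_complement_sum16(data: bytes) -> int:
    """1's complement sum of the data, taken as big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """1's complement of the 1's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def looks_intact(data: bytes, checksum: int) -> bool:
    """Receiver check: recompute and compare with the checksum field."""
    return internet_checksum(data) == checksum


payload = b"\x00\x01\xf2\x03\xf4\xf5\xf6\xf7"
checksum = internet_checksum(payload)
```

    A single flipped bit changes the sum and is caught; two errors that cancel in the sum are not, which is the "maybe errors nonetheless" case on the slide.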


    Principles of Reliable Data Transfer

    important in application, transport, and link layers
    top-10 list of important networking topics!

    characteristics of the unreliable channel will determine the complexity of the reliable data transfer protocol (rdt)

    Reliable data transfer: getting started

    send side:
      rdt_send(): called from above (e.g., by app). Passed data to deliver to the receiver's upper layer
      udt_send(): called by rdt to transfer a packet over the unreliable channel to the receiver

    receive side:
      rdt_rcv(): called when a packet arrives on the rcv-side of the channel
      deliver_data(): called by rdt to deliver data to the upper layer

    Reliable data transfer: getting started

    We'll:
      incrementally develop the sender and receiver sides of a reliable data transfer protocol (rdt)
      consider only unidirectional data transfer
        but control info will flow in both directions!
      use finite state machines (FSMs) to specify sender and receiver

    FSM notation: an arrow from state 1 to state 2 is labelled with the event causing the state transition and the actions taken on the state transition (event / actions).

    state: when in this "state," the next state is uniquely determined by the next event

    Incremental Improvements

    rdt1.0: assumes every packet sent arrives and no errors are introduced in transmission

    rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces the concept of ACK and NAK

    rdt2.1: deals with corrupted ACKs/NAKs

    rdt2.2: like rdt2.1, but does not need NAKs

    rdt3.0: allows packets to be lost

    rdt1.0: reliable transfer over a reliable channel

    underlying channel perfectly reliable:
      no bit errors
      no loss of packets

    separate FSMs for sender and receiver:
      sender sends data into the underlying channel
      receiver reads data from the underlying channel

    sender (single state, "Wait for call from above"):
      rdt_send(data)
        -> packet = make_pkt(data); udt_send(packet)

    receiver (single state, "Wait for call from below"):
      rdt_rcv(packet)
        -> extract(packet, data); deliver_data(data)

    rdt2.0: channel with bit errors

    underlying channel may flip bits in a packet
      recall: UDP checksum to detect bit errors

    the question: how to recover from errors?
      acknowledgements (ACKs): receiver explicitly tells sender that pkt was received OK
      negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
      sender retransmits pkt on receipt of NAK
      human scenarios using ACKs, NAKs?

    new mechanisms in rdt2.0 (beyond rdt1.0):
      error detection
      receiver feedback: control msgs (ACK, NAK) rcvr -> sender

    rdt2.0: FSM specification

    sender

    state "Wait for call from above":
      rdt_send(data)
        -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK"

    state "Wait for ACK or NAK":
      rdt_rcv(rcvpkt) && isNAK(rcvpkt)
        -> udt_send(sndpkt); stay in "Wait for ACK or NAK"
      rdt_rcv(rcvpkt) && isACK(rcvpkt)
        -> Λ (no action); go to "Wait for call from above"

    receiver

    state "Wait for call from below":
      rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        -> udt_send(NAK)
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
        -> extract(rcvpkt, data); deliver_data(data); udt_send(ACK)

    rdt2.0: operation with no errors

    [Figure: the rdt2.0 FSM again. With no errors, the sender handles rdt_send(data) by making and sending sndpkt, the receiver sees rdt_rcv(rcvpkt) && notcorrupt(rcvpkt), delivers the data and sends ACK, and the sender returns to "Wait for call from above" on rdt_rcv(rcvpkt) && isACK(rcvpkt).]

    rdt2.0: error scenario

    [Figure: the rdt2.0 FSM again. When the packet is corrupted, the receiver sees rdt_rcv(rcvpkt) && corrupt(rcvpkt) and sends NAK; the sender sees rdt_rcv(rcvpkt) && isNAK(rcvpkt), retransmits with udt_send(sndpkt), and keeps waiting for ACK or NAK.]

    rdt2.0 has a fatal flaw!

    What happens if the ACK/NAK is corrupted?
      sender doesn't know what happened at the receiver!
      can't just retransmit: possible duplicate. But the receiver is waiting!

    What to do?
      sender ACKs/NAKs the receiver's ACK/NAK? What if the sender's ACK/NAK is corrupted?
      retransmit, but this might cause retransmission of a correctly received pkt. The receiver won't know about the duplication!

    Handling duplicates:
      sender adds a sequence number (0/1) to each pkt
      sender retransmits the current pkt if the ACK/NAK is garbled
      receiver discards (doesn't deliver up) a duplicate pkt
      a duplicate packet is one with the same sequence number as the previous packet

    stop and wait: sender sends one packet, then waits for the receiver's response

    Sender: whenever the sender receives a control message, it sends a packet to the receiver
      a valid ACK: sends the next packet (if one exists) with a new sequence number
      a NAK or corrupt response: resends the old packet

    Receiver: sends ACK/NAK to the sender
      if the received packet is corrupt: send NAK
      if the received packet is valid and has a different sequence number from the previous packet: send ACK and deliver the new data up
      if the received packet is valid and has the same sequence number as the previous packet, i.e., is a retransmission or duplicate: send ACK

    Note: ACKs/NAKs do not contain a sequence number
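    The receiver rules above can be sketched as a small function. It is a simulation only: corruption is passed in as a flag instead of being detected by a checksum, and the function name `rdt21_receive` is made up for this example.

```python
from typing import List, Tuple


def rdt21_receive(arrivals: List[Tuple[int, str, bool]]) -> Tuple[List[str], List[str]]:
    """arrivals: (seq, data, corrupted) triples in arrival order.

    Returns (data delivered to the upper layer, replies sent to the sender).
    """
    delivered: List[str] = []
    replies: List[str] = []
    expected = 0  # sequence number the receiver is waiting for (0 or 1)
    for seq, data, corrupted in arrivals:
        if corrupted:
            replies.append("NAK")
        elif seq == expected:
            delivered.append(data)      # new data: deliver up
            replies.append("ACK")
            expected = 1 - expected
        else:
            replies.append("ACK")       # duplicate: ACK again, do not deliver
    return delivered, replies


delivered, replies = rdt21_receive([
    (0, "a", False),
    (0, "a", False),  # retransmission, e.g. because the first ACK was garbled
    (1, "b", True),   # corrupted in transit
    (1, "b", False),
])
```

    The duplicate of packet 0 is acknowledged but not delivered a second time, which is exactly what the one-bit sequence number buys.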

    rdt2.1: sender, handles garbled ACK/NAKs

    state "Wait for call 0 from above":
      rdt_send(data)
        -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 0"

    state "Wait for ACK or NAK 0":
      rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
        -> udt_send(sndpkt)
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
        -> Λ; go to "Wait for call 1 from above"

    state "Wait for call 1 from above":
      rdt_send(data)
        -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 1"

    state "Wait for ACK or NAK 1":
      rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
        -> udt_send(sndpkt)
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
        -> Λ; go to "Wait for call 0 from above"

    rdt2.1: receiver, handles garbled ACK/NAKs

    state "Wait for 0 from below":
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
        -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 1 from below"
      rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
        -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

    state "Wait for 1 from below":
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
        -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 0 from below"
      rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
        -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

    rdt2.1: discussion

    Sender:
      seq # added to pkt
      two seq #'s (0, 1) will suffice. Why?
      must check if the received ACK/NAK is corrupted
      twice as many states
        state must "remember" whether the "current" pkt has seq # 0 or 1

    Receiver:
      must check if the received packet is a duplicate
        state indicates whether 0 or 1 is the expected pkt seq #
      note: the receiver cannot know if its last ACK/NAK was received OK at the sender

    rdt2.2: a NAK-free protocol

    same functionality as rdt2.1, using ACKs only
    instead of a NAK, the receiver sends an ACK for the last pkt received OK
      receiver must explicitly include the seq # of the pkt being ACKed (in rdt2.1, seq #s are included in data packets but not in ACKs/NAKs)
    a duplicate ACK at the sender results in the same action as a NAK: retransmit the current pkt

    rdt2.2: sender, receiver fragments

    sender FSM fragment

    state "Wait for call 0 from above":
      rdt_send(data)
        -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK 0"

    state "Wait for ACK 0":
      rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
        -> udt_send(sndpkt)
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
        -> Λ

    receiver FSM fragment

    transition into "Wait for 0 from below":
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
        -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

    state "Wait for 0 from below":
      rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
        -> udt_send(sndpkt)

    rdt3.0: channels with errors and loss

    New assumption: the underlying channel can also lose packets (data or ACKs)
      checksum, seq #, ACKs, retransmissions will be of help, but not enough

    Q: how to deal with loss?
      sender waits until it is certain that data or ACK was lost, then retransmits
      yuck: drawbacks?

    Approach: sender waits a "reasonable" amount of time for the ACK
      retransmits if no ACK is received in this time (retransmissions are triggered only by timeouts)
      if pkt (or ACK) is just delayed (not lost):
        the retransmission will be a duplicate, but use of seq #'s already handles this
        receiver must specify the seq # of the pkt being ACKed
      requires a countdown timer
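    A rough simulation of this timeout-and-retransmit approach is shown below. It is a sketch under strong assumptions: there is no real clock, a lost data packet or lost ACK is simply treated as "the timer expired", corruption is not modelled, and the names `stop_and_wait`, `data_lost` and `ack_lost` are invented for the example.

```python
from typing import Callable, List, Tuple


def stop_and_wait(messages: List[str],
                  data_lost: Callable[[int], bool],
                  ack_lost: Callable[[int], bool]) -> Tuple[List[str], int]:
    """Simulate an rdt3.0-style transfer over a lossy channel.

    data_lost(i) / ack_lost(i) say whether the i-th transmission (0-based,
    counted over the whole run) loses its data packet or its ACK.
    Returns (data delivered at the receiver, total number of transmissions).
    """
    delivered: List[str] = []
    expected = 0   # receiver: next expected seq # (0 or 1)
    seq = 0        # sender: seq # of the packet currently being sent
    sends = 0
    for msg in messages:
        while True:
            attempt = sends
            sends += 1
            if data_lost(attempt):
                continue          # no ACK comes back: timeout, resend
            if seq == expected:   # packet reached the receiver
                delivered.append(msg)
                expected = 1 - expected
            # otherwise it is a duplicate: ACKed again but not delivered
            if ack_lost(attempt):
                continue          # ACK lost: timeout, resend
            break                 # ACK received: stop timer, next message
        seq = 1 - seq
    return delivered, sends


delivered, sends = stop_and_wait(
    ["m0", "m1", "m2"],
    data_lost=lambda i: i == 1,  # the second transmission loses its data
    ack_lost=lambda i: i == 3,   # the fourth transmission loses its ACK
)
```

    Each loss costs one extra transmission, and the lost-ACK case produces a duplicate at the receiver that the sequence number filters out.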

    rdt3.0 sender

    state "Wait for call 0 from above":
      rdt_send(data)
        -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK0"
      rdt_rcv(rcvpkt)
        -> Λ

    state "Wait for ACK0":
      rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
        -> Λ
      timeout
        -> udt_send(sndpkt); start_timer
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
        -> stop_timer; go to "Wait for call 1 from above"

    state "Wait for call 1 from above":
      rdt_send(data)
        -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK1"
      rdt_rcv(rcvpkt)
        -> Λ

    state "Wait for ACK1":
      rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
        -> Λ
      timeout
        -> udt_send(sndpkt); start_timer
      rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
        -> stop_timer; go to "Wait for call 0 from above"

    rdt3.0 in action

    [Two slides of figures not recovered.]

    Performance of rdt3.0

    rdt3.0 works, but performance stinks
    example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet:

    $$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

    $$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

    U_sender: utilization, the fraction of time the sender is busy sending
    1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
    network protocol limits use of physical resources!

    rdt3.0: stop-and-wait operation

    [Figure: sender/receiver timing diagram. First packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives and the receiver sends ACK; ACK arrives and the sender sends the next packet at t = RTT + L/R.]
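    The arithmetic on this slide can be checked with a short calculation. The function names are made up for the example; the numbers are the slide's (8000-bit packet, 1 Gbps link, 30 ms RTT).

```python
def sender_utilization(packet_bits: float, rate_bps: float, rtt_s: float) -> float:
    """Fraction of time a stop-and-wait sender is busy sending."""
    t_transmit = packet_bits / rate_bps
    return t_transmit / (rtt_s + t_transmit)


def throughput_bytes_per_s(packet_bits: float, rate_bps: float, rtt_s: float) -> float:
    """One packet is delivered every RTT + L/R seconds."""
    return (packet_bits / 8) / (rtt_s + packet_bits / rate_bps)


u = sender_utilization(packet_bits=8000, rate_bps=1e9, rtt_s=0.030)
thruput = throughput_bytes_per_s(packet_bits=8000, rate_bps=1e9, rtt_s=0.030)
```

    Here `u` comes out at roughly 0.00027 and `thruput` at roughly 33 kB/sec, matching the figures quoted above.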

    Pipelined protocols

    Pipelining: sender allows multiple "in-flight," yet-to-be-acknowledged pkts
      range of sequence numbers must be increased
      buffering at sender and/or receiver

    Pipelined protocols

    Advantage: much better bandwidth utilization than stop-and-wait

    Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data
      Two generic approaches to solving this:
        Go-Back-N protocols
        selective repeat protocols
      Note: TCP is not exactly either

    Pipelining: increased utilization

    [Figure: sender/receiver timing diagram with three packets in flight. First packet bit transmitted at t = 0; last bit transmitted at t = L/R; the receiver sends an ACK as the last bit of the 1st, 2nd and 3rd packets arrives; the first ACK arrives and the sender sends the next packet at t = RTT + L/R.]

    $$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

    Increases utilization by a factor of 3!

    Go-Back-N

    Sender:
      k-bit seq # in pkt header
      "window" of up to N consecutive unACKed pkts allowed
      ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
        may receive duplicate ACKs (see receiver)
      only one timer, for the oldest unacknowledged pkt
      timeout(n): retransmit pkt n and all higher seq # pkts in the window
      called a sliding-window protocol

    GBN: Sender

    rdt_send() called: checks to see if the window is full
      No: send out the packet
      Yes: return the data to the application level

    Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates the window accordingly and restarts the timer (or stops it if no unacknowledged packets remain)

    Timeout: resends ALL packets that have been sent but not yet acknowledged
      This is the only event that triggers a resend

    GBN: sender extended FSM

    initially:
      base = 1
      nextseqnum = 1

    rdt_send(data):
      if (nextseqnum < base+N)
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
          start_timer
        nextseqnum++
      else
        refuse_data(data)

    timeout:
      start_timer
      udt_send(sndpkt[base])
      udt_send(sndpkt[base+1])
      ...
      udt_send(sndpkt[nextseqnum-1])

    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
      base = getacknum(rcvpkt)+1
      if (base == nextseqnum)
        stop_timer
      else
        start_timer

    rdt_rcv(rcvpkt) && corrupt(rcvpkt):
      Λ
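    The sender FSM translates fairly directly into code. The class below is a bookkeeping sketch only: there is no network or clock, `sent` is a log standing in for udt_send, and the `max()` guard in the ACK handler is an addition that is not in the FSM.

```python
from typing import Dict, List


class GbnSender:
    """Sender-side bookkeeping for Go-Back-N (no real network or clock)."""

    def __init__(self, window: int) -> None:
        self.N = window
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt: Dict[int, str] = {}
        self.timer_running = False
        self.sent: List[int] = []  # seq #s passed to udt_send, in order

    def rdt_send(self, data: str) -> bool:
        if self.nextseqnum >= self.base + self.N:
            return False  # window full: refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True  # start_timer
        self.nextseqnum += 1
        return True

    def timeout(self) -> None:
        self.timer_running = True  # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)  # resend every unACKed packet

    def rdt_rcv_ack(self, acknum: int) -> None:
        # max() is an added guard so a stale ACK cannot move base backwards.
        self.base = max(self.base, acknum + 1)
        self.timer_running = self.base != self.nextseqnum


sender = GbnSender(window=4)
accepted = [sender.rdt_send(d) for d in "abcde"]  # fifth call hits a full window
sender.rdt_rcv_ack(2)   # cumulative ACK for packets 1 and 2
sender.timeout()        # packets 3 and 4 are resent
```

    After the timeout the log reads 1, 2, 3, 4, 3, 4: everything still outstanding is sent again, whether or not it actually arrived.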

    GBN: receiver extended FSM

    initially:
      expectedseqnum = 1
      sndpkt = make_pkt(0, ACK, chksum)

    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(expectedseqnum, ACK, chksum)
      udt_send(sndpkt)
      expectedseqnum++

    default:
      udt_send(sndpkt)

    If the expected packet is received:
      send ACK and deliver the packet upstairs

    If an out-of-order packet is received:
      discard (don't buffer) -> no receiver buffering!
      re-ACK the pkt with the highest in-order seq # (may generate duplicate ACKs)

    More on receiver

    The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
      receiver only sends ACKs (no NAKs)
      can generate duplicate ACKs
      need only remember expectedseqnum
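    A sketch of this receiver behaviour, assuming all arriving packets are uncorrupted (the function name `gbn_receive` is invented for the example):

```python
from typing import List, Tuple


def gbn_receive(arrivals: List[Tuple[int, str]]) -> Tuple[List[str], List[int]]:
    """arrivals: uncorrupted (seq, data) pairs in arrival order.

    Returns (data delivered upward, ACK numbers sent back).
    """
    expectedseqnum = 1
    delivered: List[str] = []
    acks: List[int] = []
    for seq, data in arrivals:
        if seq == expectedseqnum:
            delivered.append(data)
            acks.append(expectedseqnum)
            expectedseqnum += 1
        else:
            # Out of order: discard and re-ACK the last in-order packet.
            acks.append(expectedseqnum - 1)
    return delivered, acks


delivered, acks = gbn_receive([(1, "a"), (3, "c"), (4, "d"), (2, "b"), (3, "c")])
```

    Packets 3 and 4 are thrown away the first time because packet 2 is missing, and each of them triggers a duplicate ACK for packet 1.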

    GBN in action

    [Figure not recovered.]

    GBN is easy to code but might have performance problems

    In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data

    The Selective Repeat protocol allows the receiver to buffer data and forces retransmission only of the required packets

    Selective Repeat

    receiver individually acknowledges all correctly received pkts
      buffers pkts, as needed, for eventual in-order delivery to the upper layer

    sender only resends pkts for which an ACK was not received
      sender timer for each unACKed pkt
      compare to GBN, which only had a timer for the base packet

    sender window
      N consecutive seq #'s
      again limits seq #s of sent, unACKed pkts
      Important: window size < seq # range

    Selective repeat: sender, receiver windows

    [Figure not recovered.]

    Selective repeat

    sender

    data from above:
      if the next available seq # is in the window, send pkt

    timeout(n):
      resend pkt n, restart timer

    ACK(n) in [sendbase, sendbase+N-1]:
      mark pkt n as received
      if n is the smallest unACKed pkt, advance the window base to the next unACKed seq #

    receiver

    pkt n in [rcvbase, rcvbase+N-1]:
      send ACK(n)
      out-of-order: buffer
      in-order: deliver (also deliver buffered, in-order pkts), advance window to the next not-yet-received pkt

    pkt n in [rcvbase-N, rcvbase-1]:
      ACK(n) (note: this is a re-ACK)

    otherwise:
      ignore
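    The receiver rules can be sketched as follows. To keep the example short it assumes sequence numbers that never wrap around, which sidesteps the window-size question raised on a later slide; the class name `SrReceiver` is made up.

```python
from typing import Dict, List


class SrReceiver:
    """Selective-repeat receiver with unbounded sequence numbers (assumption)."""

    def __init__(self, window: int) -> None:
        self.N = window
        self.rcvbase = 0
        self.buffer: Dict[int, str] = {}
        self.delivered: List[str] = []

    def receive(self, n: int, data: str) -> bool:
        """Handle pkt n. Returns True if ACK(n) is sent, False if pkt is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, if there is one.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return True
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return True   # already delivered: re-ACK so the sender can move on
        return False


rcvr = SrReceiver(window=4)
acked = [rcvr.receive(n, d) for n, d in
         [(0, "a"), (2, "c"), (3, "d"), (1, "b"), (0, "a"), (9, "x")]]
```

    Packets 2 and 3 wait in the buffer until packet 1 arrives, at which point all three are delivered in order; the late duplicate of packet 0 is re-ACKed and packet 9 is ignored.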

    Selective repeat in action

    [Figure not recovered.]

    Selective repeat: dilemma

    Example:
      seq #'s: 0, 1, 2, 3
      window size = 3

    receiver sees no difference in the two scenarios!
    incorrectly passes duplicate data as new in (a)

    Q: what is the relationship between seq # size and window size?
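    One way to explore the question is to compare two sets: the sequence numbers the sender might retransmit if every ACK for its first window was lost, and the sequence numbers the receiver accepts as new after advancing its window. The check below is a sketch of that argument (the function name is invented); the standard textbook answer it illustrates is that the window size must be at most half the size of the sequence number space.

```python
def is_ambiguous(seq_space: int, window: int) -> bool:
    """True if a retransmitted old packet can be mistaken for a new one."""
    # Seq #s of the first window, which the sender may resend if all ACKs are lost.
    old = {n % seq_space for n in range(0, window)}
    # Seq #s the receiver accepts as new once it has received the whole first window.
    new = {n % seq_space for n in range(window, 2 * window)}
    return bool(old & new)


slide_example = is_ambiguous(seq_space=4, window=3)
half_window = is_ambiguous(seq_space=4, window=2)
```

    With four sequence numbers, a window of 3 is ambiguous (the slide's example) while a window of 2 is not.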


    TCP: Overview [RFCs 793, 1122, 1323, 2018, 2581]

    point-to-point:
      one sender, one receiver
    reliable, in-order byte stream:
      no "message boundaries"
    pipelined:
      TCP congestion and flow control set the window size
    send & receive buffers
    full duplex data:
      bi-directional data flow in the same connection
      MSS: maximum segment size
    connection-oriented:
      handshaking (exchange of control msgs) init's sender, receiver state before data exchange
    flow controlled:
      sender will not overwhelm receiver

    [Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its socket door.]

    More TCP Details

    Maximum Segment Size (MSS)
      depends upon the implementation (can often be set)
      the max amount of application-layer data in a segment
      Application Data + TCP Header = TCP Segment

    Three-way handshake
      client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment
      server responds with a second special TCP segment (again, no payload)
      client responds with a third special segment
        this one can contain payload

    Even More TCP Details

    A TCP connection between client and server creates, in both client and server:
      (i) buffers
      (ii) variables, and
      (iii) a socket connection to the process

    TCP only exists in the two end machines
      no buffers or variables are allocated to the connection in any of the network elements between the client and server

    TCP segment structure

    (each row is 32 bits wide)

      source port #                          | dest port #
      sequence number
      acknowledgement number
      head len | not used | U A P R S F      | receive window
      checksum                               | urg data pointer
      options (variable length)
      application data (variable length)

    sequence number, acknowledgement number: counting by bytes of data (not segments!)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)

    TCP seq. #'s and ACKs

    Seq. #'s:
      byte stream "number" of the first byte in the segment's data

    ACKs:
      seq # of the next byte expected from the other side
      cumulative ACK

    Q: how does the receiver handle out-of-order segments?
      A: the TCP spec doesn't say; it is up to the implementer

    simple telnet scenario (time runs downward):

      Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
      Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
      Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')

    TCP Round Trip Time and Timeout

    Q: how to estimate RTT?
      SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
      SampleRTT will vary; want the estimated RTT "smoother"
        average several recent measurements, not just the current SampleRTT

    Q: how to set the TCP timeout value?
      longer than RTT
        but RTT varies
      too short: premature timeout
        unnecessary retransmissions
      too long: slow reaction to segment loss

    TCP Round Trip Time and Timeout

    $$EstimatedRTT = (1 - \alpha) \cdot EstimatedRTT + \alpha \cdot SampleRTT$$

    exponential weighted moving average
    influence of a past sample decreases exponentially fast
    typical value: α = 0.125

    Example RTT estimation

    [Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT.]

    TCP Round Trip Time and Timeout

    Setting the timeout
      EstimatedRTT plus "safety margin"
        large variation in EstimatedRTT -> larger safety margin
      first, estimate how much SampleRTT deviates from EstimatedRTT:

    $$DevRTT = (1 - \beta) \cdot DevRTT + \beta \cdot |SampleRTT - EstimatedRTT|$$

    (typically, β = 0.25)

    Then set the timeout interval:

    $$TimeoutInterval = EstimatedRTT + 4 \cdot DevRTT$$
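    The three formulas can be combined into a small running estimator. Two choices in this sketch are assumptions rather than something the slides state: the estimate is seeded with the first sample (with DevRTT starting at 0), and DevRTT is updated using the EstimatedRTT from before the current sample is folded in.

```python
from typing import Iterable, List, Optional


def timeout_intervals(samples: Iterable[float],
                      alpha: float = 0.125,
                      beta: float = 0.25) -> List[float]:
    """Return the TimeoutInterval after each SampleRTT (same units as the samples)."""
    estimated: Optional[float] = None
    dev = 0.0
    result: List[float] = []
    for sample in samples:
        if estimated is None:
            estimated = sample  # assumption: seed with the first measurement
        else:
            # Deviation first, using the previous EstimatedRTT (assumption).
            dev = (1 - beta) * dev + beta * abs(sample - estimated)
            estimated = (1 - alpha) * estimated + alpha * sample
        result.append(estimated + 4 * dev)
    return result


intervals = timeout_intervals([100.0, 120.0, 80.0])  # SampleRTTs in msec
```

    For these three samples the timeout grows from 100 to 122.5 to 137.1875 msec even though the mean barely moves: the safety margin widens because the samples are jumpy.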


    TCP reliable data transfer

    TCP creates an rdt service on top of IP's unreliable service
    pipelined segments
    cumulative ACKs
    TCP uses a single retransmission timer

    Retransmissions are triggered by:
      timeout events
      duplicate ACKs

    Initially consider a simplified TCP sender:
      ignore duplicate ACKs
      ignore flow control, congestion control

    TCP sender events:

    data rcvd from app:
      create segment with seq #
      seq # is the byte-stream number of the first data byte in the segment
      start timer if not already running (think of the timer as for the oldest unACKed segment)
      expiration interval: TimeOutInterval

    timeout:
      retransmit the segment that caused the timeout
      restart timer

    ACK rcvd:
      if it acknowledges previously unACKed segments:
        update what is known to be ACKed
        start timer if there are outstanding segments

    TCP sender (simplified)

      NextSeqNum = InitialSeqNum
      SendBase = InitialSeqNum

      loop (forever)
        switch(event)

        event: data received from application above
          create TCP segment with sequence number NextSeqNum
          if (timer currently not running)
            start timer
          pass segment to IP
          NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
          retransmit not-yet-acknowledged segment with smallest sequence number
          start timer

        event: ACK received, with ACK field value of y
          if (y > SendBase)
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
              start timer

      end of loop forever

    Comment:
      SendBase-1: last cumulatively ACKed byte
    Example:
      SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
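    The simplified sender can be written as a small event-driven class. This is a sketch, not a TCP implementation: the timer is just a flag, `wire` is a log standing in for "pass segment to IP", and ACK values are assumed to fall on segment boundaries. All class and method names are invented for the example.

```python
from typing import Dict, List, Tuple


class SimpleTcpSender:
    """Simplified TCP sender: one timer, cumulative ACKs, no dup-ACK handling."""

    def __init__(self, initial_seq: int) -> None:
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked: Dict[int, bytes] = {}     # seq # -> segment data
        self.timer_running = False
        self.wire: List[Tuple[int, int]] = []   # (seq #, length) passed to IP

    def on_data(self, data: bytes) -> None:
        self.unacked[self.next_seq] = data
        if not self.timer_running:
            self.timer_running = True           # start timer
        self.wire.append((self.next_seq, len(data)))
        self.next_seq += len(data)

    def on_timeout(self) -> None:
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)                 # smallest unACKed seq #
        self.wire.append((seq, len(self.unacked[seq])))
        self.timer_running = True               # restart timer

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            self.send_base = y
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]           # now known to be ACKed
            self.timer_running = bool(self.unacked)


sender = SimpleTcpSender(initial_seq=92)
sender.on_data(b"x" * 8)     # Seq=92, 8 bytes
sender.on_data(b"y" * 20)    # Seq=100, 20 bytes
sender.on_timeout()          # only Seq=92 is retransmitted
sender.on_ack(100)
sender.on_ack(120)
```

    Unlike Go-Back-N, the timeout resends a single segment, the one with the smallest sequence number.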

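    TCP: retransmission scenarios

    lost ACK scenario (time runs downward):
      Host A -> Host B: Seq=92, 8 bytes data
      Host B -> Host A: ACK=100 (lost, X)
      Host A: timeout for Seq=92
      Host A -> Host B: Seq=92, 8 bytes data (retransmission)
      Host B -> Host A: ACK=100
      Host A: SendBase = 100

    premature timeout scenario (time runs downward):
      Host A -> Host B: Seq=92, 8 bytes data
      Host A -> Host B: Seq=100, 20 bytes data
      Host B -> Host A: ACK=100
      Host B -> Host A: ACK=120
      Host A: timeout for Seq=92 expires before the ACKs arrive
      Host A -> Host B: Seq=92, 8 bytes data (retransmission)
      Host A: SendBase = 100, then SendBase = 120 as the ACKs arrive
      Host B -> Host A: ACK=120
      Host A: SendBase = 120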

    TCP retransmission scenarios (more)

    cumulative ACK scenario (time runs downward):
      Host A -> Host B: Seq=92, 8 bytes data
      Host A -> Host B: Seq=100, 20 bytes data
      Host B -> Host A: ACK=100 (lost, X)
      Host B -> Host A: ACK=120 (arrives before the timeout)
      Host A: SendBase = 120; nothing is retransmitted

    TCP ACK generation [RFC 1122, RFC 2581]

    Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
    TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

    Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
    TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

    Event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected
    TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

    Event at receiver: arrival of segment that partially or completely fills gap
    TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap

    More on Sender Policies

    Doubling the Timeout Interval
      used by most TCP implementations
      if a timeout occurs, then after the retransmission the Timeout Interval is doubled
      intervals grow exponentially with each consecutive timeout
      when the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
      a limited form of congestion control

    Fast Retransmit

    Time-out period is often relatively long:
      long delay before resending a lost packet

    Detect lost segments via duplicate ACKs
      sender often sends many segments back-to-back
      if a segment is lost, there will likely be many duplicate ACKs

    If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
      fast retransmit: resend the segment before the timer expires

    Fast retransmit algorithm:

      event: ACK received, with ACK field value of y
        if (y > SendBase)
          SendBase = y
          if (there are currently not-yet-acknowledged segments)
            start timer
        else
          increment count of dup ACKs received for y
          if (count of dup ACKs received for y = 3)
            resend segment with sequence number y

    else branch: a duplicate ACK for an already ACKed segment
    resend: fast retransmit
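    The ACK handler above can be exercised with a short simulation. Only the duplicate-ACK counting is modelled; timers and the actual resend are left out, and the function name is invented for the example.

```python
from collections import Counter
from typing import List


def fast_retransmits(send_base: int, acks: List[int]) -> List[int]:
    """Return the sequence numbers that fast retransmit would resend."""
    resent: List[int] = []
    dup_acks: Counter = Counter()   # count of duplicate ACKs per ACK value y
    for y in acks:
        if y > send_base:
            send_base = y           # new data ACKed
        else:
            dup_acks[y] += 1        # duplicate ACK for already ACKed data
            if dup_acks[y] == 3:
                resent.append(y)    # resend segment with sequence number y
    return resent


resent = fast_retransmits(send_base=92, acks=[100, 100, 100, 100, 120])
```

    The first ACK=100 advances SendBase; the next three are duplicates, and the third one triggers the resend of the segment starting at byte 100.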

    TCP: GBN or Selective Repeat?

    Basic TCP looks a lot like GBN

    Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

    This looks a lot like Selective Repeat

    TCP is a hybrid


    TCP Flow Control

    Sender should not overwhelm the receiver's capacity to receive data
    If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
    Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

    TCP Flow Control

    flow control: sender won't overflow the receiver's buffer by transmitting too much, too fast

    receive side of a TCP connection has a receive buffer
      app process may be slow at reading from the buffer

    speed-matching service: matching the send rate to the receiving app's drain rate


    TCP Flow control: how it works

    (Suppose the TCP receiver discards out-of-order segments)

    spare room in buffer:

    $$RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]$$

    Rcvr advertises spare room by including the value of RcvWindow in segments
    Sender limits unACKed data to RcvWindow
      guarantees the receive buffer doesn't overflow
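    Both sides of this rule fit in two small functions. The receiver side follows the formula above; on the sender side, `last_byte_sent` and `last_byte_acked` are names introduced here for illustration and do not appear on the slide.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room the receiver advertises."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def may_send(nbytes: int, last_byte_sent: int, last_byte_acked: int,
             window: int) -> bool:
    """Sender rule: unACKed data, including the new bytes, must fit in the window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok_small = may_send(1000, last_byte_sent=5000, last_byte_acked=4000, window=window)
ok_large = may_send(1200, last_byte_sent=5000, last_byte_acked=4000, window=window)
```

    With 2000 bytes still sitting unread in a 4096-byte buffer, the advertised window is 2096 bytes; a sender with 1000 bytes already in flight may send 1000 more, but not 1200.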

    Technical Issue

    Suppose RcvWindow=0 and the receiver has already ACKed ALL packets in its buffer
    Sender does not transmit new packets until it hears RcvWindow>0
    Receiver never sends RcvWindow>0, since it has no new ACKs to send to the sender
    DEADLOCK

    Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow>0 will be transmitted back to the sender

    Note on UDP

    UDP has no flow control

    UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost


    TCP Connection Management

    Recall: TCP sender, receiver establish "connection" before exchanging data segments
      initialize TCP variables:
        seq. #s
        buffers, flow control info (e.g., RcvWindow)
      client: connection initiator
        Socket clientSocket = new Socket("hostname", "port number");
      server: contacted by client
        Socket connectionSocket = welcomeSocket.accept();

    Three-way handshake:

    Step 1: client end system sends TCP SYN control segment to server
      specifies client_isn, the initial seq #
      no application data

    Step 2: server end system receives SYN, replies with SYNACK control segment
      ACKs received SYN
      allocates buffers
      replies with client_isn+1 in ACK field to signal synchronization
      specifies server_isn
      no application data

    TCP Connection Management (cont)

    Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
        Allocates buffers
        Can include application data
        SYN=0 signals that connection is established
        server_isn+1 signals that client is synchronized

    [Figure: three-way handshake]
        client to server: Connection request (SYN=1, seq=client_isn)
        server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
        client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
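    The seq/ack arithmetic of the three segments can be checked with a small sketch. The Segment class and method names are hypothetical, and 32-bit sequence-number wraparound is ignored for brevity.

```java
// Sketch of the three handshake segments (hypothetical types; no wraparound).
final class HandshakeSketch {
    static final class Segment {
        final boolean syn;    // SYN flag
        final long seq;       // sequence number field
        final boolean hasAck; // whether the ack field is meaningful
        final long ack;       // acknowledgement number field

        Segment(boolean syn, long seq, boolean hasAck, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.hasAck = hasAck;
            this.ack = ack;
        }
    }

    // Step 1: SYN=1, seq=client_isn
    static Segment connectionRequest(long clientIsn) {
        return new Segment(true, clientIsn, false, 0);
    }

    // Step 2: SYN=1, seq=server_isn, ack=client_isn+1
    static Segment connectionGranted(Segment request, long serverIsn) {
        return new Segment(true, serverIsn, true, request.seq + 1);
    }

    // Step 3: SYN=0, seq=client_isn+1, ack=server_isn+1
    static Segment finalAck(Segment granted) {
        return new Segment(false, granted.ack, true, granted.seq + 1);
    }
}
```

    For example, with client_isn=100 and server_isn=300 the final ACK carries seq=101 and ack=301.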

    TCP Connection Management (cont)

    Closing a connection:

    client closes socket: clientSocket.close();

    Step 1: client end system sends TCP FIN control segment to server

    Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

    [Figure: close timeline between client and server: FIN, ACK, FIN, ACK; client enters timed wait, then closed]

    TCP Connection Management (cont)

    Step 3: client receives FIN, replies with ACK
        Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
        Closes down after timed wait

    Step 4: server receives ACK. Connection closed.

    Note: with small modification, can handle simultaneous FINs

    [Figure: close timeline between client and server: FIN, ACK, FIN, ACK; both sides closing; client enters timed wait; both closed]

    TCP Connection Management (cont)

    [Figures: example TCP server lifecycle; example TCP client lifecycle]
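    Since the lifecycle figures did not survive, here is a hedged sketch of the client side as a state machine, pieced together from the connection setup and close steps on the preceding slides. The state names follow the conventional TCP names (SYN_SENT, FIN_WAIT_1, and so on), which do not appear in the slide text; the event names are hypothetical.

```java
// Sketch of a TCP client lifecycle as a state machine (event names are hypothetical).
final class ClientLifecycleSketch {
    enum State { CLOSED, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT }
    enum Event { APP_CONNECT, RCV_SYNACK, APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRED }

    /** Next state; events that do not apply in the current state are ignored. */
    static State next(State s, Event e) {
        switch (s) {
            case CLOSED:      // app opens socket: send SYN
                return e == Event.APP_CONNECT ? State.SYN_SENT : s;
            case SYN_SENT:    // receive SYNACK: send ACK
                return e == Event.RCV_SYNACK ? State.ESTABLISHED : s;
            case ESTABLISHED: // app closes socket: send FIN
                return e == Event.APP_CLOSE ? State.FIN_WAIT_1 : s;
            case FIN_WAIT_1:  // server ACKs our FIN
                return e == Event.RCV_ACK ? State.FIN_WAIT_2 : s;
            case FIN_WAIT_2:  // server's FIN arrives: send ACK, start timed wait
                return e == Event.RCV_FIN ? State.TIME_WAIT : s;
            case TIME_WAIT:   // repeated FINs are re-ACKed; timer expiry closes
                return e == Event.TIMED_WAIT_EXPIRED ? State.CLOSED : s;
            default:
                return s;
        }
    }

    /** Runs a sequence of events starting from CLOSED. */
    static State run(Event... events) {
        State s = State.CLOSED;
        for (Event e : events) {
            s = next(s, e);
        }
        return s;
    }
}
```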

    A few special cases

    We have not discussed what happens if both client and server decide to close down the connection at the same time

    It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment


    Principles of Congestion Control

    Congestion:
        informally: "too many sources sending too much data too fast for network to handle"
        different from flow control!
        manifestations:
            lost packets (buffer overflow at routers)
            long delays (queuing in router buffers)
        a top-10 problem!

    Causes/costs of congestion: scenario 1

    two senders, two receivers
    one router, infinite buffers
    no retransmission
    Send rate: 0 to C/2
    large delays when congested
    maximum achievable throughput

    Causes/costs of congestion: scenario 2

    one router, finite buffers
    sender retransmission of lost packet

    (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
    (a) Magic transmission: only send when there's space in buffer
    (b) "perfect" retransmission only when loss: $\lambda'_{in} > \lambda_{out}$
    (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

    Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.

    "costs" of congestion:
        (b) and (c): more work (retrans) for given "goodput"
        (c): unneeded retransmissions: link carries multiple copies of pkt

    [Figure: panels (a), (b), (c)]

    Causes/costs of congestion: scenario 3

    four senders
    multihop paths
    timeout/retransmit

    Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

    Causes/costs of congestion: scenario 3

    Another "cost" of congestion:
        when packet dropped, any "upstream" transmission capacity used for that packet was wasted!

    Approaches towards congestion control

    Two broad approaches towards congestion control:

    End-end congestion control:
        no explicit feedback from network
        congestion inferred from end-system observed loss, delay
        approach taken by TCP

    Network-assisted congestion control:
        routers provide feedback to end systems
            single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
            explicit rate sender should send at

    Case study: ATM ABR congestion control

    ABR: available bit rate:
        "elastic service"
        if sender's path "underloaded": sender should use available bandwidth
        if sender's path congested: sender throttled to minimum guaranteed rate

    RM (resource management) cells:
        sent by sender, interspersed with data cells
        bits in RM cell set by switches ("network-assisted")
            NI bit: no increase in rate (mild congestion)
            CI bit: severe congestion indicator
        RM cells returned to sender by receiver, with bits intact
            small exception: see next page

    Case study: ATM ABR congestion control

    two-byte ER (explicit rate) field in RM cell
        congested switch may lower ER value in cell
        sender's send rate thus minimum supportable rate on path

    EFCI bit in data cells: set to 1 by congested switch
        Signals congestion
        if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


    TCP Congestion Control

    end-end control (no network assistance)
    transmission rate limited by congestion window size, CongWin, over segments
    CongWin dynamically modified to reflect perceived congestion

    [Figure: window of CongWin segments]

    w segments, each with MSS bytes, sent in one RTT:

    throughput = (w * MSS) / RTT bytes/sec
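    As a quick numeric check of the formula, here is a small sketch. The class name and the sample values (10 segments of 1460 bytes, 100 ms RTT) are assumptions chosen for illustration.

```java
// Sketch: throughput of a window of w segments sent once per RTT.
final class WindowThroughputSketch {
    /** Returns throughput in bytes per second. */
    static double bytesPerSecond(int w, int mssBytes, double rttSeconds) {
        return w * (double) mssBytes / rttSeconds;
    }

    public static void main(String[] args) {
        // 10 segments * 1460 bytes every 0.1 s, about 146000 bytes/sec
        System.out.println(bytesPerSecond(10, 1460, 0.1));
    }
}
```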

    To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

    Tools are "similar" to flow control: sender limits transmission using

        LastByteSent - LastByteAcked <= CongWin

    How does sender perceive congestion?
        loss event = timeout or 3 duplicate ACKs
        TCP sender reduces rate (CongWin) after loss event

    three mechanisms:
        AIMD = Additive Increase Multiplicative Decrease
        slow start = CongWin set to 1 and then grows exponentially
        conservative after timeout events

    TCP AIMD

    multiplicative decrease: cut CongWin in half after loss event

    additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance

    [Figure: sawtooth of congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
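    The sawtooth can be reproduced with a toy simulation. This sketch assumes, purely for illustration, that a loss event occurs whenever the window reaches a fixed size; the window is counted in MSS units and all names are hypothetical.

```java
// Toy AIMD trace: +1 MSS per RTT, halve on loss (loss assumed at a fixed window size).
final class AimdSketch {
    static int[] simulate(int startWindow, int lossWindow, int rtts) {
        int[] trace = new int[rtts];
        int congWin = startWindow;
        for (int i = 0; i < rtts; i++) {
            trace[i] = congWin;
            if (congWin >= lossWindow) {
                congWin = Math.max(1, congWin / 2); // multiplicative decrease
            } else {
                congWin += 1;                       // additive increase
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(simulate(4, 8, 10)));
    }
}
```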

    TCP Slow Start

    When connection begins, CongWin = 1 MSS
        Example: MSS = 500 bytes & RTT = 200 msec
        initial rate = 20 kbps

    available bandwidth may be >> MSS/RTT
        desirable to quickly ramp up to respectable rate

    When connection begins, increase rate exponentially fast until first loss event

    TCP Slow Start (more)

    When connection begins, increase rate exponentially until first loss event:
        double CongWin every RTT
        done by incrementing CongWin for every ACK received

    Summary: initial rate is slow but ramps up exponentially fast

    [Figure: Host A sends one segment, then two segments, then four segments to Host B in successive RTTs]
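    A short sketch of why one increment per ACK doubles the window each RTT, together with the 20 kbps initial rate from the slow-start example (MSS = 500 bytes, RTT = 200 msec). It assumes one ACK per segment and no loss; the names are hypothetical.

```java
// Sketch: slow start grows CongWin by one MSS per ACK, i.e. doubles it per RTT.
final class SlowStartSketch {
    /** CongWin in bytes after the given number of loss-free RTTs. */
    static long windowAfterRtts(long mssBytes, int rtts) {
        long congWin = mssBytes; // connection begins with CongWin = 1 MSS
        for (int r = 0; r < rtts; r++) {
            long acks = congWin / mssBytes; // one ACK per segment sent this RTT
            for (long a = 0; a < acks; a++) {
                congWin += mssBytes;        // increment for every ACK received
            }
        }
        return congWin;
    }

    /** Sending rate in bits per second for a window sent once per RTT. */
    static double rateBps(long congWinBytes, double rttSeconds) {
        return congWinBytes * 8 / rttSeconds;
    }

    public static void main(String[] args) {
        System.out.println(rateBps(500, 0.2));       // initial rate, about 20 kbps
        System.out.println(windowAfterRtts(500, 3)); // 500 -> 1000 -> 2000 -> 4000
    }
}
```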

    So Far
        Slow-Start ramps up exponentially
        Followed by AIMD sawtooth pattern

    Reality (TCP Reno)
        Introduce new variable threshold
        threshold initially very large
        Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
        Two different types of loss events
            3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
            Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start

    Reason for treating 3 dup ACKs differently than timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

    Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 after a loss event

    TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery

    Summary: TCP Congestion Control

    When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

    When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

    When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

    When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)

    The Big Picture

    TCP sender congestion control

    | Event | State | TCP Sender Action | Commentary |
    |---|---|---|---|
    | ACK receipt for previously unACKed data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
    | ACK receipt for previously unACKed data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
    | Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS |
    | Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
    | Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being ACKed | CongWin and Threshold not changed |
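    The event table above maps almost directly onto code. The following is a simplified sketch, not a real TCP implementation: the class, field and method names are hypothetical, windows are kept in bytes as doubles, and retransmission itself is omitted.

```java
// Sketch of the Reno sender's congestion-control reactions (hypothetical names).
final class RenoSenderSketch {
    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    final double mss;
    double congWin;
    double threshold;
    State state = State.SLOW_START;
    int dupAcks = 0;

    RenoSenderSketch(double mss, double initialThreshold) {
        this.mss = mss;
        this.congWin = mss; // connection begins with CongWin = 1 MSS
        this.threshold = initialThreshold;
    }

    /** ACK receipt for previously unACKed data. */
    void onNewAck() {
        dupAcks = 0;
        if (state == State.SLOW_START) {
            congWin += mss; // doubles CongWin every RTT
            if (congWin > threshold) {
                state = State.CONGESTION_AVOIDANCE;
            }
        } else {
            congWin += mss * (mss / congWin); // about +1 MSS per RTT
        }
    }

    /** Duplicate ACK: only count it; the third one is a loss event. */
    void onDuplicateAck() {
        dupAcks++;
        if (dupAcks == 3) {
            threshold = congWin / 2;
            congWin = Math.max(threshold, mss); // never below 1 MSS
            state = State.CONGESTION_AVOIDANCE; // fast recovery: skip slow start
            dupAcks = 0;
        }
    }

    /** Timeout: the more alarming loss event. */
    void onTimeout() {
        threshold = congWin / 2;
        congWin = mss;
        state = State.SLOW_START;
        dupAcks = 0;
    }
}
```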

    TCP throughput

    What's the average throughput of TCP as a function of window size and RTT?
        Ignore slow start

    Let W be the window size when loss occurs
    When window is W, throughput is W/RTT
    Just after loss, window drops to W/2, throughput to W/(2 RTT)
    Average throughput: 0.75 W/RTT
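    The 0.75 factor follows if we assume the window climbs linearly from W/2 back to W between loss events (the AIMD sawtooth), so the average is the midpoint of the two extremes:

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\,\frac{W}{RTT}
\]
```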

    TCP Futures

    Example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
    Requires window size W = 83,333 in-flight segments
    Throughput in terms of loss rate L:

        $\text{throughput} = \dfrac{1.22 \cdot MSS}{RTT \sqrt{L}}$

    so L = 2·10^-10. Wow!
    New versions of TCP for high-speed needed!
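    Both numbers in this example can be reproduced from the formulas on the slides. In this sketch the loss rate is obtained by solving throughput = 1.22·MSS/(RTT·√L) for L; the class and method names are hypothetical.

```java
// Sketch: window and loss rate needed for a target throughput.
final class HighSpeedTcpSketch {
    /** W = throughput * RTT / MSS, with MSS converted to bits. */
    static double requiredWindowSegments(double throughputBps, double rttSec, double mssBytes) {
        return throughputBps * rttSec / (mssBytes * 8);
    }

    /** L = (1.22 * MSS / (RTT * throughput))^2, with MSS in bits. */
    static double tolerableLossRate(double throughputBps, double rttSec, double mssBytes) {
        double x = 1.22 * mssBytes * 8 / (rttSec * throughputBps);
        return x * x;
    }

    public static void main(String[] args) {
        System.out.println(requiredWindowSegments(10e9, 0.1, 1500)); // about 83,333
        System.out.println(tolerableLossRate(10e9, 0.1, 1500));      // about 2e-10
    }
}
```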

    TCP Fairness

    Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

    [Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

    Why is TCP fair?

    Two competing sessions:
        Additive increase gives slope of 1, as throughput increases
        multiplicative decrease decreases throughput proportionally

    [Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
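    The convergence argument can be tried out numerically. This toy sketch assumes both sessions see a loss in the same round whenever their combined rate exceeds the link capacity, which is an idealization; names and units are hypothetical.

```java
// Toy model: two AIMD flows sharing one bottleneck (synchronized losses assumed).
final class FairnessSketch {
    /** Returns the two rates after the given number of rounds (RTTs). */
    static double[] simulate(double x1, double x2, double capacity, int rounds) {
        for (int i = 0; i < rounds; i++) {
            if (x1 + x2 > capacity) {
                x1 /= 2; // loss: both halve, so the gap between them halves too
                x2 /= 2;
            } else {
                x1 += 1; // additive increase: slope 1, gap unchanged
                x2 += 1;
            }
        }
        return new double[] {x1, x2};
    }

    public static void main(String[] args) {
        double[] r = simulate(80, 10, 100, 1000);
        System.out.println(r[0] + " " + r[1]); // the two rates end up nearly equal
    }
}
```

    Each loss halves the difference between the two rates while additive increase leaves it unchanged, which is why the trajectory drifts toward the equal-share line.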

    Fairness (more)

    Fairness and UDP
        Multimedia apps often do not use TCP
            do not want rate throttled by congestion control
        Instead use UDP:
            pump audio/video at constant rate, tolerate packet loss
        Current research area: how to keep UDP from congesting the Internet

    Fairness and parallel TCP connections
        nothing prevents app from opening parallel connections between 2 hosts
        Web browsers do this
        Example: link of rate R supporting 9 connections
            new app asks for 1 TCP, gets rate R/10
            new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)

    TCP Latency Modeling

    Notation, assumptions
        Assume one link between client and server of rate R
        S: MSS (bits)
        O: object size (bits)
        no retransmissions (no loss, no corruption)

    Window size:
        First assume: fixed congestion window, W segments
        Then dynamic window, modeling slow start

    Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

    Ignoring congestion, delay is influenced by:
        TCP connection establishment
        data transmission delay
        slow start

    Fixed Congestion Window (W)

    Two cases
    1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
        Latency = 2RTT + O/R
    2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
        Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

    Fixed congestion window (1)

    First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

    latency = 2RTT + O/R

    Fixed congestion window (2)

    Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

    latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
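    The two cases can be folded into one function. The slides do not define K for the fixed-window case; this sketch assumes K = ceil(O / (W·S)), the number of windows needed to cover the object. Sizes are in bits, R in bits/sec, and all names are hypothetical.

```java
// Sketch: latency with a fixed congestion window of W segments.
final class FixedWindowLatencySketch {
    static double latency(double objectBits, double segmentBits, double rateBps,
                          double rttSec, int windowSegments) {
        double base = 2 * rttSec + objectBits / rateBps;            // 2RTT + O/R
        double windowTime = windowSegments * segmentBits / rateBps; // WS/R
        double ackTime = rttSec + segmentBits / rateBps;            // RTT + S/R
        if (windowTime >= ackTime) {
            return base; // case 1: ACKs return in time, server never stalls
        }
        // case 2: server stalls after each of the first K-1 windows
        long k = (long) Math.ceil(objectBits / (windowSegments * segmentBits)); // assumed definition of K
        return base + (k - 1) * (ackTime - windowTime);
    }

    public static void main(String[] args) {
        // O = 100 kbit, S = 1 kbit, R = 1 Mbps, RTT = 100 ms
        System.out.println(latency(100000, 1000, 1e6, 0.1, 4));   // stalls 24 times
        System.out.println(latency(100000, 1000, 1e6, 0.1, 200)); // no stalls
    }
}
```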

    TCP Latency Modeling Slow Start (1)

    Now suppose window grows according to slow start (with no threshold and no loss events)

    Will show that the delay for one object is:

    $$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

    where P is the number of times TCP idles at server:

    $$P = \min\{Q, K-1\}$$

    - where Q is the number of times the server idles if the object were of infinite size
    - and K is the number of windows that cover the object

    TCP Latency Modeling Slow Start (2)

    Delay components:
        2 RTT for connection estab and request
        O/R to transmit object
        time server idles due to slow start

    Server idles: P = min{K-1, Q} times

    Example:
        O/S = 15 segments
        K = 4 windows
        Q = 2
        P = min{K-1, Q} = 2

    Server idles P=2 times

    [Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

    TCP Latency Modeling (3)

    $\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

    $2^{k-1}\frac{S}{R}$ = time to transmit the kth window

    $\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

    $$
    \begin{aligned}
    \text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
                 &= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
                 &= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
    \end{aligned}
    $$

    [Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

    TCP Latency Modeling (4)

    Recall K = number of windows that cover object

    How do we calculate K?

    $$
    \begin{aligned}
    K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
      &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
      &= \min\{k : 2^{k} - 1 \ge O/S\} \\
      &= \min\{k : k \ge \log_2(O/S + 1)\} \\
      &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
    \end{aligned}
    $$

    Calculation of Q, number of idles for infinite-size object, is similar
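    To tie K, Q, P and the latency formula together, here is a hedged sketch. K is found by summing window sizes 1, 2, 4, ... until the object is covered; Q is found by counting windows whose idle time S/R + RTT - 2^(k-1)·S/R is still positive, which is this sketch's reading of "calculation of Q is similar". The sample values (S = 1000 bits, R = 10 kbps, RTT = 0.2 s, O = 15 segments) are assumptions chosen so that K = 4 and Q = 2, as in the slide example.

```java
// Sketch: slow-start latency model, sizes in bits and R in bits/sec (hypothetical names).
final class SlowStartLatencySketch {
    /** K: number of windows (1, 2, 4, ... segments) needed to cover the object. */
    static int windowsToCover(long objectSegments) {
        int k = 0;
        long sent = 0;
        long window = 1;
        while (sent < objectSegments) {
            sent += window;
            window *= 2;
            k++;
        }
        return k;
    }

    /** Q: number of times the server would idle for an infinite-size object. */
    static int idlesForInfiniteObject(double segmentBits, double rateBps, double rttSec) {
        double sOverR = segmentBits / rateBps;
        int q = 0;
        // idle time after window q+1 is S/R + RTT - 2^q * S/R
        while (sOverR + rttSec - Math.pow(2, q) * sOverR > 0) {
            q++;
        }
        return q;
    }

    /** Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R, with P = min{Q, K-1}. */
    static double latency(double objectBits, double segmentBits, double rateBps, double rttSec) {
        int k = windowsToCover((long) Math.ceil(objectBits / segmentBits));
        int q = idlesForInfiniteObject(segmentBits, rateBps, rttSec);
        int p = Math.min(q, k - 1);
        double sOverR = segmentBits / rateBps;
        return 2 * rttSec + objectBits / rateBps
                + p * (rttSec + sOverR) - (Math.pow(2, p) - 1) * sOverR;
    }

    public static void main(String[] args) {
        System.out.println(windowsToCover(15));                       // K
        System.out.println(idlesForInfiniteObject(1000, 10000, 0.2)); // Q
        System.out.println(latency(15000, 1000, 10000, 0.2));         // seconds
    }
}
```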

    HTTP Modeling

    Assume Web page consists of:
        1 base HTML page (of size O bits)
        M images (each of size O bits)

    Non-persistent HTTP:
        M+1 TCP connections in series
        Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

    Persistent HTTP:
        2 RTT to request and receive base HTML file
        1 RTT to request and receive M images
        Response time = (M+1)O/R + 3RTT + sum of idle times

    Non-persistent HTTP with X parallel connections
        Suppose M/X integer
        1 TCP connection for base file
        M/X sets of parallel connections for images
        Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
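    The three response-time formulas can be compared side by side. This sketch leaves out the "sum of idle times" term (so it understates the slow-start cost) and assumes 5 Kbytes means 40,000 bits; the class and method names are hypothetical.

```java
// Sketch: HTTP response-time formulas without the slow-start idle-time term.
final class HttpResponseTimeSketch {
    static double nonPersistent(int m, double objectBits, double rateBps, double rttSec) {
        return (m + 1) * objectBits / rateBps + (m + 1) * 2 * rttSec;
    }

    static double persistent(int m, double objectBits, double rateBps, double rttSec) {
        return (m + 1) * objectBits / rateBps + 3 * rttSec;
    }

    /** Assumes x divides m evenly, as on the slide. */
    static double parallelNonPersistent(int m, int x, double objectBits, double rateBps, double rttSec) {
        return (m + 1) * objectBits / rateBps + (m / x + 1) * 2 * rttSec;
    }

    public static void main(String[] args) {
        // RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, R = 1 Mbps
        System.out.println(nonPersistent(10, 40000, 1e6, 0.1));
        System.out.println(persistent(10, 40000, 1e6, 0.1));
        System.out.println(parallelNonPersistent(10, 5, 40000, 1e6, 0.1));
    }
}
```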

    HTTP Response time (in seconds)

    RTT = 100 msec, O = 5 Kbytes, M=10 and X=5

    [Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

    For low bandwidth, connection & response time dominated by transmission time
    Persistent connections only give minor improvement over parallel connections

    HTTP Response time (in seconds)

    RTT = 1 sec, O = 5 Kbytes, M=10 and X=5

    [Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

    For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks

    Chapter 3 Summary

    principles behind transport layer services:
        multiplexing, demultiplexing
        reliable data transfer
        flow control
        congestion control

    instantiation and implementation in the Internet
        UDP
        TCP

    Next:
        leaving the network "edge" (application, transport layers)
        into the network "core"


      3 Transport Layer 3Comp 361 Spring 2005

      Transport services and protocolsprovide logical communicationbetween app processes running on different hoststransport protocols run in end systems

      send side breaks app messages into segments passes to network layerrcv side reassembles segments into messages passes to app layer

      more than one transport protocol available to apps

      Internet TCP and UDP

      applicationtransportnetworkdata linkphysical

      applicationtransportnetworkdata linkphysical

      networkdata linkphysical

      networkdata linkphysical

      networkdata linkphysical

      networkdata linkphysicalnetwork

      data linkphysical

      logical end-end transport

      3 Transport Layer 4Comp 361 Spring 2005

      Transport vs network layerHousehold analogy12 kids sending letters

      to 12 kidsprocesses = kidsapp messages = letters in envelopeshosts = housestransport protocol = Ann and Billnetwork-layer protocol = postal service

      network layer logical communication between hoststransport layer logical communication between processes

      relies on enhances network layer services

      3 Transport Layer 5Comp 361 Spring 2005

      Transport-layer protocols

      Internet transport servicesreliable in-order unicastdelivery (TCP)

      congestion flow controlconnection setup

      unreliable (ldquobest-effortrdquo) unordered unicast or multicast delivery UDPservices not available

      real-timebandwidth guaranteesreliable multicast

      applicationtransportnetworkdata linkphysical

      applicationtransportnetworkdata linkphysical

      networkdata linkphysical

      networkdata linkphysical

      networkdata linkphysical

      networkdata linkphysicalnetwork

      data linkphysical

      logical end-end transport

      3 Transport Layer 6Comp 361 Spring 2005

      Chapter 3 outline

      31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

      35 Connection-oriented transport TCP

      segment structurereliable data transferflow controlconnection management

      36 Principles of congestion control37 TCP congestion control

      3 Transport Layer 7Comp 361 Spring 2005

      Multiplexingdemultiplexinggathering data from multiplesockets enveloping data with header (later used for demultiplexing)

      Multiplexing at send host

      delivering received segmentsto correct socket

      Demultiplexing at rcv host

      = socket = process

      application

      transport

      network

      link

      physical

      P1 application

      transport

      network

      link

      physical

      application

      transport

      network

      link

      physical

      P2P3 P4P1

      host 1 host 2 host 3

      3 Transport Layer 8Comp 361 Spring 2005

      Multiplexingdemultiplexingsegment - unit of data

      exchanged between transport layer entities

      aka TPDU transport protocol data unit

      Demultiplexing delivering received segments to correct app layer processes

      receiver

      applicationtransportnetwork

      M P2applicationtransportnetwork

      HtHn segment

      segment Mapplicationtransportnetwork

      P1M

      M MP3 P4

      segmentheader

      application-layerdata

      3 Transport Layer 9Comp 361 Spring 2005

      How demultiplexing workshost receives IP datagrams

      each datagram has source IP address destination IP addresseach datagram carries 1 transport-layer segmenteach segment has source destination port number (recall well-known port numbers for specific applications)

      host uses IP addresses amp port numbers to direct segment to appropriate socket

      source port dest port

      32 bits

      applicationdata

      (message)

      other header fields

      TCPUDP segment format

      3 Transport Layer 10Comp 361 Spring 2005

      Connectionless demultiplexingWhen host receives UDP segment

      checks destination port number in segmentdirects UDP segment to socket with that port number

      IP datagrams with different source IP addresses andor source port numbers directed to same socket

      Create sockets with port numbers

      DatagramSocket mySocket1 = new DatagramSocket(99111)

      DatagramSocket mySocket2 = new DatagramSocket(99222)

      UDP socket identified by two-tuple

      (dest IP address dest port number)

      3 Transport Layer 11Comp 361 Spring 2005

      Connectionless demux (cont)DatagramSocket serverSocket = new DatagramSocket(6428)

      ClientIPB

      P3

      clientIP A

      P1P1P3

      serverIP C

      SP 6428DP 9157

      SP 9157DP 6428

      SP 6428DP 5775

      SP 5775DP 6428

      SP provides ldquoreturn addressrdquo

      3 Transport Layer 12Comp 361 Spring 2005

      Connection-oriented demux

      TCP socket identified by 4-tuple

      source IP addresssource port numberdest IP addressdest port number

      recv host uses all four values to direct segment to appropriate socket

      Server host may support many simultaneous TCP sockets

      each socket identified by its own 4-tuple

      Web servers have different sockets for each connecting client

      non-persistent HTTP will have different socket for each request

      3 Transport Layer 13Comp 361 Spring 2005

      Connection-oriented demux(cont)

      ClientIPB

      P3

      clientIP A

      P1P1P3

      serverIP C

      SP 80DP 9157

      SP 9157DP 80

      SP 80DP 5775

      SP 5775DP 80

      P4

      3 Transport Layer 14Comp 361 Spring 2005

      Connection-oriented demux Threaded Web Server

      ClientIPB

      P1

      clientIP A

      P1P2

      serverIP C

      SP 9157DP 80

      SP 9157DP 80

      P4 P3

      D-IPCS-IP AD-IPC

      S-IP B

      SP 5775DP 80

      D-IPCS-IP B

      3 Transport Layer 15Comp 361 Spring 2005

      Chapter 3 outline

      31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

      35 Connection-oriented transport TCP

      segment structurereliable data transferflow controlconnection management

      36 Principles of congestion control37 TCP congestion control

      3 Transport Layer 16Comp 361 Spring 2005

      UDP User Datagram Protocol [RFC 768]

      ldquono frillsrdquo ldquobare bonesrdquoInternet transport protocolldquobest effortrdquo service UDP segments may be

      lostdelivered out of order to app

      connectionlessno handshaking between UDP sender receivereach UDP segment handled independently of others

      Why is there a UDPno connection establishment (which can add delay)simple no connection state at sender receiversmall segment header (8 Bytes)no congestion control UDP can blast away as fast as desired

      3 Transport Layer 17Comp 361 Spring 2005

      UDP moreoften used for streaming multimedia apps

      loss tolerantrate sensitive

      other UDP uses (why)

      DNS small delaySNMP stressful cond

      reliable transfer over UDP add reliability at application layer

      application-specific error recover

      source port dest port

      32 bits

      Applicationdata

      (message)

      length checksumLength in

      bytes of UDPsegmentincluding

      header

      UDP segment format

      3 Transport Layer 18Comp 361 Spring 2005

      UDP checksumGoal detect ldquoerrorsrdquo (egflipped bits) in transmitted

      segment

      Receivercompute checksum of received segmentcheck if computed checksum equals checksum field value

      NO - error detectedYES - no error detected But maybe errors nonetheless More later

      Receiver may choose to discard segment or send a warning to app in case error

      Sendertreat segment contents as sequence of 16-bit integerschecksum addition (1rsquo s complement sum) of segment contentssender puts checksum value into UDP checksum field

      3 Transport Layer 19Comp 361 Spring 2005

      Chapter 3 outline

      31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

      35 Connection-oriented transport TCP

      segment structurereliable data transferflow controlconnection management

      36 Principles of congestion control37 TCP congestion control

      3 Transport Layer 20Comp 361 Spring 2005

      Principles of Reliable data transferimportant in app transport link layerstop-10 list of important networking topics

      characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

      3 Transport Layer 21Comp 361 Spring 2005

      Reliable data transfer getting started

      sendside

      receiveside

      rdt_send() called from above (eg by app) Passed data to

      deliver to receiver upper layer

      udt_send() called by rdtto transfer packet over

      unreliable channel to receiver

      rdt_rcv() called when packet arrives on rcv-side of channel

      deliver_data() called by rdt to deliver data to upper

      3 Transport Layer 22Comp 361 Spring 2005

      Reliable data transfer getting startedWersquoll

      incrementally develop sender receiver sides of reliable data transfer protocol (rdt)consider only unidirectional data transfer

      but control info will flow on both directionsuse finite state machines (FSM) to specify sender receiver

      state1

      state2

      event causing state transitionactions taken on state transition

      state when in this ldquostaterdquo next state

      uniquely determined by next event

      eventactions

      Incremental improvements

      rdt1.0: assumes every packet sent arrives and no errors are introduced in transmission
      rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet; introduces the concepts of ACK and NAK
      rdt2.1: deals with corrupted ACKs/NAKs
      rdt2.2: like rdt2.1 but does not need NAKs
      rdt3.0: allows packets to be lost

      rdt1.0: reliable transfer over a reliable channel

      underlying channel perfectly reliable
          no bit errors
          no loss of packets
      separate FSMs for sender, receiver:
          sender sends data into underlying channel
          receiver reads data from underlying channel

      sender (single state: Wait for call from above)
          rdt_send(data)
              packet = make_pkt(data)
              udt_send(packet)

      receiver (single state: Wait for call from below)
          rdt_rcv(packet)
              extract(packet, data)
              deliver_data(data)

      rdt2.0: channel with bit errors

      underlying channel may flip bits in packet
          recall: UDP checksum to detect bit errors
      the question: how to recover from errors?
          acknowledgements (ACKs): receiver explicitly tells sender that pkt was received OK
          negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
          sender retransmits pkt on receipt of NAK
          human scenarios using ACKs, NAKs?
      new mechanisms in rdt2.0 (beyond rdt1.0):
          error detection
          receiver feedback: control msgs (ACK, NAK) rcvr -> sender
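The error-detection step can be illustrated with a 16-bit one's-complement checksum in the style of the Internet checksum. This is a minimal sketch: the two-byte header layout used by make_pkt and the helper names are assumptions made for illustration, not the format of any real protocol.

```python
def internet_checksum(data: bytes) -> int:
    """Return the complement of the 16-bit one's-complement sum of data."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # wrap the carry around
    return ~total & 0xFFFF


def make_pkt(data: bytes) -> bytes:
    """Hypothetical packet layout: 2-byte checksum followed by the data."""
    return internet_checksum(data).to_bytes(2, "big") + data


def corrupt(pkt: bytes) -> bool:
    """True if the stored checksum does not match the received data."""
    stored = int.from_bytes(pkt[:2], "big")
    return internet_checksum(pkt[2:]) != stored


pkt = make_pkt(b"hello")
flipped = pkt[:3] + bytes([pkt[3] ^ 0x01]) + pkt[4:]   # flip one data bit
print(corrupt(pkt), corrupt(flipped))
```

A receiver in the rdt2.0 style would send NAK when corrupt() is true and ACK otherwise.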

      rdt2.0: FSM specification

      sender
          state: Wait for call from above
              rdt_send(data)
                  sndpkt = make_pkt(data, checksum)
                  udt_send(sndpkt)
                  -> Wait for ACK or NAK
          state: Wait for ACK or NAK
              rdt_rcv(rcvpkt) && isNAK(rcvpkt)
                  udt_send(sndpkt)
              rdt_rcv(rcvpkt) && isACK(rcvpkt)
                  Λ
                  -> Wait for call from above

      receiver
          state: Wait for call from below
              rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                  udt_send(NAK)
              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
                  extract(rcvpkt, data)
                  deliver_data(data)
                  udt_send(ACK)

      [Figure: rdt2.0 operation with no errors (the rdt2.0 FSM, tracing the path taken when the packet arrives uncorrupted and the receiver replies with ACK)]

      [Figure: rdt2.0 error scenario (the rdt2.0 FSM, tracing the path taken when the packet arrives corrupted, the receiver replies with NAK, and the sender retransmits)]

      rdt2.0 has a fatal flaw!

      What happens if ACK/NAK is corrupted?
          sender doesn't know what happened at receiver!
          can't just retransmit: possible duplicate. But receiver is waiting!

      What to do?
          sender ACKs/NAKs receiver's ACK/NAK? What if sender's ACK/NAK is corrupted?
          retransmit, but this might cause retransmission of a correctly received pkt! Receiver won't know about the duplication!

      Handling duplicates:
          sender adds sequence number (0, 1) to each pkt
          sender retransmits current pkt if ACK/NAK garbled
          receiver discards (doesn't deliver up) duplicate pkt
          a duplicate packet is one with the same sequence # as the previous packet

      stop and wait
          sender sends one packet, then waits for receiver response

      Sender: whenever the sender receives a control message, it sends a packet to the receiver
          a valid ACK: sends next packet (if one exists) with a new sequence #
          a NAK or corrupt response: resends old packet

      Receiver: sends ACK/NAK to sender
          if received packet is corrupt: send NAK
          if received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
          if received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmission or duplicate: send ACK

      Note: ACK/NAK do not contain a sequence #

      rdt2.1: sender, handles garbled ACK/NAKs

      state: Wait for call 0 from above
          rdt_send(data)
              sndpkt = make_pkt(0, data, checksum)
              udt_send(sndpkt)
              -> Wait for ACK or NAK 0
      state: Wait for ACK or NAK 0
          rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
              udt_send(sndpkt)
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
              Λ
              -> Wait for call 1 from above
      state: Wait for call 1 from above
          rdt_send(data)
              sndpkt = make_pkt(1, data, checksum)
              udt_send(sndpkt)
              -> Wait for ACK or NAK 1
      state: Wait for ACK or NAK 1
          rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
              udt_send(sndpkt)
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
              Λ
              -> Wait for call 0 from above

      rdt2.1: receiver, handles garbled ACK/NAKs

      state: Wait for 0 from below
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
              extract(rcvpkt, data)
              deliver_data(data)
              sndpkt = make_pkt(ACK, chksum)
              udt_send(sndpkt)
              -> Wait for 1 from below
          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
              sndpkt = make_pkt(NAK, chksum)
              udt_send(sndpkt)
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
              sndpkt = make_pkt(ACK, chksum)
              udt_send(sndpkt)
      state: Wait for 1 from below
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
              extract(rcvpkt, data)
              deliver_data(data)
              sndpkt = make_pkt(ACK, chksum)
              udt_send(sndpkt)
              -> Wait for 0 from below
          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
              sndpkt = make_pkt(NAK, chksum)
              udt_send(sndpkt)
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
              sndpkt = make_pkt(ACK, chksum)
              udt_send(sndpkt)

      rdt2.1: discussion

      Sender:
          seq # added to pkt
          two seq #'s (0, 1) will suffice. Why?
          must check if received ACK/NAK is corrupted
          twice as many states
              state must "remember" whether "current" pkt has seq # 0 or 1

      Receiver:
          must check if received packet is a duplicate
              state indicates whether 0 or 1 is the expected pkt seq #
          note: receiver cannot know if its last ACK/NAK was received OK at sender
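The receiver's duplicate check can be sketched as a one-bit state machine. This is an illustrative model only: corruption is passed in as a flag rather than detected by a checksum, and the class and attribute names are assumptions.

```python
class Rdt21Receiver:
    """Alternating-bit receiver in the style of rdt2.1 (illustrative sketch)."""

    def __init__(self):
        self.expected_seq = 0      # state: waiting for 0 or 1 from below
        self.delivered = []        # stands in for deliver_data()

    def rdt_rcv(self, seq, data, is_corrupt=False):
        """Process one arriving packet and return the control message sent back."""
        if is_corrupt:
            return "NAK"
        if seq == self.expected_seq:
            self.delivered.append(data)   # new data: deliver it up
            self.expected_seq ^= 1        # flip the expected sequence bit
        # a duplicate (wrong seq #) is ACKed again but not delivered
        return "ACK"


rcv = Rdt21Receiver()
replies = [rcv.rdt_rcv(0, "a"), rcv.rdt_rcv(0, "a"), rcv.rdt_rcv(1, "b", is_corrupt=True), rcv.rdt_rcv(1, "b")]
print(replies, rcv.delivered)
```

The second arrival of packet 0 models a retransmission caused by a garbled ACK: it is acknowledged again but delivered only once.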

      rdt2.2: a NAK-free protocol

      same functionality as rdt2.1, using ACKs only
      instead of NAK, receiver sends ACK for last pkt received OK
          receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s are included in data packets but not in ACKs/NAKs)
      duplicate ACK at sender results in same action as NAK: retransmit current pkt

      rdt2.2: sender, receiver fragments

      sender FSM fragment
          state: Wait for call 0 from above
              rdt_send(data)
                  sndpkt = make_pkt(0, data, checksum)
                  udt_send(sndpkt)
                  -> Wait for ACK 0
          state: Wait for ACK 0
              rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 1) )
                  udt_send(sndpkt)
              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
                  Λ

      receiver FSM fragment
          transition into state: Wait for 0 from below
              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                  extract(rcvpkt, data)
                  deliver_data(data)
                  sndpkt = make_pkt(ACK1, chksum)
                  udt_send(sndpkt)
          state: Wait for 0 from below
              rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
                  udt_send(sndpkt)

      rdt3.0: channels with errors and loss

      New assumption: underlying channel can also lose packets (data or ACKs)
          checksum, seq #, ACKs, retransmissions will be of help, but not enough

      Q: how to deal with loss?
          sender waits until certain data or ACK is lost, then retransmits
          yuck: drawbacks?

      Approach: sender waits a "reasonable" amount of time for ACK
          retransmits if no ACK received in this time (retransmissions are triggered only by timeouts)
          if pkt (or ACK) is just delayed (not lost):
              retransmission will be a duplicate, but use of seq #'s already handles this
              receiver must specify seq # of pkt being ACKed
          requires countdown timer

      rdt3.0 sender

      state: Wait for call 0 from above
          rdt_send(data)
              sndpkt = make_pkt(0, data, checksum)
              udt_send(sndpkt)
              start_timer
              -> Wait for ACK0
          rdt_rcv(rcvpkt)
              Λ
      state: Wait for ACK0
          rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 1) )
              Λ
          timeout
              udt_send(sndpkt)
              start_timer
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
              stop_timer
              -> Wait for call 1 from above
      state: Wait for call 1 from above
          rdt_send(data)
              sndpkt = make_pkt(1, data, checksum)
              udt_send(sndpkt)
              start_timer
              -> Wait for ACK1
          rdt_rcv(rcvpkt)
              Λ
      state: Wait for ACK1
          rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 0) )
              Λ
          timeout
              udt_send(sndpkt)
              start_timer
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
              stop_timer
              -> Wait for call 0 from above
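The interaction of timeouts, retransmissions, and sequence bits can be traced with a small simulation. It is a sketch under simplifying assumptions: losses are scripted by transmission attempt number so the run is repeatable, a lost packet or lost ACK is modeled as an immediate timeout, and corruption is not modeled. The function and parameter names are hypothetical.

```python
def run_stop_and_wait(messages, drop_data=(), drop_ack=()):
    """Simulate an rdt3.0-style exchange; return (delivered, transmissions).

    drop_data and drop_ack list the 0-based transmission attempts whose
    data packet or ACK is lost.
    """
    delivered = []
    expected_seq = 0          # receiver state
    seq = 0                   # sender state
    attempt = 0
    for data in messages:
        while True:
            this_attempt = attempt
            attempt += 1
            if this_attempt in drop_data:
                continue      # data lost: timer expires, sender retransmits
            # receiver: deliver only if this is the expected (new) packet
            if seq == expected_seq:
                delivered.append(data)
                expected_seq ^= 1
            if this_attempt in drop_ack:
                continue      # ACK lost: timer expires, sender retransmits
            break             # ACK for seq arrives: stop_timer
        seq ^= 1              # next packet uses the other sequence bit
    return delivered, attempt


result = run_stop_and_wait(["a", "b", "c"], drop_data={1}, drop_ack={3})
print(result)
```

In this run the retransmission caused by the lost ACK reaches the receiver as a duplicate, which is acknowledged but not delivered a second time.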

      [Figure: rdt3.0 in action]

      [Figure: rdt3.0 in action (continued)]

      Performance of rdt3.0

      rdt3.0 works, but performance stinks
      example: 1 Gbps link, 15 ms end-to-end prop. delay, 1 KB packet

      $$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \text{ kb/pkt}}{10^9 \text{ b/sec}} = 8 \text{ microsec}$$

      $$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

      U_sender: utilization, the fraction of time the sender is busy sending
      1 KB pkt every 30 msec -> 33 kB/sec throughput over 1 Gbps link
      network protocol limits use of physical resources!

      rdt3.0: stop-and-wait operation

      [Figure: sender/receiver timeline]
          first packet bit transmitted, t = 0
          last packet bit transmitted, t = L / R
          first packet bit arrives
          last packet bit arrives, send ACK
          ACK arrives, send next packet, t = RTT + L / R
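The utilization figure can be checked numerically. The sketch below uses the slide's values (1 KB taken as 8000 bits, RTT = 2 x 15 ms); the function name is an assumption.

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps      # L / R, in seconds
    return t_transmit / (rtt_s + t_transmit)


L = 8000          # packet length in bits
R = 1e9           # link rate in bits per second
RTT = 0.030       # round-trip time in seconds

u = stop_and_wait_utilization(L, R, RTT)
throughput_bytes_per_s = (L / 8) / (RTT + L / R)
print(round(u, 5))                            # 0.00027
print(round(throughput_bytes_per_s / 1000))   # 33 (kB/sec)
```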

      Pipelined protocols

      Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
          range of sequence numbers must be increased
          buffering at sender and/or receiver

      Pipelined protocols

      Advantage: much better bandwidth utilization than stop-and-wait

      Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

      Two generic approaches to solving this:
          Go-Back-N protocols
          Selective Repeat protocols

      Note: TCP is not exactly either

      Pipelining: increased utilization

      [Figure: sender/receiver timeline]
          first packet bit transmitted, t = 0
          last bit transmitted, t = L / R
          first packet bit arrives
          last packet bit arrives, send ACK
          last bit of 2nd packet arrives, send ACK
          last bit of 3rd packet arrives, send ACK
          ACK arrives, send next packet, t = RTT + L / R

      $$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

      Increases utilization by a factor of 3!
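The factor-of-3 improvement can be reproduced by generalizing the utilization formula to several packets in flight per round trip. This is a sketch using the example's link parameters; the function name and the cap at 1.0 (a sender cannot be more than fully busy) are assumptions.

```python
def pipelined_utilization(packet_bits, rate_bps, rtt_s, in_flight):
    """Sender utilization when in_flight packets are sent per RTT + L/R."""
    t_transmit = packet_bits / rate_bps
    return min(1.0, in_flight * t_transmit / (rtt_s + t_transmit))


stop_and_wait = pipelined_utilization(8000, 1e9, 0.030, in_flight=1)
three_in_flight = pipelined_utilization(8000, 1e9, 0.030, in_flight=3)
print(round(three_in_flight, 4))                 # 0.0008
print(round(three_in_flight / stop_and_wait))    # 3
```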

      Go-Back-N

      Sender:
          k-bit seq # in pkt header
          "window" of up to N consecutive unack'ed pkts allowed
          ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
              may receive duplicate ACKs (see receiver)
          only one timer, for the oldest unacknowledged pkt
          timeout(n): retransmit pkt n and all higher seq # pkts in window
          called a sliding-window protocol

      GBN: sender

      rdt_send() called: checks to see if window is full
          No: send out packet
          Yes: return data to application level
      Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received; updates window accordingly and restarts timer
      Timeout: resends ALL packets that have been sent but not yet acknowledged
          this is the only event that triggers a resend

      GBN: sender extended FSM

      initially
          base = 1
          nextseqnum = 1

      state: Wait
          rdt_send(data)
              if (nextseqnum < base+N) {
                  sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
                  udt_send(sndpkt[nextseqnum])
                  if (base == nextseqnum)
                      start_timer
                  nextseqnum++
              }
              else
                  refuse_data(data)
          timeout
              start_timer
              udt_send(sndpkt[base])
              udt_send(sndpkt[base+1])
              ...
              udt_send(sndpkt[nextseqnum-1])
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
              base = getacknum(rcvpkt)+1
              if (base == nextseqnum)
                  stop_timer
              else
                  start_timer
          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
              Λ
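The sender FSM translates fairly directly into code. The class below is a sketch, not a reference implementation: a list named sent stands in for udt_send, the timer is a boolean flag, and guarding base with max() against stale ACKs is an added assumption that the FSM itself does not include.

```python
class GbnSender:
    """Go-Back-N sender following the extended FSM (illustrative sketch)."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}             # seq # -> data, kept until ACKed
        self.timer_running = False
        self.sent = []               # log of seq #s handed to the channel

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum >= self.base + self.N:
            return False             # refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)

    def rdt_rcv_ack(self, acknum):
        """Handle an uncorrupted cumulative ACK for packets up to acknum."""
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)   # assumption: ignore stale ACKs
        self.timer_running = self.base != self.nextseqnum


s = GbnSender(window_size=3)
accepted = [s.rdt_send(x) for x in "abcd"]   # fourth call finds the window full
s.timeout()                                  # go back N: resend 1, 2, 3
s.rdt_rcv_ack(2)                             # cumulative ACK slides base to 3
print(accepted, s.sent, s.base)
```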

      GBN: receiver extended FSM

      initially
          expectedseqnum = 1
          sndpkt = make_pkt(0, ACK, chksum)

      state: Wait
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
              extract(rcvpkt, data)
              deliver_data(data)
              sndpkt = make_pkt(expectedseqnum, ACK, chksum)
              udt_send(sndpkt)
              expectedseqnum++
          default
              udt_send(sndpkt)

      If expected packet received:
          send ACK and deliver packet upstairs
      If out-of-order packet received:
          discard (don't buffer) -> no receiver buffering!
          re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

      More on receiver

      The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
          receiver only sends ACKs (no NAKs)
          can generate duplicate ACKs
          need only remember expectedseqnum
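Because the receiver only has to remember expectedseqnum, a sketch of it is very short. The class and method names are assumptions, and corruption is passed in as a flag rather than computed from a checksum.

```python
class GbnReceiver:
    """Go-Back-N receiver: no buffering, cumulative ACKs (illustrative sketch)."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []          # stands in for deliver_data()

    def rdt_rcv(self, seq, data, is_corrupt=False):
        """Return the ACK number sent back (0 until packet 1 has arrived)."""
        if not is_corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # otherwise the packet is discarded and the last ACK is repeated
        return self.expectedseqnum - 1


r = GbnReceiver()
acks = [r.rdt_rcv(1, "a"), r.rdt_rcv(3, "c"), r.rdt_rcv(4, "d"), r.rdt_rcv(2, "b")]
print(acks, r.delivered)
```

Packets 3 and 4 arrive out of order, so they are dropped and produce duplicate ACKs for packet 1; the sender must later resend them.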

      GBN in action

      GBN is easy to code but might have performance problems.

      In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data!

      The Selective Repeat protocol allows the receiver to buffer data and forces retransmission only of the required packets.

      Selective Repeat

      receiver individually acknowledges all correctly received pkts
          buffers pkts, as needed, for eventual in-order delivery to upper layer
      sender only resends pkts for which ACK not received
          sender timer for each unACKed pkt
          compare to GBN, which only had a timer for the base packet
      sender window
          N consecutive seq #'s
          again limits seq #s of sent, unACKed pkts
          Important: window size < seq # range

      [Figure: Selective repeat: sender, receiver windows]

      Selective repeat

      sender
          data from above:
              if next available seq # in window, send pkt
          timeout(n):
              resend pkt n, restart timer
          ACK(n) in [sendbase, sendbase+N]:
              mark pkt n as received
              if n is the smallest unACKed pkt, advance window base to next unACKed seq #

      receiver
          pkt n in [rcvbase, rcvbase+N-1]:
              send ACK(n)
              out-of-order: buffer
              in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
          pkt n in [rcvbase-N, rcvbase-1]:
              ACK(n) (note: this is a re-ACK)
          otherwise:
              ignore
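The receiver rules can be sketched as follows. To keep the example short it assumes unbounded sequence numbers (no wrap-around) and no corruption; the class and attribute names are hypothetical.

```python
class SrReceiver:
    """Selective-repeat receiver window logic (illustrative sketch)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}             # out-of-order packets awaiting delivery
        self.delivered = []          # stands in for delivery to the upper layer

    def rdt_rcv(self, n, data):
        """Return the ACK number sent, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer.setdefault(n, data)
            # deliver any in-order run and advance the window
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n                 # already delivered: re-ACK
        return None                  # outside both ranges: ignore


r = SrReceiver(window_size=4)
acks = [r.rdt_rcv(1, "b"), r.rdt_rcv(2, "c"), r.rdt_rcv(0, "a"), r.rdt_rcv(0, "a"), r.rdt_rcv(9, "z")]
print(acks, r.delivered, r.rcvbase)
```

Packets 1 and 2 are buffered until packet 0 fills the gap; the repeated packet 0 then falls in the re-ACK range.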

      [Figure: Selective repeat in action]

      Selective repeat: dilemma

      Example:
          seq #'s: 0, 1, 2, 3
          window size = 3
      receiver sees no difference in two scenarios!
      incorrectly passes duplicate data as new in (a)

      Q: what is the relationship between seq # size and window size?


      TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

      point-to-point:
          one sender, one receiver
      reliable, in-order byte stream:
          no "message boundaries"
      pipelined:
          TCP congestion and flow control set window size
      send & receive buffers
      full duplex data:
          bi-directional data flow in same connection
          MSS: maximum segment size
      connection-oriented:
          handshaking (exchange of control msgs) init's sender, receiver state before data exchange
      flow controlled:
          sender will not overwhelm receiver

      [Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

      More TCP details

      Maximum Segment Size (MSS)
          depends upon implementation (can often be set)
          the max amount of application-layer data in a segment
          Application Data + TCP Header = TCP Segment

      Three-way handshake
          client sends special TCP segment to server requesting connection; no payload (application data) in this segment
          server responds with second special TCP segment (again, no payload)
          client responds with third special segment; this can contain payload

      Even more TCP details

      A TCP connection between client and server creates, in both client and server:
          (i) buffers,
          (ii) variables, and
          (iii) a socket connection to the process

      TCP only exists in the two end machines
          no buffers and variables are allocated to the connection in any of the network elements between the host and server

      TCP segment structure

      segment layout (each row is 32 bits wide):
          source port # | dest port #
          sequence number
          acknowledgement number
          head len | not used | U A P R S F | Receive window
          checksum | Urg data pointer
          Options (variable length)
          application data (variable length)

      field notes:
          sequence number, acknowledgement number: counting by bytes of data (not segments!)
          URG: urgent data (generally not used)
          ACK: ACK # valid
          PSH: push data now (generally not used)
          RST, SYN, FIN: connection estab (setup, teardown commands)
          Receive window: # bytes rcvr willing to accept
          checksum: Internet checksum (as in UDP)

      TCP seq. #'s and ACKs

      Seq. #'s:
          byte stream "number" of first byte in segment's data
      ACKs:
          seq # of next byte expected from other side
          cumulative ACK
      Q: how does the receiver handle out-of-order segments?
          A: TCP spec doesn't say; it is up to the implementer

      simple telnet scenario (Host A and Host B, time running downward)
          User types 'C':                              A -> B: Seq=42, ACK=79, data = 'C'
          host ACKs receipt of 'C', echoes back 'C':   B -> A: Seq=79, ACK=43, data = 'C'
          host ACKs receipt of echoed 'C':             A -> B: Seq=43, ACK=80

      TCP Round Trip Time and Timeout

      Q: how to set the TCP timeout value?
          longer than RTT
              but RTT varies
          too short: premature timeout
              unnecessary retransmissions
          too long: slow reaction to segment loss

      Q: how to estimate RTT?
          SampleRTT: measured time from segment transmission until ACK receipt
              ignore retransmissions
          SampleRTT will vary; want estimated RTT "smoother"
              average several recent measurements, not just current SampleRTT

      TCP Round Trip Time and Timeout

      $$\text{EstimatedRTT} = (1 - \alpha) \cdot \text{EstimatedRTT} + \alpha \cdot \text{SampleRTT}$$

      exponential weighted moving average
      influence of past sample decreases exponentially fast
      typical value: $\alpha = 0.125$

      Example RTT estimation

      [Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds); curves: SampleRTT and EstimatedRTT]

      TCP Round Trip Time and Timeout

      Setting the timeout
          EstimatedRTT plus "safety margin"
              large variation in EstimatedRTT -> larger safety margin
          first, estimate how much SampleRTT deviates from EstimatedRTT:

      $$\text{DevRTT} = (1 - \beta) \cdot \text{DevRTT} + \beta \cdot |\text{SampleRTT} - \text{EstimatedRTT}|$$

          (typically, $\beta = 0.25$)

      Then set the timeout interval:

      $$\text{TimeoutInterval} = \text{EstimatedRTT} + 4 \cdot \text{DevRTT}$$
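The three formulas can be combined into a small estimator. The sketch below makes assumptions the slides leave open: EstimatedRTT is seeded with the first sample, DevRTT starts at zero, and DevRTT is updated after EstimatedRTT using the new estimate. Real TCP stacks may initialize and order these steps differently.

```python
class RttEstimator:
    """EWMA-based RTT and timeout estimation (illustrative sketch)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with first sample
        self.dev_rtt = 0.0                  # assumption: no deviation known yet

    def update(self, sample_rtt):
        """Fold one SampleRTT into the estimates; return TimeoutInterval."""
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * abs(sample_rtt - self.estimated_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)      # times in milliseconds
timeout = est.update(140.0)
print(est.estimated_rtt, est.dev_rtt, timeout)   # 105.0 8.75 140.0
```

A single slow sample moves EstimatedRTT only from 100 to 105 ms, but the deviation term widens the timeout to 140 ms.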


      TCP reliable data transfer

      TCP creates rdt service on top of IP's unreliable service
          pipelined segments
          cumulative acks
          TCP uses a single retransmission timer
      Retransmissions are triggered by:
          timeout events
          duplicate acks
      Initially consider a simplified TCP sender:
          ignore duplicate acks
          ignore flow control, congestion control

      TCP sender events

      data rcvd from app:
          create segment with seq #
          seq # is byte-stream number of first data byte in segment
          start timer if not already running (think of timer as for oldest unacked segment)
          expiration interval: TimeOutInterval
      timeout:
          retransmit segment that caused timeout
          restart timer
      ACK rcvd:
          if it acknowledges previously unacked segments:
              update what is known to be acked
              start timer if there are outstanding segments

      TCP sender (simplified)

          NextSeqNum = InitialSeqNum
          SendBase = InitialSeqNum

          loop (forever) {
              switch(event)

              event: data received from application above
                  create TCP segment with sequence number NextSeqNum
                  if (timer currently not running)
                      start timer
                  pass segment to IP
                  NextSeqNum = NextSeqNum + length(data)

              event: timer timeout
                  retransmit not-yet-acknowledged segment with smallest sequence number
                  start timer

              event: ACK received, with ACK field value of y
                  if (y > SendBase) {
                      SendBase = y
                      if (there are currently not-yet-acknowledged segments)
                          start timer
                  }
          } /* end of loop forever */

      Comment:
          SendBase-1: last cumulatively ack'ed byte
      Example:
          SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
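The three sender events can be exercised with a small model. This is a sketch only: a list named wire stands in for passing segments to IP, the timer is a boolean flag, and the class and method names are assumptions.

```python
class SimpleTcpSender:
    """Simplified TCP sender: no duplicate-ACK handling, no flow or congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # of first byte -> segment data
        self.timer_running = False
        self.wire = []               # (seq #, data) pairs passed to IP

    def data_from_app(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((seq, data))
        self.next_seq_num += len(data)

    def timeout(self):
        """Retransmit only the not-yet-acknowledged segment with the smallest seq #."""
        if self.unacked:
            seq = min(self.unacked)
            self.wire.append((seq, self.unacked[seq]))
            self.timer_running = True

    def ack_received(self, y):
        if y > self.send_base:
            self.send_base = y
            # drop every segment whose bytes all lie below y (cumulative ACK)
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


tcp = SimpleTcpSender(initial_seq_num=92)
tcp.data_from_app(b"x" * 8)      # bytes 92..99
tcp.data_from_app(b"y" * 20)     # bytes 100..119
tcp.timeout()                    # resends only the segment starting at 92
tcp.ack_received(120)            # cumulative ACK covers both segments
print([seq for seq, _ in tcp.wire], tcp.send_base, tcp.timer_running)
```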

      TCP: retransmission scenarios

      lost ACK scenario (Host A sends to Host B)
          A -> B: Seq=92, 8 bytes data
          B -> A: ACK=100, lost (X)
          Seq=92 timeout at A
          A -> B: Seq=92, 8 bytes data (retransmission)
          B -> A: ACK=100
          SendBase = 100

      premature timeout scenario (Host A sends to Host B)
          A -> B: Seq=92, 8 bytes data
          A -> B: Seq=100, 20 bytes data
          B -> A: ACK=100
          B -> A: ACK=120
          Seq=92 timeout at A, before the ACKs arrive
          A -> B: Seq=92, 8 bytes data (retransmission)
          ACK=100 arrives: SendBase = 100
          ACK=120 arrives: SendBase = 120
          B -> A: ACK=120 (in response to the retransmission)
          SendBase = 120

      TCP retransmission scenarios (more)

      cumulative ACK scenario (Host A sends to Host B)
          A -> B: Seq=92, 8 bytes data
          A -> B: Seq=100, 20 bytes data
          B -> A: ACK=100, lost (X)
          B -> A: ACK=120, arrives before the timeout
          SendBase = 120

      TCP ACK generation [RFC 1122, RFC 2581]

      Event at receiver -> TCP receiver action

      Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
          -> Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK

      Arrival of in-order segment with expected seq #; one other segment has ACK pending
          -> Immediately send single cumulative ACK, ACKing both in-order segments

      Arrival of out-of-order segment with higher-than-expected seq #; gap detected
          -> Immediately send duplicate ACK, indicating seq # of next expected byte

      Arrival of segment that partially or completely fills gap
          -> Immediately send ACK, provided that segment starts at lower end of gap

      More on sender policies

      Doubling the timeout interval
          used by most TCP implementations
          if a timeout occurs, then after retransmission the timeout interval is doubled
          intervals grow exponentially with each consecutive timeout
          when the timer is restarted because of (i) new data from above or (ii) an ACK received, the timeout interval is reset as described previously, using EstimatedRTT and DevRTT
          a limited form of congestion control

      Fast retransmit

      Time-out period often relatively long:
          long delay before resending lost packet
      Detect lost segments via duplicate ACKs
          sender often sends many segments back-to-back
          if a segment is lost, there will likely be many duplicate ACKs
      If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
          fast retransmit: resend segment before timer expires

      Fast retransmit algorithm

          event: ACK received, with ACK field value of y
              if (y > SendBase) {
                  SendBase = y
                  if (there are currently not-yet-acknowledged segments)
                      start timer
              }
              else {    /* a duplicate ACK for already ACKed segment */
                  increment count of dup ACKs received for y
                  if (count of dup ACKs received for y == 3) {
                      resend segment with sequence number y    /* fast retransmit */
                  }
              }
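The duplicate-ACK counting can be sketched as below. Only the ACK event is modeled: timers and the actual segment store are omitted, a list named resent records which sequence numbers would be retransmitted, and the class name is an assumption.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK (sketch)."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()    # ACK value y -> number of duplicates seen
        self.resent = []             # seq #s resent via fast retransmit

    def ack_received(self, y):
        if y > self.send_base:
            self.send_base = y       # new data acknowledged
        else:
            self.dup_acks[y] += 1    # duplicate ACK for already ACKed data
            if self.dup_acks[y] == 3:
                self.resent.append(y)    # resend segment starting at y


snd = FastRetransmitSender(send_base=100)
for _ in range(4):
    snd.ack_received(100)        # receiver keeps asking for byte 100
snd.ack_received(120)            # gap filled, cumulative ACK advances
print(snd.resent, snd.send_base)
```

The fourth duplicate does not trigger another resend because the count test is an equality, mirroring the pseudocode.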

      TCP: GBN or Selective Repeat?

      Basic TCP looks a lot like GBN.

      Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range.
          This looks a lot like Selective Repeat.

      TCP is a hybrid.


      TCP Flow Control

      Sender should not overwhelm receiver's capacity to receive data
      If necessary, sender should slow down its transmission rate to accommodate receiver's rate
      Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

      TCP Flow Control

      flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
      receive side of TCP connection has a receive buffer
          app process may be slow at reading from buffer
      speed-matching service: matching the send rate to the receiving app's drain rate

      [Figure: TCP segment structure (repeated); the Receive window field carries the # bytes rcvr willing to accept]

      TCP Flow control: how it works

      (Suppose TCP receiver discards out-of-order segments)

      spare room in buffer
          = RcvWindow
          = RcvBuffer - [LastByteRcvd - LastByteRead]

      Rcvr advertises spare room by including value of RcvWindow in segments
      Sender limits unACKed data to RcvWindow
          guarantees receive buffer doesn't overflow
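The window bookkeeping amounts to two small calculations. In the sketch below the receiver-side formula is taken from the slide; the sender-side check uses last_byte_sent and last_byte_acked, which are names assumed here to express "unACKed data" and do not appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                        # 2096
print(sender_may_send(5000, 4000, window, 1000))     # True
print(sender_may_send(5000, 4000, window, 1200))     # False
```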

      Technical issue

      Suppose RcvWindow = 0 and the receiver has already ACK'd ALL packets in its buffer
          sender does not transmit new packets until it hears RcvWindow > 0
          receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
          DEADLOCK!

      Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.

      Note on UDP

      UDP has no flow control.

      UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost!


      TCP Connection Management

      Recall: TCP sender, receiver establish "connection" before exchanging data segments
          initialize TCP variables:
              seq #s
              buffers, flow control info (e.g., RcvWindow)
          client: connection initiator
              Socket clientSocket = new Socket("hostname", "port number");
          server: contacted by client
              Socket connectionSocket = welcomeSocket.accept();

      Three-way handshake:

      Step 1: client end system sends TCP SYN control segment to server
          specifies client_isn, the initial seq #
          no application data

      Step 2: server end system receives SYN, replies with SYNACK control segment
          ACKs received SYN
          allocates buffers
          replies with client_isn+1 in ACK field to signal synchronization
          specifies server_isn
          no application data

      TCP Connection Management (cont.)

      Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
          allocates buffers
          can include application data
          SYN=0 signals that the connection is established
          server_isn+1 signals that it is synchronized

      [Figure: three-way handshake, client and server timelines]
          client -> server: Connection request (SYN=1, seq=client_isn)
          server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
          client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
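The sequence and acknowledgement fields of the three segments follow mechanically from the two initial sequence numbers. The sketch below builds them as plain dictionaries; the field names and the function are illustrative assumptions, not a socket API.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK, and ACK segments as dictionaries of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
for segment in segments:
    print(segment)
```

With these example values the server acknowledges 1001 and the client acknowledges 5001, each being the other side's initial sequence number plus one.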

      TCP Connection Management (cont.)

      Closing a connection:
          client closes socket: clientSocket.close();

      Step 1: client end system sends TCP FIN control segment to server

      Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

      [Figure: client and server timelines: client close, FIN to server, ACK to client, server close, FIN to client, ACK to server, client timed wait, closed]

      TCP Connection Management (cont.)

      Step 3: client receives FIN, replies with ACK
          enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
          closes down after timed wait

      Step 4: server receives ACK. Connection closed.

      Note: with small modification, can handle simultaneous FINs.

      [Figure: client and server timelines: client closing, FIN to server, ACK to client, server closing, FIN to client, ACK to server, client timed wait, both sides closed]

      TCP Connection Management (cont)

      [Figures: example TCP server lifecycle; example TCP client lifecycle]
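The client side of the closing sequence in Steps 1 to 4 can be sketched as a small state machine. The state names are the ones used in RFC 793; the event names and the class itself are made up for this illustration, and the simultaneous-close path is left out.

```java
/** Sketch of the client-side states passed through during an active close. */
final class ClientCloseSketch {

    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }

    enum Event { APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRES }

    /** Returns the next state; events that do not apply in a state leave it unchanged. */
    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED:
                // Step 1: application closes the socket, client sends FIN.
                return e == Event.APP_CLOSE ? State.FIN_WAIT_1 : s;
            case FIN_WAIT_1:
                // Step 2: the server's ACK of our FIN arrives.
                return e == Event.RCV_ACK ? State.FIN_WAIT_2 : s;
            case FIN_WAIT_2:
                // Step 3: the server's FIN arrives; client replies with ACK.
                return e == Event.RCV_FIN ? State.TIME_WAIT : s;
            case TIME_WAIT:
                // A retransmitted FIN is ACKed again but does not change state.
                return e == Event.TIMED_WAIT_EXPIRES ? State.CLOSED : s;
            default:
                return s;
        }
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] events = { Event.APP_CLOSE, Event.RCV_ACK, Event.RCV_FIN, Event.TIMED_WAIT_EXPIRES };
        for (Event e : events) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```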

      A few special cases

      We have not discussed what happens if both client and server decide to close down the connection at the same time.

      It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


      Principles of Congestion Control

      Congestion:
        informally: "too many sources sending too much data too fast for network to handle"
        different from flow control
        manifestations:
          lost packets (buffer overflow at routers)
          long delays (queuing in router buffers)
        a top-10 problem

      Causes/costs of congestion: scenario 1
        two senders, two receivers
        one router, infinite buffers
        no retransmission
        send rate: 0 to C/2

        large delays when congested
        maximum achievable throughput

      Causes/costs of congestion: scenario 2
        one router, finite buffers
        sender retransmission of lost packet

      (a), (b) and (c): always λin = λout (goodput)
      (a) "magic" transmission: only send when there is space in the buffer
      (b) "perfect" retransmission, only when loss: λ'in > λout
      (c) retransmission of delayed (not lost) packets makes λ'in larger (than the perfect case) for the same λout

      Here λ'in is the offered load: original data plus retransmissions.

      "costs" of congestion:
        (b) and (c): more work (retransmissions) for a given "goodput"
        (c): unneeded retransmissions; link carries multiple copies of a packet

      [Figure: panels (a), (b), (c)]

      Causes/costs of congestion: scenario 3
        four senders
        multihop paths
        timeout/retransmit

      Q: what happens as λin and λ'in increase?

      Causes/costs of congestion: scenario 3

      Another "cost" of congestion:
        when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

      Approaches towards congestion control

      Two broad approaches towards congestion control:

      End-end congestion control:
        no explicit feedback from network
        congestion inferred from end-system observed loss, delay
        approach taken by TCP

      Network-assisted congestion control:
        routers provide feedback to end systems
          single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
          explicit rate sender should send at

      Case study: ATM ABR congestion control

      ABR: available bit rate
        "elastic service"
        if sender's path is "underloaded": sender should use available bandwidth
        if sender's path is congested: sender throttled to minimum guaranteed rate

      RM (resource management) cells:
        sent by sender, interspersed with data cells
        bits in RM cell set by switches ("network-assisted")
          NI bit: no increase in rate (mild congestion)
          CI bit: severe congestion indicator
        RM cells returned to sender by receiver, with bits intact
          small exception: see next page

      Case study: ATM ABR congestion control

      two-byte ER (explicit rate) field in RM cell
        congested switch may lower ER value in cell
        sender's send rate is thus the minimum supportable rate on the path

      EFCI bit in data cells: set to 1 by a congested switch
        signals congestion
        if the data cell preceding an RM cell has EFCI=1, the destination sets CI bit=1 before returning the RM cell to the source


      TCP Congestion Control
        end-end control (no network assistance)
        transmission rate limited by congestion window size, Congwin, over segments
        Congwin dynamically modified to reflect perceived congestion

      [Figure: a window of Congwin segments]

      w segments, each with MSS bytes, sent in one RTT:

        throughput = (w * MSS) / RTT  bytes/sec

      To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

      Tools are "similar" to flow control: sender limits transmission using

        LastByteSent - LastByteAcked <= CongWin

      How does sender perceive congestion?
        loss event = timeout or 3 duplicate ACKs
        TCP sender reduces rate (CongWin) after loss event

      three mechanisms:
        AIMD = Additive Increase, Multiplicative Decrease
        slow start = CongWin set to 1 and then grows exponentially
        conservative after timeout events

      TCP AIMD

      multiplicative decrease: cut CongWin in half after loss event

      additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

      [Figure: long-lived TCP connection; congestion window (ticks at 8, 16, 24 Kbytes) versus time]

      TCP Slow Start

      When connection begins, CongWin = 1 MSS
        Example: MSS = 500 bytes and RTT = 200 msec
        initial rate = MSS/RTT = 4000 bits / 0.2 sec = 20 kbps

      available bandwidth may be >> MSS/RTT
        desirable to quickly ramp up to a respectable rate

      When connection begins, increase rate exponentially fast until first loss event

      TCP Slow Start (more)

      When connection begins, increase rate exponentially until first loss event:
        double CongWin every RTT
        done by incrementing CongWin by 1 MSS for every ACK received

      Summary: initial rate is slow but ramps up exponentially fast

      [Figure: Host A sends one segment to Host B; one RTT later it sends two segments, then four segments]

      So far:
        Slow start ramps up exponentially
        Followed by AIMD (sawtooth pattern)

      Reality (TCP Reno):
        Introduce new variable, threshold
        threshold initially very large
        Slow-start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
        Two different types of loss events:
          • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
          • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to slow start

      Reason for treating 3 dup ACKs differently than a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

      Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

      TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

      Summary: TCP Congestion Control

      When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

      When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

      When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

      When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


      The Big Picture

      TCP sender congestion control

      Event: ACK receipt for previously unacked data
      State: Slow Start (SS)
      TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
      Commentary: results in a doubling of CongWin every RTT

      Event: ACK receipt for previously unacked data
      State: Congestion Avoidance (CA)
      TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
      Commentary: additive increase, resulting in an increase of CongWin by 1 MSS every RTT

      Event: loss event detected by triple duplicate ACK
      State: SS or CA
      TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
      Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

      Event: timeout
      State: SS or CA
      TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
      Commentary: enter slow start

      Event: duplicate ACK
      State: SS or CA
      TCP sender action: increment duplicate ACK count for segment being acked
      Commentary: CongWin and Threshold not changed
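The rows of this table map almost directly onto a small class. The following is a minimal sketch under stated assumptions: `RenoSenderSketch` and its method names are invented here, windows are tracked in bytes as doubles, and counting duplicate ACKs, retransmission, and the receiver window are all left out. Real TCP stacks differ in many details.

```java
/** Sketch of the sender-side table; CongWin and Threshold are tracked in bytes. */
final class RenoSenderSketch {

    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    final double mss;       // maximum segment size in bytes
    double congWin;         // congestion window in bytes
    double threshold;       // slow-start threshold in bytes
    State state = State.SLOW_START;

    RenoSenderSketch(double mss, double initialThreshold) {
        this.mss = mss;
        this.congWin = mss;               // connection begins with CongWin = 1 MSS
        this.threshold = initialThreshold;
    }

    /** ACK receipt for previously unacked data. */
    void onNewAck() {
        if (state == State.SLOW_START) {
            congWin += mss;                     // +1 MSS per ACK doubles CongWin per RTT
            if (congWin > threshold) {
                state = State.CONGESTION_AVOIDANCE;
            }
        } else {
            congWin += mss * (mss / congWin);   // roughly +1 MSS per RTT
        }
    }

    /** Loss event detected by triple duplicate ACK (fast recovery). */
    void onTripleDuplicateAck() {
        threshold = congWin / 2;
        congWin = Math.max(threshold, mss);     // CongWin will not drop below 1 MSS
        state = State.CONGESTION_AVOIDANCE;
    }

    /** Loss event detected by timeout. */
    void onTimeout() {
        threshold = congWin / 2;
        congWin = mss;
        state = State.SLOW_START;
    }

    public static void main(String[] args) {
        RenoSenderSketch tcp = new RenoSenderSketch(1000, 8000); // hypothetical MSS and threshold
        for (int i = 0; i < 8; i++) {
            tcp.onNewAck();
        }
        System.out.println(tcp.state + " CongWin=" + tcp.congWin);
        tcp.onTripleDuplicateAck();
        System.out.println(tcp.state + " CongWin=" + tcp.congWin);
        tcp.onTimeout();
        System.out.println(tcp.state + " CongWin=" + tcp.congWin);
    }
}
```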

      TCP throughput

      What is the average throughput of TCP as a function of window size and RTT?
        Ignore slow start
        Let W be the window size when loss occurs
        When window is W, throughput is W/RTT
        Just after loss, window drops to W/2, throughput to W/(2 RTT)
        Average throughput: 0.75 W/RTT
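One way to see where the 0.75 factor comes from, assuming the idealized sawtooth in which the window climbs linearly from W/2 back to W between loss events:

```latex
\[
\text{average window} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W
\qquad\Longrightarrow\qquad
\text{average throughput} = \frac{3}{4}\cdot\frac{W}{RTT} = 0.75\,\frac{W}{RTT}
\]
```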

      TCP Futures

      Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
      Requires window size W = 83,333 in-flight segments
      Throughput in terms of loss rate L:

      $$\text{throughput} = \frac{1.22 \cdot MSS}{RTT\,\sqrt{L}}$$

      Requires L = 2·10^-10. Wow.
      New versions of TCP for high-speed needed
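The two numbers in this example can be checked by hand. The sketch below assumes the throughput formula is the usual 1.22·MSS/(RTT·√L) and that one segment is 1500 × 8 = 12,000 bits:

```latex
\[
W = \frac{\text{throughput}\cdot RTT}{MSS}
  = \frac{10^{10}\ \text{bits/s}\times 0.1\ \text{s}}{12000\ \text{bits}}
  \approx 83{,}333\ \text{segments}
\]
\[
\text{throughput} = \frac{1.22\,MSS}{RTT\sqrt{L}}
\;\Longrightarrow\;
L = \left(\frac{1.22\,MSS}{RTT\cdot\text{throughput}}\right)^{2}
  = \left(\frac{1.22\times 12000}{0.1\times 10^{10}}\right)^{2}
  \approx 2\times 10^{-10}
\]
```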

      TCP Fairness

      Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

      [Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]

      Why is TCP fair?

      Two competing sessions:
        additive increase gives slope of 1 as throughput increases
        multiplicative decrease decreases throughput proportionally

      [Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
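The convergence argument can be tried numerically. This is an idealized sketch, not a model of real TCP: it assumes both sessions have the same RTT, both see every loss at the same moment, and rates are plain numbers rather than windows. The class and parameter names are invented for the illustration.

```java
/** Two AIMD flows sharing one bottleneck; a sketch of the convergence argument. */
final class AimdFairnessSketch {

    /**
     * Runs the idealized model for the given number of rounds and returns the
     * final rates {x1, x2}. Both flows add `step` per round; when their sum
     * exceeds `capacity`, both see a loss and halve their rate.
     */
    static double[] simulate(double x1, double x2, double capacity, double step, int rounds) {
        for (int i = 0; i < rounds; i++) {
            if (x1 + x2 > capacity) {
                x1 /= 2;   // multiplicative decrease: the gap x1 - x2 is halved too
                x2 /= 2;
            } else {
                x1 += step; // additive increase: the gap x1 - x2 is unchanged
                x2 += step;
            }
        }
        return new double[] { x1, x2 };
    }

    public static void main(String[] args) {
        double[] rates = simulate(80, 10, 100, 1, 1000); // hypothetical starting rates
        System.out.println("x1=" + rates[0] + " x2=" + rates[1]);
    }
}
```

Additive increase keeps the difference between the two rates fixed, while each loss halves it, so the operating point drifts toward the equal-share line.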

      Fairness (more)

      Fairness and UDP
        Multimedia apps often do not use TCP
          do not want rate throttled by congestion control
        Instead use UDP:
          pump audio/video at constant rate, tolerate packet loss
        Current research area: how to keep UDP from congesting the Internet

      Fairness and parallel TCP connections
        nothing prevents an app from opening parallel connections between 2 hosts
        Web browsers do this
        Example: link of rate R supporting 9 connections
          new app asks for 1 TCP connection, gets rate R/10
          new app asks for 11 TCP connections, holds 11 of the 20 connections, so gets about R/2

      TCP Latency Modeling

      Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

      Ignoring congestion, delay is influenced by:
        TCP connection establishment
        data transmission delay
        slow start

      Notation, assumptions:
        Assume one link between client and server of rate R
        S: MSS (bits)
        O: object size (bits)
        no retransmissions (no loss, no corruption)

      Window size:
        First assume: fixed congestion window, W segments
        Then dynamic window, modeling slow start

      Fixed Congestion Window (W)

      Two cases:

      1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
           Latency = 2RTT + O/R

      2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
           Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
           where K is the number of windows that cover the object (O/(WS), rounded up)

      Fixed congestion window (1)

      First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

        latency = 2RTT + O/R

      Fixed congestion window (2)

      Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

        latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
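The two fixed-window cases can be folded into one function. This is a sketch under stated assumptions: the class and parameter names are invented, and K is taken to be O/(WS) rounded up, which the slides do not spell out.

```java
/** Sketch of the fixed-congestion-window latency model (sizes in bits, times in seconds). */
final class FixedWindowLatencySketch {

    /**
     * @param o   object size in bits
     * @param s   segment size (MSS) in bits
     * @param r   link rate in bits per second
     * @param rtt round-trip time in seconds
     * @param w   fixed window size in segments
     */
    static double latency(double o, double s, double r, double rtt, int w) {
        double base = 2 * rtt + o / r;           // connection setup + request, then transmission
        double stall = s / r + rtt - w * s / r;  // idle time after each window, if positive
        if (stall <= 0) {
            return base;                         // case 1: ACKs return before the window is used up
        }
        long k = (long) Math.ceil(o / (w * s));  // windows covering the object (assumed rounding up)
        return base + (k - 1) * stall;           // case 2: server idles after all but the last window
    }

    public static void main(String[] args) {
        // Hypothetical numbers: S/R = 1 s, RTT = 2 s, object of 8 segments.
        System.out.println(latency(8000, 1000, 1000, 2.0, 2)); // window too small: stalls
        System.out.println(latency(8000, 1000, 1000, 2.0, 4)); // window large enough: no stalls
    }
}
```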

      TCP Latency Modeling: Slow Start (1)

      Now suppose window grows according to slow start (with no threshold and no loss events).

      Will show that the delay for one object is:

      $$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\,\frac{S}{R}$$

      where P is the number of times TCP idles at server:

      $$P = \min\{Q,\; K-1\}$$

        - where Q is the number of times the server idles if the object were of infinite size
        - and K is the number of windows that cover the object

      TCP Latency Modeling: Slow Start (2)

      Delay components:
        • 2 RTT for connection establishment and request
        • O/R to transmit object
        • time server idles due to slow start

      Server idles P = min{K-1, Q} times

      Example:
        • O/S = 15 segments
        • K = 4 windows
        • Q = 2
        • P = min{K-1, Q} = 2
      Server idles P = 2 times

      [Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

      TCP Latency Modeling (3)

      $$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

      $$2^{k-1}\,\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

      $$\left[\frac{S}{R} + RTT - 2^{k-1}\,\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

      $$\begin{aligned}
      \text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
                   &= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\,\frac{S}{R}\right] \\
                   &= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\,\frac{S}{R}
      \end{aligned}$$

      [Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

      TCP Latency Modeling (4)

      Recall K = number of windows that cover object

      How do we calculate K?

      $$\begin{aligned}
      K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
        &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
        &= \min\{k : 2^{k} - 1 \ge O/S\} \\
        &= \min\{k : k \ge \log_2(O/S + 1)\} \\
        &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
      \end{aligned}$$

      Calculation of Q, the number of idles for an infinite-size object, is similar.
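The slow-start latency can also be computed by summing the idle times directly, which avoids needing Q at all. The sketch below is an illustration with invented names (`SlowStartLatencySketch`, `windowsCovering`, `latency`); it finds K by counting windows of 1, 2, 4, ... segments and adds the positive part of each stall, following the derivation above.

```java
/** Sketch of the slow-start latency model (sizes in bits, times in seconds). */
final class SlowStartLatencySketch {

    /** Smallest k such that windows of 1, 2, 4, ... segments (k of them) cover the object. */
    static int windowsCovering(long objectBits, long segmentBits) {
        int k = 0;
        long covered = 0;          // bits covered by the first k windows
        long window = segmentBits; // size of the next window in bits
        while (covered < objectBits) {
            covered += window;
            window *= 2;
            k++;
        }
        return k;
    }

    /** 2 RTT + O/R + sum over the first K-1 windows of the (non-negative) idle time. */
    static double latency(long objectBits, long segmentBits, double rateBps, double rttSec) {
        double sOverR = segmentBits / rateBps;
        int k = windowsCovering(objectBits, segmentBits);
        double idle = 0;
        for (int i = 1; i <= k - 1; i++) {
            double stall = sOverR + rttSec - Math.pow(2, i - 1) * sOverR;
            idle += Math.max(0, stall); // the server only idles when the stall is positive
        }
        return 2 * rttSec + objectBits / rateBps + idle;
    }

    public static void main(String[] args) {
        // Hypothetical numbers chosen to match the slide example: O/S = 15, S/R = 1 s, RTT = 2 s.
        System.out.println(windowsCovering(15000, 1000));
        System.out.println(latency(15000, 1000, 1000.0, 2.0));
    }
}
```

With these numbers the server stalls after the first two windows only, so P = 2, and the direct sum agrees with the closed form 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R.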

      HTTP Modeling

      Assume Web page consists of:
        1 base HTML page (of size O bits)
        M images (each of size O bits)

      Non-persistent HTTP:
        M+1 TCP connections in series
        Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

      Persistent HTTP:
        2 RTT to request and receive base HTML file
        1 RTT to request and receive M images
        Response time = (M+1)O/R + 3RTT + sum of idle times

      Non-persistent HTTP with X parallel connections:
        Suppose M/X is an integer
        1 TCP connection for base file
        M/X sets of parallel connections for images
        Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
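As a rough worked example of the three formulas, take RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5 and R = 1 Mbps. Two assumptions are made here for simplicity: 1 Kbyte is taken as 1000 bytes (so O = 40,000 bits), and the slow-start idle times are ignored, so the results are lower bounds rather than the modeled values.

```latex
\[
(M+1)\frac{O}{R} = 11 \times \frac{40000}{10^{6}}\ \text{s} = 0.44\ \text{s}
\]
\[
\begin{aligned}
\text{non-persistent:} \quad & 0.44 + (M+1)\,2\,RTT = 0.44 + 11 \times 0.2 = 2.64\ \text{s} \\
\text{persistent:} \quad & 0.44 + 3\,RTT = 0.44 + 0.3 = 0.74\ \text{s} \\
\text{parallel non-persistent:} \quad & 0.44 + \left(\tfrac{M}{X} + 1\right) 2\,RTT = 0.44 + 3 \times 0.2 = 1.04\ \text{s}
\end{aligned}
\]
```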

      HTTP Response time (in seconds)

      RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

      [Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

      For low bandwidth, connection and response time are dominated by transmission time.
      Persistent connections only give minor improvement over parallel connections.

      HTTP Response time (in seconds)

      RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

      [Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

      For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.

      Chapter 3: Summary

      principles behind transport layer services:
        multiplexing, demultiplexing
        reliable data transfer
        flow control
        congestion control
      instantiation and implementation in the Internet:
        UDP
        TCP

      Next:
        leaving the network "edge" (application, transport layers)
        into the network "core"


        3 Transport Layer 4Comp 361 Spring 2005

        Transport vs network layerHousehold analogy12 kids sending letters

        to 12 kidsprocesses = kidsapp messages = letters in envelopeshosts = housestransport protocol = Ann and Billnetwork-layer protocol = postal service

        network layer logical communication between hoststransport layer logical communication between processes

        relies on enhances network layer services

        3 Transport Layer 5Comp 361 Spring 2005

        Transport-layer protocols

        Internet transport servicesreliable in-order unicastdelivery (TCP)

        congestion flow controlconnection setup

        unreliable (ldquobest-effortrdquo) unordered unicast or multicast delivery UDPservices not available

        real-timebandwidth guaranteesreliable multicast

        applicationtransportnetworkdata linkphysical

        applicationtransportnetworkdata linkphysical

        networkdata linkphysical

        networkdata linkphysical

        networkdata linkphysical

        networkdata linkphysicalnetwork

        data linkphysical

        logical end-end transport

        3 Transport Layer 6Comp 361 Spring 2005

        Chapter 3 outline

        31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

        35 Connection-oriented transport TCP

        segment structurereliable data transferflow controlconnection management

        36 Principles of congestion control37 TCP congestion control

        3 Transport Layer 7Comp 361 Spring 2005

        Multiplexingdemultiplexinggathering data from multiplesockets enveloping data with header (later used for demultiplexing)

        Multiplexing at send host

        delivering received segmentsto correct socket

        Demultiplexing at rcv host

        = socket = process

        application

        transport

        network

        link

        physical

        P1 application

        transport

        network

        link

        physical

        application

        transport

        network

        link

        physical

        P2P3 P4P1

        host 1 host 2 host 3

        3 Transport Layer 8Comp 361 Spring 2005

        Multiplexingdemultiplexingsegment - unit of data

        exchanged between transport layer entities

        aka TPDU transport protocol data unit

        Demultiplexing delivering received segments to correct app layer processes

        receiver

        applicationtransportnetwork

        M P2applicationtransportnetwork

        HtHn segment

        segment Mapplicationtransportnetwork

        P1M

        M MP3 P4

        segmentheader

        application-layerdata

        3 Transport Layer 9Comp 361 Spring 2005

        How demultiplexing workshost receives IP datagrams

        each datagram has source IP address destination IP addresseach datagram carries 1 transport-layer segmenteach segment has source destination port number (recall well-known port numbers for specific applications)

        host uses IP addresses amp port numbers to direct segment to appropriate socket

        source port dest port

        32 bits

        applicationdata

        (message)

        other header fields

        TCPUDP segment format

        3 Transport Layer 10Comp 361 Spring 2005

        Connectionless demultiplexingWhen host receives UDP segment

        checks destination port number in segmentdirects UDP segment to socket with that port number

        IP datagrams with different source IP addresses andor source port numbers directed to same socket

        Create sockets with port numbers

        DatagramSocket mySocket1 = new DatagramSocket(99111)

        DatagramSocket mySocket2 = new DatagramSocket(99222)

        UDP socket identified by two-tuple

        (dest IP address dest port number)

        3 Transport Layer 11Comp 361 Spring 2005

        Connectionless demux (cont)DatagramSocket serverSocket = new DatagramSocket(6428)

        ClientIPB

        P3

        clientIP A

        P1P1P3

        serverIP C

        SP 6428DP 9157

        SP 9157DP 6428

        SP 6428DP 5775

        SP 5775DP 6428

        SP provides ldquoreturn addressrdquo

        3 Transport Layer 12Comp 361 Spring 2005

        Connection-oriented demux

        TCP socket identified by 4-tuple

        source IP addresssource port numberdest IP addressdest port number

        recv host uses all four values to direct segment to appropriate socket

        Server host may support many simultaneous TCP sockets

        each socket identified by its own 4-tuple

        Web servers have different sockets for each connecting client

        non-persistent HTTP will have different socket for each request

        3 Transport Layer 13Comp 361 Spring 2005

        Connection-oriented demux(cont)

        ClientIPB

        P3

        clientIP A

        P1P1P3

        serverIP C

        SP 80DP 9157

        SP 9157DP 80

        SP 80DP 5775

        SP 5775DP 80

        P4

        3 Transport Layer 14Comp 361 Spring 2005

        Connection-oriented demux Threaded Web Server

        ClientIPB

        P1

        clientIP A

        P1P2

        serverIP C

        SP 9157DP 80

        SP 9157DP 80

        P4 P3

        D-IPCS-IP AD-IPC

        S-IP B

        SP 5775DP 80

        D-IPCS-IP B

        3 Transport Layer 15Comp 361 Spring 2005

        Chapter 3 outline

        31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

        35 Connection-oriented transport TCP

        segment structurereliable data transferflow controlconnection management

        36 Principles of congestion control37 TCP congestion control

        3 Transport Layer 16Comp 361 Spring 2005

        UDP User Datagram Protocol [RFC 768]

        ldquono frillsrdquo ldquobare bonesrdquoInternet transport protocolldquobest effortrdquo service UDP segments may be

        lostdelivered out of order to app

        connectionlessno handshaking between UDP sender receivereach UDP segment handled independently of others

        Why is there a UDPno connection establishment (which can add delay)simple no connection state at sender receiversmall segment header (8 Bytes)no congestion control UDP can blast away as fast as desired

        3 Transport Layer 17Comp 361 Spring 2005

        UDP moreoften used for streaming multimedia apps

        loss tolerantrate sensitive

        other UDP uses (why)

        DNS small delaySNMP stressful cond

        reliable transfer over UDP add reliability at application layer

        application-specific error recover

        source port dest port

        32 bits

        Applicationdata

        (message)

        length checksumLength in

        bytes of UDPsegmentincluding

        header

        UDP segment format

        3 Transport Layer 18Comp 361 Spring 2005

        UDP checksumGoal detect ldquoerrorsrdquo (egflipped bits) in transmitted

        segment

        Receivercompute checksum of received segmentcheck if computed checksum equals checksum field value

        NO - error detectedYES - no error detected But maybe errors nonetheless More later

        Receiver may choose to discard segment or send a warning to app in case error

        Sendertreat segment contents as sequence of 16-bit integerschecksum addition (1rsquo s complement sum) of segment contentssender puts checksum value into UDP checksum field

        3 Transport Layer 19Comp 361 Spring 2005

        Chapter 3 outline

        31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

        35 Connection-oriented transport TCP

        segment structurereliable data transferflow controlconnection management

        36 Principles of congestion control37 TCP congestion control

        3 Transport Layer 20Comp 361 Spring 2005

        Principles of Reliable data transferimportant in app transport link layerstop-10 list of important networking topics

        characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

        3 Transport Layer 21Comp 361 Spring 2005

        Reliable data transfer getting started

        sendside

        receiveside

        rdt_send() called from above (eg by app) Passed data to

        deliver to receiver upper layer

        udt_send() called by rdtto transfer packet over

        unreliable channel to receiver

        rdt_rcv() called when packet arrives on rcv-side of channel

        deliver_data() called by rdt to deliver data to upper

        3 Transport Layer 22Comp 361 Spring 2005

        Reliable data transfer getting startedWersquoll

        incrementally develop sender receiver sides of reliable data transfer protocol (rdt)consider only unidirectional data transfer

        but control info will flow on both directionsuse finite state machines (FSM) to specify sender receiver

        state1

        state2

        event causing state transitionactions taken on state transition

        state when in this ldquostaterdquo next state

        uniquely determined by next event

        eventactions

        3 Transport Layer 23Comp 361 Spring 2005

        Incremental Improvements

        rdt10 assumes every packet sent arrives and no errors introduced in transmission

        rdt20 assumes every packet sent arrives but some errors (bit flips) can occur within a packet Introduces concept of ACK and NAK

        rdt21 deals with corrupted ACKSNAKS

        rdt22 like rdt21 but does not need NAKs

        Rdt30 Allows packets to be lost

        Rdt10 reliable transfer over a reliable channel

        underlying channel perfectly reliableno bit errorsno loss of packets

        separate FSMs for sender receiversender sends data into underlying channelreceiver read data from underlying channel

        Wait for call from above packet = make_pkt(data)

        udt_send(packet)

        rdt_send(data)extract (packetdata)deliver_data(data)

        Wait for call from

        below

        rdt_rcv(packet)

        sender receiver

        rdt2.0: channel with bit errors

        underlying channel may flip bits in packet
          recall: UDP checksum to detect bit errors
        the question: how to recover from errors?
          acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
          negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
          sender retransmits pkt on receipt of NAK
          human scenarios using ACKs, NAKs?
        new mechanisms in rdt2.0 (beyond rdt1.0):
          error detection
          receiver feedback: control msgs (ACK, NAK) rcvr -> sender

        rdt2.0: FSM specification

        sender
          state "Wait for call from above":
            rdt_send(data)
              sndpkt = make_pkt(data, checksum)
              udt_send(sndpkt)
              -> "Wait for ACK or NAK"
          state "Wait for ACK or NAK":
            rdt_rcv(rcvpkt) && isNAK(rcvpkt)
              udt_send(sndpkt)
            rdt_rcv(rcvpkt) && isACK(rcvpkt)
              Λ (no action)
              -> "Wait for call from above"

        receiver
          state "Wait for call from below":
            rdt_rcv(rcvpkt) && corrupt(rcvpkt)
              udt_send(NAK)
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
              extract(rcvpkt, data)
              deliver_data(data)
              udt_send(ACK)

        rdt2.0: operation with no errors

        [Figure: the rdt2.0 FSM specification repeated, illustrating operation with no errors]

        rdt2.0: error scenario

        [Figure: the rdt2.0 FSM specification repeated, illustrating the error scenario]

        rdt2.0 has a fatal flaw!

        What happens if ACK/NAK corrupted?
          sender doesn't know what happened at receiver
          can't just retransmit: possible duplicate. But receiver is waiting!

        What to do?
          sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
          retransmit, but this might cause retransmission of correctly received pkt. Receiver won't know about duplication!

        Handling duplicates:
          sender adds sequence number (0, 1) to each pkt
          sender retransmits current pkt if ACK/NAK garbled
          receiver discards (doesn't deliver up) duplicate pkt
          duplicate packet is one with same sequence # as previous packet

        stop and wait: sender sends one packet, then waits for receiver response

        Sender: whenever sender receives a control message it sends a packet to receiver
          A valid ACK: sends next packet (if one exists) with new sequence #
          A NAK or corrupt response: resends old packet

        Receiver: sends ACK/NAK to sender
          If received packet is corrupt: send NAK
          If received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
          If received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmitted duplicate: send ACK

        Note: ACK/NAK do not contain sequence #
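The receiver rules above can be sketched in a few lines of Python. This is only an illustrative model, not the slides' FSM: the `Packet` class, its `corrupt` flag (standing in for a failed checksum test) and the `Rdt21Receiver` name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int               # alternating sequence number, 0 or 1
    data: str
    corrupt: bool = False  # hypothetical stand-in for a failed checksum test


class Rdt21Receiver:
    """Alternating-bit receiver: delivers new data once, re-ACKs duplicates."""

    def __init__(self):
        self.expected_seq = 0
        self.delivered = []

    def receive(self, pkt):
        if pkt.corrupt:
            return "NAK"                     # ask the sender to retransmit
        if pkt.seq == self.expected_seq:
            self.delivered.append(pkt.data)  # new data: deliver up
            self.expected_seq ^= 1           # flip 0 <-> 1
        # A valid packet, new or duplicate, is always ACKed.
        return "ACK"


rx = Rdt21Receiver()
replies = [
    rx.receive(Packet(0, "a")),                # new packet
    rx.receive(Packet(0, "a")),                # duplicate, e.g. after a garbled ACK
    rx.receive(Packet(1, "b", corrupt=True)),  # corrupted in transit
    rx.receive(Packet(1, "b")),                # retransmission arrives intact
]
print(replies, rx.delivered)
```

The duplicate of packet 0 is ACKed but not delivered a second time, which is the point of the sequence number.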

        rdt2.1: sender, handles garbled ACK/NAKs

        state "Wait for call 0 from above":
          rdt_send(data)
            sndpkt = make_pkt(0, data, checksum)
            udt_send(sndpkt)
            -> "Wait for ACK or NAK 0"
        state "Wait for ACK or NAK 0":
          rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
            udt_send(sndpkt)
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
            Λ
            -> "Wait for call 1 from above"
        state "Wait for call 1 from above":
          rdt_send(data)
            sndpkt = make_pkt(1, data, checksum)
            udt_send(sndpkt)
            -> "Wait for ACK or NAK 1"
        state "Wait for ACK or NAK 1":
          rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
            udt_send(sndpkt)
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
            Λ
            -> "Wait for call 0 from above"

        rdt2.1: receiver, handles garbled ACK/NAKs

        state "Wait for 0 from below":
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
            extract(rcvpkt, data)
            deliver_data(data)
            sndpkt = make_pkt(ACK, chksum)
            udt_send(sndpkt)
            -> "Wait for 1 from below"
          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
            sndpkt = make_pkt(NAK, chksum)
            udt_send(sndpkt)
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
            sndpkt = make_pkt(ACK, chksum)
            udt_send(sndpkt)
        state "Wait for 1 from below":
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
            extract(rcvpkt, data)
            deliver_data(data)
            sndpkt = make_pkt(ACK, chksum)
            udt_send(sndpkt)
            -> "Wait for 0 from below"
          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
            sndpkt = make_pkt(NAK, chksum)
            udt_send(sndpkt)
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
            sndpkt = make_pkt(ACK, chksum)
            udt_send(sndpkt)

        rdt2.1: discussion

        Sender:
          seq # added to pkt
          two seq. #'s (0, 1) will suffice. Why?
          must check if received ACK/NAK corrupted
          twice as many states
            state must "remember" whether "current" pkt has 0 or 1 seq. #

        Receiver:
          must check if received packet is duplicate
            state indicates whether 0 or 1 is expected pkt seq #
          note: receiver cannot know if its last ACK/NAK was received OK at sender

        rdt2.2: a NAK-free protocol

        same functionality as rdt2.1, using ACKs only
        instead of NAK, receiver sends ACK for last pkt received OK
          receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s were included in data packets but not in ACKs/NAKs)
        duplicate ACK at sender results in same action as NAK: retransmit current pkt

        rdt2.2: sender, receiver fragments

        sender FSM fragment
          state "Wait for call 0 from above":
            rdt_send(data)
              sndpkt = make_pkt(0, data, checksum)
              udt_send(sndpkt)
              -> "Wait for ACK 0"
          state "Wait for ACK 0":
            rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 1) )
              udt_send(sndpkt)
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
              Λ

        receiver FSM fragment
          transition into state "Wait for 0 from below":
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
              extract(rcvpkt, data)
              deliver_data(data)
              sndpkt = make_pkt(ACK1, chksum)
              udt_send(sndpkt)
          state "Wait for 0 from below":
            rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
              udt_send(sndpkt)

        rdt3.0: channels with errors and loss

        New assumption: underlying channel can also lose packets (data or ACKs)
          checksum, seq. #, ACKs, retransmissions will be of help, but not enough

        Q: how to deal with loss?
          sender waits until certain data or ACK lost, then retransmits
          yuck: drawbacks?

        Approach: sender waits "reasonable" amount of time for ACK
          retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
          if pkt (or ACK) just delayed (not lost):
            retransmission will be duplicate, but use of seq. #'s already handles this
            receiver must specify seq # of pkt being ACKed
          requires countdown timer

        rdt3.0 sender

        state "Wait for call 0 from above":
          rdt_send(data)
            sndpkt = make_pkt(0, data, checksum)
            udt_send(sndpkt)
            start_timer
            -> "Wait for ACK0"
          rdt_rcv(rcvpkt)
            Λ
        state "Wait for ACK0":
          rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 1) )
            Λ
          timeout
            udt_send(sndpkt)
            start_timer
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
            stop_timer
            -> "Wait for call 1 from above"
        state "Wait for call 1 from above":
          rdt_send(data)
            sndpkt = make_pkt(1, data, checksum)
            udt_send(sndpkt)
            start_timer
            -> "Wait for ACK1"
          rdt_rcv(rcvpkt)
            Λ
        state "Wait for ACK1":
          rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 0) )
            Λ
          timeout
            udt_send(sndpkt)
            start_timer
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
            stop_timer
            -> "Wait for call 0 from above"

        rdt3.0 in action

        [Figures not reproduced: two slides of rdt3.0 sender/receiver timelines]

        Performance of rdt3.0

        rdt3.0 works, but performance stinks
        example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

          T_transmit = L (packet length in bits) / R (transmission rate, bps) = (8 kb/pkt) / (10^9 b/sec) = 8 microsec

          U_sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027

        U_sender: utilization - fraction of time sender busy sending
        1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
        network protocol limits use of physical resources!

        rdt3.0: stop-and-wait operation

        [Timeline figure, sender and receiver:]
          first packet bit transmitted, t = 0
          last packet bit transmitted, t = L/R
          first packet bit arrives
          last packet bit arrives, send ACK
          ACK arrives, send next packet, t = RTT + L/R

          U_sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027

        Pipelined protocols

        Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
          range of sequence numbers must be increased
          buffering at sender and/or receiver

        Pipelined protocols

        Advantage: much better bandwidth utilization than stop-and-wait
        Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data
          Two generic approaches to solving this:
            - go-Back-N protocols
            - selective repeat protocols
        Note: TCP is not exactly either

        Pipelining: increased utilization

        [Timeline figure, sender and receiver:]
          first packet bit transmitted, t = 0
          last bit transmitted, t = L/R
          first packet bit arrives
          last packet bit arrives, send ACK
          last bit of 2nd packet arrives, send ACK
          last bit of 3rd packet arrives, send ACK
          ACK arrives, send next packet, t = RTT + L/R

          U_sender = (3 * L/R) / (RTT + L/R) = 0.024 / 30.008 = 0.0008

        Increase utilization by a factor of 3!
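The utilization figures for stop-and-wait and for three packets in flight can be checked numerically. The sketch below uses the slide's example values (1 Gbps link, 30 ms RTT, 8000-bit packet); the function name `sender_utilization` and its `packets_in_flight` parameter are assumptions made for this illustration.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, packets_in_flight=1):
    """Fraction of time the sender is busy: (n * L/R) / (RTT + L/R), capped at 1."""
    t_transmit = packet_bits / rate_bps  # L/R in seconds
    return min(1.0, packets_in_flight * t_transmit / (rtt_s + t_transmit))


L_BITS = 8000   # 1 KB packet, counted as 8000 bits as on the slide
R_BPS = 1e9     # 1 Gbps link
RTT_S = 0.030   # 2 x 15 ms end-to-end propagation delay

u_stop_and_wait = sender_utilization(L_BITS, R_BPS, RTT_S)
u_three_in_flight = sender_utilization(L_BITS, R_BPS, RTT_S, packets_in_flight=3)
# Stop-and-wait throughput in bytes per second: one packet per (RTT + L/R).
throughput = (L_BITS / 8) / (RTT_S + L_BITS / R_BPS)

print(f"{u_stop_and_wait:.5f} {u_three_in_flight:.4f} {throughput:.0f}")
```

Under this model, utilization grows linearly with the number of packets in flight until the sender is busy all the time.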

        Go-Back-N

        Sender:
          k-bit seq # in pkt header
          "window" of up to N consecutive unack'ed pkts allowed
          ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
            may receive duplicate ACKs (see receiver)
          Only one timer: for oldest unacknowledged pkt
          timeout(n): retransmit pkt n and all higher seq # pkts in window
          Called a sliding-window protocol

        GBN: Sender

        rdt_send() called: checks to see if window is full
          No: send out packet
          Yes: return data to application level
        Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer
        Timeout: resends ALL packets that have been sent but not yet acknowledged
          This is the only event that triggers a resend
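The three sender events can be sketched as a small Python class. This is a simplified model under stated assumptions: there is no real channel or clock, `sent` just logs the sequence numbers handed to the channel, sequence numbers are unbounded (no k-bit wraparound), and the `max()` guard against stale ACKs is an addition for the example. The class and method names are hypothetical.

```python
class GbnSender:
    """Go-Back-N sender state; 'sent' logs transmissions instead of using a channel."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1             # oldest unacknowledged sequence number
        self.nextseqnum = 1       # next sequence number to use
        self.buffer = {}          # seq -> data, kept until acknowledged
        self.sent = []            # sequence numbers handed to the channel, in order
        self.timer_running = False

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False          # window full: data goes back to the application
        self.buffer[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True   # timer covers the oldest unACKed packet
        self.nextseqnum += 1
        return True

    def on_ack(self, n):
        # Cumulative ACK: everything up to and including n is acknowledged.
        for seq in range(self.base, n + 1):
            self.buffer.pop(seq, None)
        self.base = max(self.base, n + 1)   # guard against stale ACKs (assumption)
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        # Resend every packet that was sent but not yet acknowledged.
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)


s = GbnSender(window_size=4)
accepted = [s.rdt_send(f"m{i}") for i in range(5)]  # fifth call hits a full window
s.on_ack(2)       # packets 1 and 2 acknowledged, window slides to base = 3
s.on_timeout()    # packets 3 and 4 are resent
print(accepted, s.sent, s.base)
```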

        GBN: sender extended FSM

        initially:
          base = 1
          nextseqnum = 1

        state "Wait":
          rdt_send(data)
            if (nextseqnum < base+N) {
              sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
              udt_send(sndpkt[nextseqnum])
              if (base == nextseqnum)
                start_timer
              nextseqnum++
            }
            else
              refuse_data(data)

          timeout
            start_timer
            udt_send(sndpkt[base])
            udt_send(sndpkt[base+1])
            ...
            udt_send(sndpkt[nextseqnum-1])

          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
            base = getacknum(rcvpkt)+1
            if (base == nextseqnum)
              stop_timer
            else
              start_timer

          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
            Λ

        GBN: receiver extended FSM

        initially:
          expectedseqnum = 1
          sndpkt = make_pkt(0, ACK, chksum)

        state "Wait":
          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
            extract(rcvpkt, data)
            deliver_data(data)
            sndpkt = make_pkt(expectedseqnum, ACK, chksum)
            udt_send(sndpkt)
            expectedseqnum++

          default
            udt_send(sndpkt)

        If expected packet received:
          Send ACK and deliver packet upstairs
        If out-of-order packet received:
          discard (don't buffer) -> no receiver buffering!
          Re-ACK pkt with highest in-order seq #
          may generate duplicate ACKs

        More on receiver

        The receiver always sends ACK for last correctly received packet with highest in-order seq #
          Receiver only sends ACKs (no NAKs)
          Can generate duplicate ACKs
          need only remember expectedseqnum
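A minimal sketch of this receiver behaviour, assuming packets are passed in as plain `(seq, data)` arguments with a `corrupt` flag standing in for the checksum test; the class name is hypothetical. The only state kept is `expectedseqnum`, as the slide notes.

```python
class GbnReceiver:
    """Go-Back-N receiver: accepts only the expected packet, re-ACKs otherwise."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)   # in-order: deliver up
            self.expectedseqnum += 1
        # Out-of-order or corrupt packets are discarded, not buffered.
        # The ACK always names the highest in-order seq # received (0 if none yet).
        return self.expectedseqnum - 1


r = GbnReceiver()
acks = [
    r.rdt_rcv(1, "a"),   # expected packet
    r.rdt_rcv(3, "c"),   # out of order: discarded, duplicate ACK for 1
    r.rdt_rcv(2, "b"),   # fills the gap
    r.rdt_rcv(3, "c"),   # retransmission of 3 is now in order
]
print(acks, r.delivered)
```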

        GBN in action

        GBN is easy to code but might have performance problems

        In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data

        Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets

        Selective Repeat

        receiver individually acknowledges all correctly received pkts
          buffers pkts, as needed, for eventual in-order delivery to upper layer
        sender only resends pkts for which ACK not received
          sender timer for each unACKed pkt
          Compare to GBN, which only had timer for base packet
        sender window
          N consecutive seq #'s
          again limits seq #s of sent, unACKed pkts
          Important: window size < seq # range

        Selective repeat: sender, receiver windows

        [Figure not reproduced]

        Selective repeat

        sender
          data from above:
            if next available seq # in window, send pkt
          timeout(n):
            resend pkt n, restart timer
          ACK(n) in [sendbase, sendbase+N]:
            mark pkt n as received
            if n smallest unACKed pkt, advance window base to next unACKed seq #

        receiver
          pkt n in [rcvbase, rcvbase+N-1]:
            send ACK(n)
            out-of-order: buffer
            in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
          pkt n in [rcvbase-N, rcvbase-1]:
            ACK(n) (note: this is a reACK)
          otherwise:
            ignore
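The receiver's three cases can be sketched as follows. This is an illustration under simplifying assumptions: sequence numbers are unbounded integers starting at 0 (no wraparound), and the class and method names are hypothetical.

```python
class SrReceiver:
    """Selective-repeat receiver: buffers out-of-order packets, delivers in order."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0       # lowest sequence number not yet received
        self.buffer = {}       # seq -> data for packets waiting on a gap
        self.delivered = []

    def on_packet(self, n, data):
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver any run of consecutive packets starting at rcvbase.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n           # ACK(n)
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n           # already delivered: re-ACK so the sender can advance
        return None            # outside both ranges: ignore


rcv = SrReceiver(window_size=4)
acks = [
    rcv.on_packet(1, "b"),   # out of order: buffered
    rcv.on_packet(0, "a"),   # fills the gap: 0 and 1 delivered
    rcv.on_packet(0, "a"),   # duplicate from below the window: re-ACK
    rcv.on_packet(9, "x"),   # far outside the window: ignored
]
print(acks, rcv.delivered, rcv.rcvbase)
```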

        Selective repeat in action

        [Figure not reproduced]

        Selective repeat: dilemma

        Example:
          seq #'s: 0, 1, 2, 3
          window size = 3
          receiver sees no difference in two scenarios!
          incorrectly passes duplicate data as new in (a)

        Q: what is relationship between seq # size and window size?
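One way to explore the question is to compare the sender's old window with the window the receiver moves to after accepting a full window of packets. If the two overlap modulo the sequence-number space, a retransmitted old packet is indistinguishable from a new one. The helper below is a hypothetical illustration, not part of the slides; the standard conclusion it reflects is that the window size must be at most half the sequence-number space.

```python
def ambiguous_seq_numbers(seq_space, window):
    """Sequence numbers that appear in both the old and the advanced window."""
    old_window = {n % seq_space for n in range(0, window)}
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return sorted(old_window & new_window)


print(ambiguous_seq_numbers(4, 3))  # the slide's example: seq #'s 0..3, window 3
print(ambiguous_seq_numbers(4, 2))  # window reduced to half the sequence space
```

With window size 3, sequence numbers 0 and 1 are ambiguous; with window size 2, the two windows are disjoint.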


        TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

        point-to-point:
          one sender, one receiver
        reliable, in-order byte stream:
          no "message boundaries"
        pipelined:
          TCP congestion and flow control set window size
        send & receive buffers
        full duplex data:
          bi-directional data flow in same connection
          MSS: maximum segment size
        connection-oriented:
          handshaking (exchange of control msgs) init's sender, receiver state before data exchange
        flow controlled:
          sender will not overwhelm receiver

        [Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

        More TCP Details

        Maximum Segment Size (MSS)
          Depends upon implementation (can often be set)
          The max amount of application-layer data in segment
          Application data + TCP header = TCP segment

        Three-way handshake
          Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
          Server responds with second special TCP segment (again no payload)
          Client responds with third special segment
            This can contain payload

        Even More TCP Details

        A TCP connection between client and server creates, in both client and server:
          (i) buffers
          (ii) variables and
          (iii) a socket connection to process

        TCP only exists in the two end machines
          No buffers and variables are allocated to the connection in any of the network elements between the host and server

        TCP segment structure

        (each row is 32 bits wide)
          source port # | dest port #
          sequence number
          acknowledgement number
          head len | not used | U A P R S F | Receive window
          checksum | Urg data pnter
          Options (variable length)
          application data (variable length)

        Field notes:
          URG: urgent data (generally not used)
          ACK: ACK # valid
          PSH: push data now (generally not used)
          RST, SYN, FIN: connection estab (setup, teardown commands)
          Receive window: # bytes rcvr willing to accept
          checksum: Internet checksum (as in UDP)
          sequence and acknowledgement numbers: counting by bytes of data (not segments!)

        TCP seq. #'s and ACKs

        Seq. #'s:
          byte stream "number" of first byte in segment's data
        ACKs:
          seq # of next byte expected from other side
          cumulative ACK
        Q: how receiver handles out-of-order segments
          A: TCP spec doesn't say - up to implementer

        simple telnet scenario (Host A, Host B; time runs downward):
          User types 'C'
          A -> B: Seq=42, ACK=79, data = 'C'
          host ACKs receipt of 'C', echoes back 'C'
          B -> A: Seq=79, ACK=43, data = 'C'
          host ACKs receipt of echoed 'C'
          A -> B: Seq=43, ACK=80
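The numbers in the telnet scenario follow from two rules: a reply's sequence number is the ACK number it just received, and its ACK number is the received sequence number plus the number of data bytes received. A small sketch, with a hypothetical helper name:

```python
def reply_numbers(seg_seq, seg_ack, seg_len):
    """Return (seq, ack) for the segment sent in reply to a received segment.

    seq of the reply = the other side's ACK number (next byte it expects from us);
    ack of the reply = received seq + number of data bytes received.
    """
    return seg_ack, seg_seq + seg_len


# Host A sends Seq=42, ACK=79 carrying the 1-byte 'C'.
b_seq, b_ack = reply_numbers(42, 79, 1)        # Host B echoes 'C'
# Host B's echo is also 1 byte long.
a_seq, a_ack = reply_numbers(b_seq, b_ack, 1)  # Host A ACKs the echo
print((b_seq, b_ack), (a_seq, a_ack))
```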

        TCP Round Trip Time and Timeout

        Q: how to estimate RTT?
          SampleRTT: measured time from segment transmission until ACK receipt
            ignore retransmissions
          SampleRTT will vary, want estimated RTT "smoother"
            average several recent measurements, not just current SampleRTT

        Q: how to set TCP timeout value?
          longer than RTT
            but RTT varies
          too short: premature timeout
            unnecessary retransmissions
          too long: slow reaction to segment loss

        TCP Round Trip Time and Timeout

          EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

        Exponential weighted moving average
        influence of past sample decreases exponentially fast
        typical value: α = 0.125

        Example RTT estimation

        [Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), ticks from 100 to 350; curves: SampleRTT and EstimatedRTT]

        TCP Round Trip Time and Timeout

        Setting the timeout
          EstimatedRTT plus "safety margin"
            large variation in EstimatedRTT -> larger safety margin
          first estimate of how much SampleRTT deviates from EstimatedRTT:

            DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|
            (typically, β = 0.25)

        Then set timeout interval:

            TimeoutInterval = EstimatedRTT + 4*DevRTT
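The two moving averages and the timeout formula can be combined into a small estimator. This is a sketch under assumptions the slides do not state: the first sample seeds EstimatedRTT, DevRTT starts at 0, and DevRTT is updated before EstimatedRTT (so the deviation is measured against the previous estimate). The class name is hypothetical.

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout, following the slide's formulas."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample  # assumption: seed with the first sample
        self.dev_rtt = 0.0                 # assumption: no deviation known yet

    def update(self, sample_rtt):
        # Deviation is measured against the estimate before this sample (assumption).
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
est.update(120.0)                        # one slower-than-usual sample
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval())
```

A single 120 ms sample moves the estimate only slightly (to 102.5 ms) but widens the safety margin, so the timeout rises to 122.5 ms.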


        TCP reliable data transfer

        TCP creates rdt service on top of IP's unreliable service
        Pipelined segments
        Cumulative acks
        TCP uses single retransmission timer
        Retransmissions are triggered by:
          timeout events
          duplicate acks
        Initially consider simplified TCP sender:
          ignore duplicate acks
          ignore flow control, congestion control

        TCP sender events:

        data rcvd from app:
          Create segment with seq #
          seq # is byte-stream number of first data byte in segment
          start timer if not already running (think of timer as for oldest unacked segment)
          expiration interval: TimeOutInterval
        timeout:
          retransmit segment that caused timeout
          restart timer
        Ack rcvd:
          If acknowledges previously unacked segments
            update what is known to be acked
            start timer if there are outstanding segments

        TCP sender (simplified)

          NextSeqNum = InitialSeqNum
          SendBase = InitialSeqNum

          loop (forever) {
            switch(event)

            event: data received from application above
              create TCP segment with sequence number NextSeqNum
              if (timer currently not running)
                start timer
              pass segment to IP
              NextSeqNum = NextSeqNum + length(data)

            event: timer timeout
              retransmit not-yet-acknowledged segment with smallest sequence number
              start timer

            event: ACK received, with ACK field value of y
              if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                  start timer
              }
          } /* end of loop forever */

        Comment:
          - SendBase-1: last cumulatively ack'ed byte
        Example:
          - SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked

        TCP: retransmission scenarios

        lost ACK scenario (Host A sender, Host B receiver):
          A -> B: Seq=92, 8 bytes data
          B -> A: ACK=100 (lost, marked X)
          Seq=92 timeout at A
          A -> B: Seq=92, 8 bytes data (retransmission)
          B -> A: ACK=100
          SendBase = 100

        premature timeout scenario (Host A sender, Host B receiver):
          A -> B: Seq=92, 8 bytes data
          A -> B: Seq=100, 20 bytes data
          B -> A: ACK=100
          B -> A: ACK=120
          Seq=92 timeout at A before the first ACK arrives
          A -> B: Seq=92, 8 bytes data (retransmission)
          B -> A: ACK=120
          SendBase values shown along A's timeline: 100, 120, 120

        TCP retransmission scenarios (more)

        Cumulative ACK scenario (Host A sender, Host B receiver):
          A -> B: Seq=92, 8 bytes data
          A -> B: Seq=100, 20 bytes data
          B -> A: ACK=100 (lost, marked X)
          B -> A: ACK=120 (arrives before the timeout)
          SendBase = 120

        TCP ACK generation [RFC 1122, RFC 2581]

        Event at Receiver -> TCP Receiver action

        1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
           -> Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

        2. Arrival of in-order segment with expected seq #. One other segment has ACK pending
           -> Immediately send single cumulative ACK, ACKing both in-order segments

        3. Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
           -> Immediately send duplicate ACK, indicating seq # of next expected byte

        4. Arrival of segment that partially or completely fills gap
           -> Immediately send ACK, provided that segment starts at lower end of gap

        More on Sender Policies

        Doubling the Timeout Interval
          Used by most TCP implementations
          If a timeout occurs then, after retransmission, Timeout Interval is doubled
          Intervals grow exponentially with each consecutive timeout
          When the timer is restarted because of (i) new data from above or (ii) ACK received, Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
          Limited form of congestion control

        Fast Retransmit

        Time-out period often relatively long:
          long delay before resending lost packet
        Detect lost segments via duplicate ACKs
          Sender often sends many segments back-to-back
          If segment is lost, there will likely be many duplicate ACKs
        If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
          fast retransmit: resend segment before timer expires

        Fast retransmit algorithm:

          event: ACK received, with ACK field value of y
            if (y > SendBase) {
              SendBase = y
              if (there are currently not-yet-acknowledged segments)
                start timer
            }
            else {   /* a duplicate ACK for already ACKed segment */
              increment count of dup ACKs received for y
              if (count of dup ACKs received for y == 3) {
                resend segment with sequence number y   /* fast retransmit */
              }
            }
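The ACK-handling branch can be traced with a small Python model. It is a sketch only: it returns a label instead of sending anything, keeps a single duplicate counter that is reset on each new ACK (an assumption; the pseudocode counts "for y"), and uses hypothetical names.

```python
class AckTracker:
    """Classifies incoming ACKs the way the fast-retransmit pseudocode does."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y        # new data acknowledged
            self.dup_acks = 0         # assumption: counter restarts for the new value
            return "new_ack"
        self.dup_acks += 1            # duplicate ACK for already ACKed data
        if self.dup_acks == 3:
            return "fast_retransmit"  # resend segment with sequence number y
        return "dup_ack"


tracker = AckTracker(send_base=92)
events = [tracker.on_ack(y) for y in (100, 100, 100, 100, 120)]
print(events, tracker.send_base)
```

The first ACK=100 is new; the third duplicate after it triggers the retransmission before any timer expires.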

        TCP: GBN or Selective Repeat?

        Basic TCP looks a lot like GBN

        Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

        This looks a lot like Selective Repeat

        TCP is a hybrid


        TCP Flow Control

        Sender should not overwhelm receiver's capacity to receive data
        If necessary, sender should slow down transmission rate to accommodate receiver's rate
        Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

        TCP Flow Control

        flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
        receive side of TCP connection has a receive buffer
        speed-matching service: matching the send rate to the receiving app's drain rate
        app process may be slow at reading from buffer


        TCP Flow control: how it works

        (Suppose TCP receiver discards out-of-order segments)
        spare room in buffer
          = RcvWindow
          = RcvBuffer - [LastByteRcvd - LastByteRead]

        Rcvr advertises spare room by including value of RcvWindow in segments
        Sender limits unACKed data to RcvWindow
          guarantees receive buffer doesn't overflow
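A short numeric sketch of both sides of this rule. The receiver formula is the one on the slide; the sender-side check uses `last_byte_sent` and `last_byte_acked`, which are names assumed for this example (the slide only says the sender limits unACKed data to RcvWindow).

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok_small = sender_may_send(5000, 4000, window, nbytes=1000)  # 2000 bytes unACKed in total
ok_large = sender_may_send(5000, 4000, window, nbytes=1200)  # 2200 bytes would exceed it
print(window, ok_small, ok_large)
```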

        Technical Issue

        Suppose RcvWindow=0 and that receiver has already ACK'd ALL packets in buffer
        Sender does not transmit new packets until it hears RcvWindow>0
        Receiver never sends RcvWindow>0 since it has no new ACKs to send to sender
        DEADLOCK!

        Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow>0 will be transmitted back to sender

        Note on UDP

        UDP has no flow control

        UDP appends packets to receiving socket's buffer. If buffer is full then packets are lost


        TCP Connection Management

        Recall: TCP sender, receiver establish "connection" before exchanging data segments
          initialize TCP variables:
            seq. #s
            buffers, flow control info (e.g. RcvWindow)
          client: connection initiator
            Socket clientSocket = new Socket("hostname", "port number");
          server: contacted by client
            Socket connectionSocket = welcomeSocket.accept();

        Three-way handshake:

        Step 1: client end system sends TCP SYN control segment to server
          specifies client_isn, the initial seq #
          No application data

        Step 2: server end system receives SYN, replies with SYNACK control segment
          ACKs received SYN
          allocates buffers
          Replies with client_isn+1 in ACK field to signal synchronization
          Specifies server_isn
          No application data

        TCP Connection Management (cont.)

        Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
          Allocates buffers
          Can include application data
          SYN=0 signals that connection is established
          server_isn+1 signals that seq # is synchronized

        [Figure: three-way handshake between client and server]
          client -> server: Connection request (SYN=1, seq=client_isn)
          server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
          client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
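The field values exchanged in the three steps can be written out as plain dictionaries. This is only a bookkeeping sketch with a hypothetical function name; it models the SYN flag and the seq/ack numbers and nothing else (no sockets, buffers or timers).

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dicts of the fields named on the slides."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


# Example initial sequence numbers, chosen arbitrarily for illustration.
segments = three_way_handshake(client_isn=1000, server_isn=5000)
for seg in segments:
    print(seg)
```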

        TCP Connection Management (cont.)

        Closing a connection:
          client closes socket: clientSocket.close();

        Step 1: client end system sends TCP FIN control segment to server

        Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

        [Figure: client sends FIN (close); server replies with ACK, then sends FIN (close); client replies with ACK and enters timed wait; closed]

        TCP Connection Management (cont.)

        Step 3: client receives FIN, replies with ACK.
          Enters "timed wait" - during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
          Closes down after timed wait

        Step 4: server receives ACK. Connection closed.

        Note: with small modification, can handle simultaneous FINs.

        [Figure: client sends FIN (closing); server replies with ACK, then sends FIN (closing); client replies with ACK and enters timed wait; server closed on receiving the ACK; client closed after timed wait]

        TCP Connection Management (cont)

        [Figure: example TCP server lifecycle]

        [Figure: example TCP client lifecycle]

        A few special cases

        We have not discussed what happens if both client and server decide to close down the connection at the same time

        It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment

        Chapter 3 outline

        3.1 Transport-layer services
        3.2 Multiplexing and demultiplexing
        3.3 Connectionless transport: UDP
        3.4 Principles of reliable data transfer
        3.5 Connection-oriented transport: TCP
            segment structure
            reliable data transfer
            flow control
            connection management
        3.6 Principles of congestion control
        3.7 TCP congestion control

        Principles of Congestion Control

        Congestion:
        informally: "too many sources sending too much data too fast for network to handle"
        different from flow control
        manifestations:
            lost packets (buffer overflow at routers)
            long delays (queuing in router buffers)
        a top-10 problem

        Causes/costs of congestion: scenario 1
        two senders, two receivers
        one router, infinite buffers
        no retransmission
        send rate: 0 to C/2

        large delays when congested
        maximum achievable throughput

        Causes/costs of congestion: scenario 2

        one router, finite buffers
        sender retransmission of lost packet

        (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
        (a) "magic" transmission: only send when there's space in the buffer
        (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
        (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

        "costs" of congestion:
        (b) and (c): more work (retransmissions) for given "goodput"
        (c): unneeded retransmissions: link carries multiple copies of pkt

        [Figure: panels (a), (b) and (c)]

        Causes/costs of congestion: scenario 3
        four senders
        multihop paths
        timeout/retransmit

        Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

        Causes/costs of congestion: scenario 3

        Another "cost" of congestion:
        when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

        Approaches towards congestion control

        Two broad approaches towards congestion control:

        End-end congestion control:
            no explicit feedback from network
            congestion inferred from end-system observed loss, delay
            approach taken by TCP

        Network-assisted congestion control:
            routers provide feedback to end systems
                single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
                explicit rate sender should send at

        Case study: ATM ABR congestion control

        ABR: available bit rate:
            "elastic service"
            if sender's path is "underloaded": sender should use available bandwidth
            if sender's path is congested: sender throttled to minimum guaranteed rate

        RM (resource management) cells:
            sent by sender, interspersed with data cells
            bits in RM cell set by switches ("network-assisted")
                NI bit: no increase in rate (mild congestion)
                CI bit: severe congestion indicator
            RM cells returned to sender by receiver, with bits intact
                small exception: see next page

        Case study: ATM ABR congestion control

        two-byte ER (explicit rate) field in RM cell
            congested switch may lower ER value in cell
            sender's send rate thus minimum supportable rate on path

        EFCI bit in data cells: set to 1 by congested switch
            signals congestion
            if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


        TCP Congestion Control
        end-end control (no network assistance)
        transmission rate limited by congestion window size, Congwin, over segments
        Congwin dynamically modified to reflect perceived congestion

        w segments, each with MSS bytes, sent in one RTT:

        $\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
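As a quick check of the formula, the sketch below evaluates it for the numbers used in the slow-start slide (w = 1, MSS = 500 bytes, RTT = 200 msec). The class and method names are hypothetical and chosen only for illustration.

```java
// Illustrative sketch: evaluates throughput = w * MSS / RTT.
class WindowThroughput {

    /** Bytes per second when w segments of mssBytes each are sent per RTT. */
    static double bytesPerSecond(int w, int mssBytes, double rttSeconds) {
        return w * (double) mssBytes / rttSeconds;
    }

    /** The same rate expressed in bits per second. */
    static double bitsPerSecond(int w, int mssBytes, double rttSeconds) {
        return 8.0 * bytesPerSecond(w, mssBytes, rttSeconds);
    }

    public static void main(String[] args) {
        // One 500-byte segment per 200 msec RTT: about 20 kbps.
        System.out.println(bitsPerSecond(1, 500, 0.2) + " bits/sec");
    }
}
```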

        To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

        Tools are "similar" to flow control: sender limits transmission using

        LastByteSent - LastByteAcked ≤ CongWin

        How does sender perceive congestion?
            loss event = timeout or 3 duplicate ACKs
            TCP sender reduces rate (CongWin) after loss event

        three mechanisms:
            AIMD = Additive Increase Multiplicative Decrease
            slow start = CongWin set to 1 and then grows exponentially
            conservative after timeout events

        TCP AIMD

        multiplicative decrease: cut CongWin in half after loss event

        additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing). Also known as congestion avoidance.

        [Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
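The AIMD rule can be written as a one-step update per RTT. The sketch below is a simplification that assumes the window is tracked in bytes and that at most one loss event is reported per RTT; the 1 MSS floor is an assumption made here so the window never reaches zero. `AimdSketch` and `nextWindow` are hypothetical names.

```java
// Illustrative sketch of AIMD: +1 MSS per RTT without loss, halve on a loss event.
class AimdSketch {

    /** Returns the congestion window (bytes) to use in the next RTT. */
    static int nextWindow(int congWin, int mss, boolean lossEvent) {
        if (lossEvent) {
            // Multiplicative decrease; assumed floor of 1 MSS.
            return Math.max(mss, congWin / 2);
        }
        // Additive increase: probe for bandwidth by one MSS per RTT.
        return congWin + mss;
    }

    public static void main(String[] args) {
        int congWin = 8000;
        int mss = 1000;
        // Lose a packet every 8th RTT to produce a sawtooth trace.
        for (int rtt = 1; rtt <= 24; rtt++) {
            congWin = nextWindow(congWin, mss, rtt % 8 == 0);
            System.out.println("RTT " + rtt + ": CongWin = " + congWin);
        }
    }
}
```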

        TCP Slow Start

        When connection begins, CongWin = 1 MSS
            Example: MSS = 500 bytes & RTT = 200 msec
            initial rate = 20 kbps

        available bandwidth may be >> MSS/RTT
            desirable to quickly ramp up to respectable rate

        When connection begins, increase rate exponentially fast until first loss event

        TCP Slow Start (more)

        When connection begins, increase rate exponentially until first loss event:
            double CongWin every RTT
            done by incrementing CongWin for every ACK received

        Summary: initial rate is slow but ramps up exponentially fast

        [Figure: Host A / Host B timeline. In successive RTTs, Host A sends one segment, then two segments, then four segments.]
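Why incrementing CongWin once per ACK doubles it every RTT can be seen in a small simulation. The sketch assumes an idealized connection with no loss, no threshold, and one ACK per segment; the window is counted in segments. `SlowStartSketch` is a hypothetical name.

```java
// Illustrative sketch: per-ACK increments during slow start double the window each RTT.
class SlowStartSketch {

    /** Congestion window (in segments) after the given number of RTTs, starting at 1. */
    static int windowAfter(int rtts) {
        int congWin = 1;
        for (int r = 0; r < rtts; r++) {
            int acksThisRtt = congWin; // one ACK returns for each segment sent
            for (int a = 0; a < acksThisRtt; a++) {
                congWin += 1;          // +1 segment (1 MSS) per ACK
            }
        }
        return congWin;
    }

    public static void main(String[] args) {
        for (int r = 0; r <= 4; r++) {
            System.out.println("after " + r + " RTTs: " + windowAfter(r) + " segments");
        }
    }
}
```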

        So far:
            Slow-Start ramps up exponentially
            Followed by AIMD (sawtooth pattern)

        Reality (TCP Reno):
            Introduce new variable threshold
            threshold initially very large
            Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
            Two different types of loss events:
                • 3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
                • Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start

        Reason for treating 3 dup ACKs differently than a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

        Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 after a loss event

        TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery

        Summary: TCP Congestion Control

        When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

        When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

        When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

        When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


        The Big Picture

        TCP sender congestion control

        Event: ACK receipt for previously unacked data
        State: Slow Start (SS)
        TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
        Commentary: resulting in a doubling of CongWin every RTT

        Event: ACK receipt for previously unacked data
        State: Congestion Avoidance (CA)
        TCP Sender Action: CongWin = CongWin + MSS · (MSS/CongWin)
        Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

        Event: loss event detected by triple duplicate ACK
        State: SS or CA
        TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
        Commentary: fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

        Event: timeout
        State: SS or CA
        TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
        Commentary: enter slow start

        Event: duplicate ACK
        State: SS or CA
        TCP Sender Action: increment duplicate ACK count for segment being acked
        Commentary: CongWin and Threshold not changed
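The event table above maps directly onto a small state machine. The following is a minimal sketch of those rules only, not a real TCP stack: `RenoSender` and its method names are hypothetical, windows are kept as doubles in bytes, and retransmission, timers, and flow control are left out.

```java
// Illustrative sketch of the sender-side congestion control table (Reno-style rules).
class RenoSender {
    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    final double mss;
    double congWin;
    double threshold;
    int dupAcks = 0;
    State state = State.SLOW_START;

    RenoSender(double mss, double initialThreshold) {
        this.mss = mss;
        this.congWin = mss;              // connection begins with CongWin = 1 MSS
        this.threshold = initialThreshold;
    }

    /** ACK receipt for previously unacked data. */
    void onNewAck() {
        dupAcks = 0;
        if (state == State.SLOW_START) {
            congWin += mss;              // doubles CongWin every RTT
            if (congWin > threshold) {
                state = State.CONGESTION_AVOIDANCE;
            }
        } else {
            congWin += mss * (mss / congWin); // about +1 MSS per RTT
        }
    }

    /** Duplicate ACK: only the counter changes until the third duplicate. */
    void onDuplicateAck() {
        dupAcks++;
        if (dupAcks == 3) {
            threshold = congWin / 2;
            congWin = Math.max(threshold, mss); // will not drop below 1 MSS
            state = State.CONGESTION_AVOIDANCE;
            dupAcks = 0;
        }
    }

    /** Timeout: fall back to slow start. */
    void onTimeout() {
        threshold = congWin / 2;
        congWin = mss;
        state = State.SLOW_START;
        dupAcks = 0;
    }

    public static void main(String[] args) {
        RenoSender s = new RenoSender(1000, 4000);
        for (int i = 0; i < 4; i++) s.onNewAck();
        System.out.println(s.state + " CongWin=" + s.congWin);
    }
}
```

Treating the third duplicate ACK inside `onDuplicateAck` keeps the "duplicate ACK" and "triple duplicate ACK" rows of the table in one place; a Tahoe-style sender would call the timeout logic there instead.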

        TCP throughput

        What's the average throughput of TCP as a function of window size and RTT?
            Ignore slow start
        Let W be the window size when loss occurs.
        When window is W, throughput is W/RTT
        Just after loss, window drops to W/2, throughput to W/(2·RTT)
        Average throughput: 0.75 W/RTT
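The 0.75 factor follows from averaging a window that climbs linearly from W/2 to W. The sketch below checks this numerically under the assumption that the window grows by exactly one segment per RTT and that W is even; `SawtoothAverage` is a hypothetical name.

```java
// Illustrative sketch: mean of a window that rises by 1 per RTT from W/2 up to W.
class SawtoothAverage {

    /** Average window (in segments) over one sawtooth cycle; assumes w is even. */
    static double averageWindow(int w) {
        double sum = 0;
        int samples = 0;
        for (int win = w / 2; win <= w; win++) {
            sum += win;
            samples++;
        }
        return sum / samples;
    }

    public static void main(String[] args) {
        int w = 100;
        System.out.println("average / W = " + averageWindow(w) / w);
    }
}
```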

        TCP Futures

        Example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
        Requires window size W = 83,333 in-flight segments
        Throughput in terms of loss rate:

        $\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$

        L = 2·10^-10. Wow!
        New versions of TCP for high-speed needed
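Both numbers on this slide can be reproduced from the two formulas. The sketch below assumes MSS is converted to bits (1500 bytes = 12,000 bits) and solves throughput = 1.22 · MSS / (RTT · sqrt(L)) for L; the class and method names are hypothetical.

```java
// Illustrative sketch: window size and loss rate needed for a target throughput.
class TcpFutures {

    /** In-flight segments needed: W = throughput * RTT / MSS. */
    static double windowSegments(double bitsPerSec, double rttSec, int mssBytes) {
        return bitsPerSec * rttSec / (mssBytes * 8.0);
    }

    /** Loss rate L solving throughput = 1.22 * MSS / (RTT * sqrt(L)). */
    static double requiredLossRate(double bitsPerSec, double rttSec, int mssBytes) {
        double root = 1.22 * mssBytes * 8.0 / (rttSec * bitsPerSec);
        return root * root;
    }

    public static void main(String[] args) {
        System.out.println("W = " + windowSegments(1e10, 0.1, 1500));
        System.out.println("L = " + requiredLossRate(1e10, 0.1, 1500));
    }
}
```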

        TCP Fairness
        Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

        [Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

        Why is TCP fair?
        Two competing sessions:
            additive increase gives slope of 1, as throughput increases
            multiplicative decrease decreases throughput proportionally

        [Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]

        Fairness (more)

        Fairness and UDP
            Multimedia apps often do not use TCP
                do not want rate throttled by congestion control
            Instead use UDP:
                pump audio/video at constant rate, tolerate packet loss
            Current research area: how to keep UDP from congesting the Internet

        Fairness and parallel TCP connections
            nothing prevents app from opening parallel connections between 2 hosts
            Web browsers do this
            Example: link of rate R supporting 9 connections
                new app asks for 1 TCP, gets rate R/10
                new app asks for 11 TCPs, gets R/2
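The example's arithmetic assumes the bottleneck is split evenly per connection, so an application's share is its number of connections divided by the total. A minimal sketch, with hypothetical names:

```java
// Illustrative sketch: fraction of link rate R obtained by opening parallel connections.
class ParallelShare {

    /** Share of R for an app opening `opened` connections next to `existing` ones. */
    static double share(int existing, int opened) {
        return (double) opened / (existing + opened);
    }

    public static void main(String[] args) {
        System.out.println("1 connection:   " + share(9, 1) + " of R");  // R/10
        System.out.println("11 connections: " + share(9, 11) + " of R"); // 11/20, roughly R/2
    }
}
```

With 11 connections the exact share is 11/20 of R, which the slide rounds to R/2.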

        TCP Latency Modeling

        Notation, assumptions:
            Assume one link between client and server of rate R
            S: MSS (bits)
            O: object size (bits)
            no retransmissions (no loss, no corruption)

        Window size:
            First assume: fixed congestion window, W segments
            Then dynamic window, modeling slow start

        Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

        Ignoring congestion, delay is influenced by:
            TCP connection establishment
            data transmission delay
            slow start

        Fixed Congestion Window (W)
        Two cases:

        1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
            Latency = 2RTT + O/R

        2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
            Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
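The two cases can be folded into one function. The slide does not define K for the fixed-window case; the sketch below assumes K = ceil(O / (W·S)), the number of windows needed to cover the object. All names are hypothetical, and O, S are in bits, R in bits/sec, RTT in seconds.

```java
// Illustrative sketch of the fixed-window latency model (both cases).
class FixedWindowLatency {

    static double latency(double o, double s, double r, double rtt, int w) {
        double base = 2 * rtt + o / r;
        // Time the server stalls after each window while waiting for the first ACK.
        double stall = s / r + rtt - w * s / r;
        if (stall <= 0) {
            return base;                        // case 1: ACKs return in time, no stalls
        }
        long k = (long) Math.ceil(o / (w * s)); // assumed definition of K
        return base + (k - 1) * stall;          // case 2: stall after all but the last window
    }

    public static void main(String[] args) {
        // O = 8000 bits, S = 1000 bits, R = 1000 bits/sec, RTT = 2 sec
        System.out.println("W=2: " + latency(8000, 1000, 1000, 2, 2));
        System.out.println("W=4: " + latency(8000, 1000, 1000, 2, 4));
    }
}
```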

        Fixed congestion window (1)

        First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

        latency = 2RTT + O/R

        Fixed congestion window (2)

        Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

        latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

        TCP Latency Modeling: Slow Start (1)

        Now suppose window grows according to slow start (with no threshold and no loss events)

        Will show that the delay for one object is:

        $$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

        where P is the number of times TCP idles at server:

        $$P = \min\{Q, K-1\}$$

        - where Q is the number of times the server idles if the object were of infinite size
        - and K is the number of windows that cover the object

        TCP Latency Modeling: Slow Start (2)

        Delay components:
        • 2 RTT for connection estab and request
        • O/R to transmit object
        • time server idles due to slow start

        Server idles: P = min{K-1, Q} times

        Example:
        • O/S = 15 segments
        • K = 4 windows
        • Q = 2
        • P = min{K-1, Q} = 2

        Server idles P=2 times

        [Figure: timeline (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]
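The example (O/S = 15, K = 4, Q = 2, P = 2) can be reproduced in code. In this sketch K is found by summing doubling windows until the object is covered, and Q by counting the windows whose idle time S/R + RTT - 2^(k-1)·S/R is positive. The values S/R = 1 sec and RTT = 2 sec are assumptions chosen so that Q = 2; all names are hypothetical, and O > 0 is assumed.

```java
// Illustrative sketch of the slow-start latency model.
class SlowStartLatency {

    /** K: number of windows (1, 2, 4, ... segments) needed to cover the object. */
    static int windowsCovering(double o, double s) {
        int k = 0;
        double covered = 0;
        double window = s;
        while (covered < o) {
            covered += window;
            window *= 2;
            k++;
        }
        return k;
    }

    /** Q: number of times the server would idle for an infinitely large object. */
    static int idlesForInfiniteObject(double s, double r, double rtt) {
        int q = 0;
        // Idle time after window q+1 is S/R + RTT - 2^q * S/R.
        while (s / r + rtt - Math.pow(2, q) * s / r > 0) {
            q++;
        }
        return q;
    }

    /** Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R with P = min{Q, K-1}. */
    static double latency(double o, double s, double r, double rtt) {
        int p = Math.min(idlesForInfiniteObject(s, r, rtt), windowsCovering(o, s) - 1);
        return 2 * rtt + o / r + p * (rtt + s / r) - (Math.pow(2, p) - 1) * s / r;
    }

    public static void main(String[] args) {
        // O = 15000 bits, S = 1000 bits, R = 1000 bits/sec, RTT = 2 sec
        System.out.println("K = " + windowsCovering(15000, 1000));
        System.out.println("Q = " + idlesForInfiniteObject(1000, 1000, 2));
        System.out.println("latency = " + latency(15000, 1000, 1000, 2) + " sec");
    }
}
```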

        TCP Latency Modeling (3)

        $\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

        $2^{k-1}\frac{S}{R}$ = time to transmit the kth window

        $\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

        $$\begin{aligned}
        \text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
        &= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
        &= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
        \end{aligned}$$

        [Figure: timeline (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

        TCP Latency Modeling (4)
        Recall K = number of windows that cover object

        How do we calculate K?

        $$\begin{aligned}
        K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
        &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
        &= \min\{k : 2^k - 1 \ge O/S\} \\
        &= \min\{k : k \ge \log_2(O/S + 1)\} \\
        &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
        \end{aligned}$$

        Calculation of Q, number of idles for infinite-size object, is similar.

        HTTP Modeling
        Assume Web page consists of:
            1 base HTML page (of size O bits)
            M images (each of size O bits)

        Non-persistent HTTP:
            M+1 TCP connections in series
            Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

        Persistent HTTP:
            2 RTT to request and receive base HTML file
            1 RTT to request and receive M images
            Response time = (M+1)O/R + 3RTT + sum of idle times

        Non-persistent HTTP with X parallel connections:
            Suppose M/X integer
            1 TCP connection for base file
            M/X sets of parallel connections for images
            Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
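The three response-time formulas differ only in their RTT term, which a side-by-side sketch makes visible. The sum of idle times is passed in as a parameter rather than computed, and the example values (O = 40,000 bits, R = 1 Mbps, RTT = 100 msec, M = 10, X = 5, idle time 0) are assumptions for illustration. `HttpResponseTime` and its methods are hypothetical names.

```java
// Illustrative sketch of the three HTTP response-time formulas.
class HttpResponseTime {

    /** M+1 connections in series. */
    static double nonPersistent(int m, double o, double r, double rtt, double idle) {
        return (m + 1) * o / r + (m + 1) * 2 * rtt + idle;
    }

    /** One connection reused for the base page and all images. */
    static double persistent(int m, double o, double r, double rtt, double idle) {
        return (m + 1) * o / r + 3 * rtt + idle;
    }

    /** Non-persistent with X parallel connections; M/X must be an integer. */
    static double parallel(int m, int x, double o, double r, double rtt, double idle) {
        if (m % x != 0) {
            throw new IllegalArgumentException("M/X must be an integer");
        }
        return (m + 1) * o / r + (m / x + 1) * 2 * rtt + idle;
    }

    public static void main(String[] args) {
        System.out.println("non-persistent: " + nonPersistent(10, 40000, 1e6, 0.1, 0));
        System.out.println("persistent:     " + persistent(10, 40000, 1e6, 0.1, 0));
        System.out.println("parallel (X=5): " + parallel(10, 5, 40000, 1e6, 0.1, 0));
    }
}
```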

        HTTP Response time (in seconds)
        RTT = 100 msec, O = 5 Kbytes, M=10 and X=5

        [Figure: bar chart of response time (axis 0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

        For low bandwidth, connection & response time dominated by transmission time.
        Persistent connections only give minor improvement over parallel connections.

        HTTP Response time (in seconds)
        RTT = 1 sec, O = 5 Kbytes, M=10 and X=5

        [Figure: bar chart of response time (axis 0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

        For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.

        Chapter 3: Summary
        principles behind transport layer services:
            multiplexing, demultiplexing
            reliable data transfer
            flow control
            congestion control
        instantiation and implementation in the Internet
            UDP
            TCP

        Next:
            leaving the network "edge" (application, transport layers)
            into the network "core"

        • Chapter 3 Transport Layer last revised 160305
        • Chapter 3 outline
        • Transport services and protocols
        • Transport vs network layer
        • Transport-layer protocols
        • Chapter 3 outline
        • Multiplexing/demultiplexing
        • Multiplexing/demultiplexing
        • How demultiplexing works
        • Connectionless demultiplexing
        • Connectionless demux (cont)
        • Connection-oriented demux
        • Connection-oriented demux (cont)
        • Connection-oriented demux Threaded Web Server
        • Chapter 3 outline
        • UDP User Datagram Protocol [RFC 768]
        • UDP more
        • UDP checksum
        • Chapter 3 outline
        • Principles of Reliable data transfer
        • Reliable data transfer getting started
        • Reliable data transfer getting started
        • Incremental Improvements
        • rdt1.0: reliable transfer over a reliable channel
        • rdt2.0: channel with bit errors
        • rdt2.0: FSM specification
        • rdt2.0: operation with no errors
        • rdt2.0: error scenario
        • rdt2.0 has a fatal flaw
        • rdt2.1: sender handles garbled ACK/NAKs
        • rdt2.1: receiver handles garbled ACK/NAKs
        • rdt2.1: discussion
        • rdt2.2: a NAK-free protocol
        • rdt2.2: sender, receiver fragments
        • rdt3.0: channels with errors and loss
        • rdt3.0 sender
        • rdt3.0 in action
        • rdt3.0 in action
        • Performance of rdt3.0
        • rdt3.0: stop-and-wait operation
        • Pipelined protocols
        • Pipelined protocols
        • Pipelining increased utilization
        • Go-Back-N
        • GBN Sender
        • GBN sender extended FSM
        • GBN receiver extended FSM
        • More on receiver
        • GBN in action
        • Selective Repeat
        • Selective repeat sender receiver windows
        • Selective repeat
        • Selective repeat in action
        • Selective repeat dilemma
        • Chapter 3 outline
        • TCP Overview (RFCs 793, 1122, 1323, 2018, 2581)
        • More TCP Details
        • Even More TCP Details
        • TCP segment structure
        • TCP seq. #'s and ACKs
        • TCP Round Trip Time and Timeout
        • TCP Round Trip Time and Timeout
        • Example RTT estimation
        • TCP Round Trip Time and Timeout
        • Chapter 3 outline
        • TCP reliable data transfer
        • TCP sender events
        • TCP sender(simplified)
        • TCP retransmission scenarios
        • TCP retransmission scenarios (more)
        • TCP ACK generation [RFC 1122 RFC 2581]
        • More on Sender Policies
        • Fast Retransmit
        • Fast retransmit algorithm
        • TCP GBN or Selective Repeat
        • Chapter 3 outline
        • TCP Flow Control
        • TCP Flow Control
        • TCP segment structure
        • TCP Flow control how it works
        • Technical Issue
        • Chapter 3 outline
        • TCP Connection Management
        • TCP Connection Management (cont)
        • TCP Connection Management (cont)
        • TCP Connection Management (cont)
        • TCP Connection Management (cont)
        • A few special cases
        • Chapter 3 outline
        • Principles of Congestion Control
        • Causes/costs of congestion: scenario 1
        • Causes/costs of congestion: scenario 2
        • Causes/costs of congestion: scenario 3
        • Causes/costs of congestion: scenario 3
        • Approaches towards congestion control
        • Case study: ATM ABR congestion control
        • Case study: ATM ABR congestion control
        • Chapter 3 outline
        • TCP Congestion Control
        • TCP AIMD
        • TCP Slow Start
        • TCP Slow Start (more)
        • Summary TCP Congestion Control
        • The Big Picture
        • TCP sender congestion control
        • TCP throughput
        • TCP Futures
        • TCP Fairness
        • Why is TCP fair
        • Fairness (more)
        • TCP Latency Modeling
        • Fixed Congestion Window (W)
        • Fixed congestion window (1)
        • Fixed congestion window (2)
        • TCP Latency Modeling Slow Start (1)
        • TCP Latency Modeling Slow Start (2)
        • TCP Latency Modeling (3)
        • TCP Latency Modeling (4)
        • HTTP Modeling
        • Chapter 3 Summary

          Transport-layer protocols

          Internet transport services:
          reliable, in-order unicast delivery (TCP)
              congestion control
              flow control
              connection setup
          unreliable ("best-effort"), unordered unicast or multicast delivery: UDP
          services not available:
              real-time
              bandwidth guarantees
              reliable multicast

          [Figure: logical end-end transport between two end systems running the full application/transport/network/data link/physical stack, across intermediate nodes that run only the network, data link and physical layers]


          Multiplexing/demultiplexing

          Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)

          Demultiplexing at rcv host: delivering received segments to correct socket

          [Figure: host 1, host 2 and host 3, each with an application/transport/network/link/physical stack; processes P1 to P4 attach to sockets (legend: socket, process)]

          Multiplexing/demultiplexing

          segment: unit of data exchanged between transport layer entities
              aka TPDU: transport protocol data unit

          Demultiplexing: delivering received segments to correct app layer processes

          [Figure: a receiver and two other hosts, each with application/transport/network layers and processes P1 to P4; each segment consists of a segment header (Ht) and application-layer data (M), carried inside a network-layer header (Hn)]

          How demultiplexing works
          host receives IP datagrams
              each datagram has source IP address, destination IP address
              each datagram carries 1 transport-layer segment
              each segment has source, destination port number (recall: well-known port numbers for specific applications)
          host uses IP addresses & port numbers to direct segment to appropriate socket

          [Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #; other header fields; application data (message)]

          Connectionless demultiplexing

          Create sockets with port numbers:

          DatagramSocket mySocket1 = new DatagramSocket(99111);
          DatagramSocket mySocket2 = new DatagramSocket(99222);

          UDP socket identified by two-tuple:

          (dest IP address, dest port number)

          When host receives UDP segment:
              checks destination port number in segment
              directs UDP segment to socket with that port number

          IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket

          Connectionless demux (cont)

          DatagramSocket serverSocket = new DatagramSocket(6428);

          [Figure: two clients (IP A and IP B) and a server (IP C) with processes P1 and P3. Segments toward the server are labeled SP 9157, DP 6428 and SP 5775, DP 6428; the replies are labeled SP 6428, DP 9157 and SP 6428, DP 5775.]

          SP provides "return address"

          Connection-oriented demux

          TCP socket identified by 4-tuple:
              source IP address
              source port number
              dest IP address
              dest port number
          recv host uses all four values to direct segment to appropriate socket

          Server host may support many simultaneous TCP sockets:
              each socket identified by its own 4-tuple
          Web servers have different sockets for each connecting client
              non-persistent HTTP will have different socket for each request
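The difference between the UDP two-tuple and the TCP 4-tuple can be sketched as two lookup tables. This is a toy model of a single receiving host, so the UDP table is keyed by destination port only; socket "handles" are plain strings, and `DemuxSketch` with its methods is a hypothetical name, not an operating-system or `java.net` API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: demultiplexing tables on one receiving host.
class DemuxSketch {
    // UDP: destination port alone selects the socket on this host.
    final Map<Integer, String> udpSockets = new HashMap<>();
    // TCP: the full 4-tuple selects the socket.
    final Map<String, String> tcpSockets = new HashMap<>();

    /** Builds a lookup key from the TCP 4-tuple. */
    static String key(String srcIp, int srcPort, String dstIp, int dstPort) {
        return srcIp + ":" + srcPort + "->" + dstIp + ":" + dstPort;
    }

    /** Source address and port are ignored, so different senders share one socket. */
    String demuxUdp(int dstPort) {
        return udpSockets.get(dstPort);
    }

    /** All four values are used, so each connection gets its own socket. */
    String demuxTcp(String srcIp, int srcPort, String dstIp, int dstPort) {
        return tcpSockets.get(key(srcIp, srcPort, dstIp, dstPort));
    }

    public static void main(String[] args) {
        DemuxSketch host = new DemuxSketch();
        host.udpSockets.put(6428, "serverSocket");
        host.tcpSockets.put(key("A", 9157, "C", 80), "connectionSocket for A");
        host.tcpSockets.put(key("B", 9157, "C", 80), "connectionSocket for B");
        System.out.println(host.demuxUdp(6428));
        System.out.println(host.demuxTcp("B", 9157, "C", 80));
    }
}
```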

          Connection-oriented demux (cont)

          [Figure: two clients (IP A and IP B) and a server (IP C) with processes P1, P3 and P4. Segments toward the server are labeled SP 9157, DP 80 and SP 5775, DP 80; the replies are labeled SP 80, DP 9157 and SP 80, DP 5775.]

          Connection-oriented demux: Threaded Web Server

          [Figure: clients IP A and IP B and a threaded Web server at IP C, with processes P1 to P4. Segments toward the server are labeled SP 9157, DP 80, S-IP A, D-IP C; SP 9157, DP 80, S-IP B, D-IP C; and SP 5775, DP 80, S-IP B, D-IP C.]


          UDP: User Datagram Protocol [RFC 768]

          "no frills," "bare bones" Internet transport protocol
          "best effort" service; UDP segments may be:
              lost
              delivered out of order to app
          connectionless:
              no handshaking between UDP sender, receiver
              each UDP segment handled independently of others

          Why is there a UDP?
              no connection establishment (which can add delay)
              simple: no connection state at sender, receiver
              small segment header (8 bytes)
              no congestion control: UDP can blast away as fast as desired

          UDP: more
          often used for streaming multimedia apps
              loss tolerant
              rate sensitive
          other UDP uses (why?):
              DNS: small delay
              SNMP: stressful conditions
          reliable transfer over UDP: add reliability at application layer
              application-specific error recovery

          [Figure: UDP segment format, 32 bits wide: source port #, dest port #; length, checksum; application data (message). Length is in bytes of the UDP segment, including header.]

          UDP checksum
          Goal: detect "errors" (e.g., flipped bits) in transmitted segment

          Sender:
              treat segment contents as sequence of 16-bit integers
              checksum: addition (1's complement sum) of segment contents
              sender puts checksum value into UDP checksum field

          Receiver:
              compute checksum of received segment
              check if computed checksum equals checksum field value:
                  NO: error detected
                  YES: no error detected. But maybe errors nonetheless? More later.
              Receiver may choose to discard segment or send a warning to app in case of error
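A short sketch of the sender and receiver steps follows. It adds 16-bit words with end-around carry and, as RFC 768 specifies, stores the 1's complement of that sum in the checksum field. It is a simplification: the pseudo-header and odd-length padding that real UDP includes are omitted, and the three example words are arbitrary. `UdpChecksum` and its methods are hypothetical names.

```java
// Illustrative sketch of the Internet checksum over 16-bit words (no pseudo-header).
class UdpChecksum {

    /** 1's complement sum: 16-bit addition with any carry wrapped around. */
    static int onesComplementSum(int[] words16) {
        int sum = 0;
        for (int w : words16) {
            sum += w & 0xFFFF;
            if ((sum & 0x10000) != 0) {
                sum = (sum & 0xFFFF) + 1; // wrap the carry back into the low bits
            }
        }
        return sum;
    }

    /** Sender: checksum field is the 1's complement of the sum. */
    static int checksum(int[] words16) {
        return ~onesComplementSum(words16) & 0xFFFF;
    }

    /** Receiver: recompute over the received words and compare with the field. */
    static boolean verify(int[] receivedWords16, int checksumField) {
        return checksum(receivedWords16) == checksumField;
    }

    public static void main(String[] args) {
        int[] segment = {0x6660, 0x5555, 0x8F0C};
        int field = checksum(segment);
        System.out.println("checksum = 0x" + Integer.toHexString(field));
        segment[1] ^= 0x0004; // flip one bit in transit
        System.out.println("verifies after bit flip: " + verify(segment, field));
    }
}
```

Because the check is only a sum, some multi-bit errors cancel out and go undetected, which is the caveat the slide raises with "maybe errors nonetheless".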


          Principles of Reliable data transfer
          important in app., transport, link layers
          top-10 list of important networking topics

          characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

          Reliable data transfer: getting started

          send side:
              rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
              udt_send(): called by rdt to transfer packet over unreliable channel to receiver

          receive side:
              rdt_rcv(): called when packet arrives on rcv-side of channel
              deliver_data(): called by rdt to deliver data to upper layer

          Reliable data transfer: getting started
          We'll:
              incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
              consider only unidirectional data transfer
                  but control info will flow in both directions
              use finite state machines (FSM) to specify sender, receiver

          [Figure: FSM notation. An arrow from state 1 to state 2 is labeled with the event causing the state transition above the actions taken on the state transition (event / actions). State: when in this "state", next state is uniquely determined by next event.]

          Incremental improvements

          rdt1.0: assumes every packet sent arrives and no errors are introduced in transmission
          rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet; introduces the concepts of ACK and NAK
          rdt2.1: deals with corrupted ACKs/NAKs
          rdt2.2: like rdt2.1, but does not need NAKs
          rdt3.0: allows packets to be lost

          rdt1.0: reliable transfer over a reliable channel

          underlying channel perfectly reliable
            no bit errors
            no loss of packets
          separate FSMs for sender and receiver
            sender sends data into underlying channel
            receiver reads data from underlying channel

          sender (single state: Wait for call from above)
            rdt_send(data)
              packet = make_pkt(data)
              udt_send(packet)

          receiver (single state: Wait for call from below)
            rdt_rcv(packet)
              extract(packet, data)
              deliver_data(data)

          rdt2.0: channel with bit errors

          underlying channel may flip bits in packet
            recall: UDP checksum to detect bit errors
          the question: how to recover from errors?
            acknowledgements (ACKs): receiver explicitly tells sender that pkt was received OK
            negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
            sender retransmits pkt on receipt of NAK
            human scenarios using ACKs, NAKs?
          new mechanisms in rdt2.0 (beyond rdt1.0):
            error detection
            receiver feedback: control msgs (ACK, NAK) rcvr -> sender

          rdt2.0: FSM specification

          sender
            state: Wait for call from above
              rdt_send(data)
                sndpkt = make_pkt(data, checksum)
                udt_send(sndpkt)
                -> Wait for ACK or NAK
            state: Wait for ACK or NAK
              rdt_rcv(rcvpkt) && isNAK(rcvpkt)
                udt_send(sndpkt)
              rdt_rcv(rcvpkt) && isACK(rcvpkt)
                Λ
                -> Wait for call from above

          receiver
            state: Wait for call from below
              rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                udt_send(NAK)
              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
                extract(rcvpkt, data)
                deliver_data(data)
                udt_send(ACK)

          rdt2.0: operation with no errors

          [Figure: the rdt2.0 sender and receiver FSMs, as in the FSM specification.]

          rdt2.0: error scenario

          [Figure: the rdt2.0 sender and receiver FSMs, as in the FSM specification.]

          rdt2.0 has a fatal flaw!

          What happens if an ACK/NAK is corrupted?
            sender doesn't know what happened at the receiver
            can't just retransmit: possible duplicate. But the receiver is waiting!

          What to do?
            sender ACKs/NAKs the receiver's ACK/NAK? What if the sender's ACK/NAK is corrupted?
            retransmit, but this might cause retransmission of a correctly received pkt; the receiver won't know about the duplication

          Handling duplicates:
            sender adds a sequence number (0, 1) to each pkt
            sender retransmits current pkt if ACK/NAK is garbled
            receiver discards (doesn't deliver up) a duplicate pkt
            a duplicate packet is one with the same sequence number as the previous packet

          stop and wait: sender sends one packet, then waits for the receiver's response

          Sender: whenever the sender receives a control message, it sends a packet to the receiver
            a valid ACK: sends the next packet (if one exists) with a new sequence number
            a NAK or corrupt response: resends the old packet

          Receiver: sends ACK/NAK to sender
            if the received packet is corrupt: send NAK
            if the received packet is valid and has a different sequence number from the previous packet: send ACK and deliver the new data up
            if the received packet is valid and has the same sequence number as the previous packet, i.e., is a retransmitted duplicate: send ACK

          Note: ACKs/NAKs do not contain a sequence number
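The receiver rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Packet` class, its `corrupt` flag (standing in for a failed checksum test), and the `Rdt21Receiver` name are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int               # alternating sequence number: 0 or 1
    data: str
    corrupt: bool = False  # hypothetical stand-in for a failed checksum


class Rdt21Receiver:
    """Stop-and-wait receiver with a 1-bit sequence number."""

    def __init__(self):
        self.expected = 0      # sequence number of the next new packet
        self.delivered = []    # data handed to the upper layer

    def on_packet(self, pkt):
        if pkt.corrupt:
            return "NAK"                      # corrupt packet: ask for a resend
        if pkt.seq == self.expected:
            self.delivered.append(pkt.data)   # new data: deliver it up
            self.expected ^= 1                # flip the expected bit
        # a valid duplicate is ACKed again but not delivered a second time
        return "ACK"
```

Feeding the same valid packet twice yields two ACKs but only one delivery, which is the duplicate handling described in the rules.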

          rdt2.1: sender, handles garbled ACK/NAKs

          state: Wait for call 0 from above
            rdt_send(data)
              sndpkt = make_pkt(0, data, checksum)
              udt_send(sndpkt)
              -> Wait for ACK or NAK 0
          state: Wait for ACK or NAK 0
            rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
              udt_send(sndpkt)
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
              Λ
              -> Wait for call 1 from above
          state: Wait for call 1 from above
            rdt_send(data)
              sndpkt = make_pkt(1, data, checksum)
              udt_send(sndpkt)
              -> Wait for ACK or NAK 1
          state: Wait for ACK or NAK 1
            rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
              udt_send(sndpkt)
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
              Λ
              -> Wait for call 0 from above

          rdt2.1: receiver, handles garbled ACK/NAKs

          state: Wait for 0 from below
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
              extract(rcvpkt, data)
              deliver_data(data)
              sndpkt = make_pkt(ACK, chksum)
              udt_send(sndpkt)
              -> Wait for 1 from below
            rdt_rcv(rcvpkt) && corrupt(rcvpkt)
              sndpkt = make_pkt(NAK, chksum)
              udt_send(sndpkt)
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
              sndpkt = make_pkt(ACK, chksum)
              udt_send(sndpkt)
          state: Wait for 1 from below
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
              extract(rcvpkt, data)
              deliver_data(data)
              sndpkt = make_pkt(ACK, chksum)
              udt_send(sndpkt)
              -> Wait for 0 from below
            rdt_rcv(rcvpkt) && corrupt(rcvpkt)
              sndpkt = make_pkt(NAK, chksum)
              udt_send(sndpkt)
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
              sndpkt = make_pkt(ACK, chksum)
              udt_send(sndpkt)

          rdt2.1: discussion

          Sender:
            seq # added to pkt
            two seq #'s (0, 1) will suffice. Why?
            must check if received ACK/NAK is corrupted
            twice as many states: state must "remember" whether the "current" pkt has seq # 0 or 1

          Receiver:
            must check if received packet is a duplicate: state indicates whether 0 or 1 is the expected pkt seq #
            note: receiver cannot know if its last ACK/NAK was received OK at the sender

          rdt2.2: a NAK-free protocol

          same functionality as rdt2.1, using ACKs only
          instead of NAK, receiver sends ACK for last pkt received OK
            receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
          duplicate ACK at sender results in same action as NAK: retransmit current pkt

          rdt2.2: sender, receiver fragments

          sender FSM fragment
            state: Wait for call 0 from above
              rdt_send(data)
                sndpkt = make_pkt(0, data, checksum)
                udt_send(sndpkt)
                -> Wait for ACK 0
            state: Wait for ACK 0
              rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
                udt_send(sndpkt)
              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
                Λ

          receiver FSM fragment
            transition into Wait for 0 from below
              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                extract(rcvpkt, data)
                deliver_data(data)
                sndpkt = make_pkt(ACK1, chksum)
                udt_send(sndpkt)
            state: Wait for 0 from below
              rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
                udt_send(sndpkt)

          rdt3.0: channels with errors and loss

          New assumption: underlying channel can also lose packets (data or ACKs)
            checksum, seq #, ACKs, retransmissions will be of help, but not enough

          Q: how to deal with loss?
            sender waits until it is certain that data or ACK was lost, then retransmits
            yuck: drawbacks?

          Approach: sender waits a "reasonable" amount of time for ACK
            retransmits if no ACK received in this time (retransmissions are triggered only by timeouts)
            if pkt (or ACK) is just delayed (not lost):
              retransmission will be a duplicate, but use of seq #'s already handles this
              receiver must specify seq # of pkt being ACKed
            requires countdown timer

          rdt3.0 sender

          state: Wait for call 0 from above
            rdt_send(data)
              sndpkt = make_pkt(0, data, checksum)
              udt_send(sndpkt)
              start_timer
              -> Wait for ACK0
            rdt_rcv(rcvpkt)
              Λ
          state: Wait for ACK0
            rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
              Λ
            timeout
              udt_send(sndpkt)
              start_timer
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
              stop_timer
              -> Wait for call 1 from above
          state: Wait for call 1 from above
            rdt_send(data)
              sndpkt = make_pkt(1, data, checksum)
              udt_send(sndpkt)
              start_timer
              -> Wait for ACK1
            rdt_rcv(rcvpkt)
              Λ
          state: Wait for ACK1
            rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
              Λ
            timeout
              udt_send(sndpkt)
              start_timer
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
              stop_timer
              -> Wait for call 0 from above

          rdt3.0 in action

          [Two figure slides; the diagrams were not recovered in this transcript.]

          Performance of rdt3.0

          rdt3.0 works, but performance stinks
          example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet

          T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \text{ kb/pkt}}{10^{9} \text{ b/sec}} = 8 \text{ microsec}

          U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027

          U_sender: utilization, the fraction of time the sender is busy sending
          1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
          network protocol limits use of physical resources!

          rdt3.0: stop-and-wait operation

          [Figure: sender/receiver timing diagram]
            first packet bit transmitted, t = 0
            last packet bit transmitted, t = L / R
            first packet bit arrives
            last packet bit arrives, send ACK
            ACK arrives, send next packet, t = RTT + L / R
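The stop-and-wait arithmetic can be checked with a short Python sketch. The function and variable names are chosen here for illustration; the numbers are the ones from the example (8000-bit packet, 1 Gbps link, 30 ms RTT).

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return t_transmit / (rtt_s + t_transmit)


L_BITS = 8000        # 1 KB packet
R_BPS = 1e9          # 1 Gbps link
RTT_S = 0.030        # 2 x 15 ms propagation delay

u_sender = stop_and_wait_utilization(L_BITS, R_BPS, RTT_S)

# One 1000-byte packet is delivered per (RTT + L/R) seconds.
throughput_bytes_per_s = 1000 / (RTT_S + L_BITS / R_BPS)
```

With these inputs `u_sender` is about 0.00027 and the throughput is roughly 33 kB/sec, matching the slide.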

          Pipelined protocols

          Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
            range of sequence numbers must be increased
            buffering at sender and/or receiver

          Advantage: much better bandwidth utilization than stop-and-wait
          Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data
          Two generic approaches to solving this:
            • go-Back-N protocols
            • selective repeat protocols
          Note: TCP is not exactly either

          Pipelining: increased utilization

          [Figure: sender/receiver timing diagram with three packets in flight]
            first packet bit transmitted, t = 0
            last bit transmitted, t = L / R
            first packet bit arrives
            last packet bit arrives, send ACK
            last bit of 2nd packet arrives, send ACK
            last bit of 3rd packet arrives, send ACK
            ACK arrives, send next packet, t = RTT + L / R

          U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008

          Utilization increases by a factor of 3!
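The same formula generalizes from 3 packets to a window of any size. The sketch below is an illustration only: the `window` parameter and the cap at 1.0 (a sender cannot be busy more than all of the time) are additions that the slide does not state.

```python
def pipelined_utilization(window, packet_bits, rate_bps, rtt_s):
    """Sender utilization with `window` packets in flight per round trip."""
    t_transmit = packet_bits / rate_bps
    # Cap at 1.0: once the pipe is full, a larger window cannot help.
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


u_one = pipelined_utilization(1, 8000, 1e9, 0.030)     # stop-and-wait
u_three = pipelined_utilization(3, 8000, 1e9, 0.030)   # the slide's example
```

Here `u_three` is three times `u_one`, about 0.0008, as on the slide.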

          Go-Back-N

          Sender:
            k-bit seq # in pkt header
            "window" of up to N consecutive unACKed pkts allowed
            ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
              may receive duplicate ACKs (see receiver)
            only one timer, for the oldest unacknowledged pkt
            timeout(n): retransmit pkt n and all higher seq # pkts in window
            called a sliding-window protocol

          GBN: sender

          rdt_send() called: checks whether the window is full
            No: send out the packet
            Yes: return the data to the application level
          Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received; updates the window accordingly and restarts the timer
          Timeout: resends ALL packets that have been sent but not yet acknowledged
            this is the only event that triggers a resend
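The three sender events can be sketched in Python as follows. This is a simplified illustration, not the slides' implementation: the `GbnSender` class, the `sent` list (a log standing in for `udt_send`), and the boolean `timer_running` flag (standing in for a real timer) are assumptions. The check that ignores ACKs below `base` is also an addition made here so that a stale ACK cannot move the window backwards.

```python
class GbnSender:
    """Go-Back-N sender sketch with unbounded sequence numbers."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1              # oldest unacknowledged sequence number
        self.nextseqnum = 1        # next sequence number to use
        self.sndpkt = {}           # seq -> data, kept until acknowledged
        self.sent = []             # log of transmitted sequence numbers
        self.timer_running = False

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False                       # window full: refuse data
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)      # stands in for udt_send
        if self.base == self.nextseqnum:
            self.timer_running = True          # first outstanding packet
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        if acknum < self.base:
            return                             # stale ACK (assumed guard)
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)         # cumulative ACK frees these
        self.base = acknum + 1
        # stop the timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)              # resend every unACKed packet
```

With a window of 3, a fourth `rdt_send` is refused until an ACK slides the window forward, and a timeout replays everything from `base` up to `nextseqnum - 1`.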

          GBN: sender extended FSM

          initially
            base = 1
            nextseqnum = 1

          state: Wait
            rdt_send(data)
              if (nextseqnum < base+N) {
                sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
                udt_send(sndpkt[nextseqnum])
                if (base == nextseqnum)
                  start_timer
                nextseqnum++
              }
              else
                refuse_data(data)

            timeout
              start_timer
              udt_send(sndpkt[base])
              udt_send(sndpkt[base+1])
              ...
              udt_send(sndpkt[nextseqnum-1])

            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
              base = getacknum(rcvpkt)+1
              if (base == nextseqnum)
                stop_timer
              else
                start_timer

            rdt_rcv(rcvpkt) && corrupt(rcvpkt)
              Λ

          GBN: receiver extended FSM

          initially
            expectedseqnum = 1
            sndpkt = make_pkt(0, ACK, chksum)

          state: Wait
            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
              extract(rcvpkt, data)
              deliver_data(data)
              sndpkt = make_pkt(expectedseqnum, ACK, chksum)
              udt_send(sndpkt)
              expectedseqnum++

            default
              udt_send(sndpkt)

          If the expected packet is received: send ACK and deliver the packet upstairs
          If an out-of-order packet is received:
            discard (don't buffer) -> no receiver buffering!
            re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

          More on the receiver

          The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
            receiver only sends ACKs (no NAKs)
            can generate duplicate ACKs
            need only remember expectedseqnum
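A minimal Python sketch of this receiver behaviour follows. The class name, the `corrupt` flag (standing in for a checksum test), and returning the ACK number instead of sending a packet are assumptions made for illustration.

```python
class GbnReceiver:
    """Go-Back-N receiver sketch: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0        # mirrors the initial make_pkt(0, ACK, chksum)
        self.delivered = []

    def on_packet(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)    # in-order: deliver upstairs
            self.last_ack = seq
            self.expectedseqnum += 1
        # corrupt or out-of-order: discard and re-send the last ACK
        return self.last_ack
```

An out-of-order arrival produces a duplicate ACK for the highest in-order packet, and the discarded packet has to be sent again later.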

          GBN in action

          GBN is easy to code but might have performance problems.

          In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

          The Selective Repeat protocol allows the receiver to buffer data and forces retransmission only of the required packets.

          Selective Repeat

          receiver individually acknowledges all correctly received pkts
            buffers pkts, as needed, for eventual in-order delivery to upper layer
          sender only resends pkts for which an ACK was not received
            sender keeps a timer for each unACKed pkt (compare to GBN, which only had a timer for the base packet)
          sender window
            N consecutive seq #'s
            again limits seq #s of sent, unACKed pkts
            Important: window size < seq # range

          Selective repeat: sender, receiver windows

          [Figure not recovered in this transcript.]

          Selective repeat

          sender
            data from above:
              if next available seq # is in window, send pkt
            timeout(n):
              resend pkt n, restart timer
            ACK(n) in [sendbase, sendbase+N-1]:
              mark pkt n as received
              if n is the smallest unACKed pkt, advance window base to next unACKed seq #

          receiver
            pkt n in [rcvbase, rcvbase+N-1]:
              send ACK(n)
              out-of-order: buffer
              in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
            pkt n in [rcvbase-N, rcvbase-1]:
              ACK(n) (note: this is a re-ACK)
            otherwise:
              ignore
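The receiver side of selective repeat can be sketched in Python as below. This is an illustration under simplifying assumptions: sequence numbers are unbounded integers starting at 0 (no wraparound), and the `SrReceiver` class and its return convention (the ACK number, or `None` when the packet is ignored) are hypothetical.

```python
class SrReceiver:
    """Selective-repeat receiver sketch with unbounded sequence numbers."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}         # out-of-order packets awaiting delivery
        self.delivered = []

    def on_packet(self, n, data):
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # deliver every packet that is now in order, sliding the window
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n                         # ACK(n)
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n                         # already delivered: re-ACK
        return None                          # otherwise: ignore
```

Unlike the GBN receiver, an out-of-order packet is ACKed individually and buffered, so only the missing packet needs to be resent.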

          Selective repeat in action

          [Figure not recovered in this transcript.]

          Selective repeat: dilemma

          Example:
            seq #'s: 0, 1, 2, 3
            window size = 3
          receiver sees no difference in the two scenarios!
          incorrectly passes duplicate data as new in (a)

          Q: what is the relationship between seq # size and window size?


          TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

          point-to-point: one sender, one receiver
          reliable, in-order byte stream: no "message boundaries"
          pipelined: TCP congestion and flow control set window size
          send & receive buffers
          full duplex data:
            bi-directional data flow in same connection
            MSS: maximum segment size
          connection-oriented: handshaking (exchange of control msgs) initializes sender and receiver state before data exchange
          flow controlled: sender will not overwhelm receiver

          [Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

          More TCP details

          Maximum Segment Size (MSS)
            depends upon the implementation (can often be set)
            the maximum amount of application-layer data in a segment
            application data + TCP header = TCP segment

          Three-way handshake
            client sends a special TCP segment to the server requesting a connection; no payload (application data) in this segment
            server responds with a second special TCP segment (again no payload)
            client responds with a third special segment; this one can contain payload

          Even more TCP details

          A TCP connection between client and server creates, in both client and server:
            (i) buffers
            (ii) variables, and
            (iii) a socket connection to the process

          TCP exists only in the two end machines: no buffers or variables are allocated to the connection in any of the network elements between the client and the server

          TCP segment structure

          header layout (each row is 32 bits wide):
            source port # | dest port #
            sequence number
            acknowledgement number
            head len | not used | flags U A P R S F | receive window
            checksum | urg data pointer
            options (variable length)
            application data (variable length)

          field notes:
            sequence number, acknowledgement number: counting by bytes of data (not segments!)
            URG: urgent data (generally not used)
            ACK: ACK # valid
            PSH: push data now (generally not used)
            RST, SYN, FIN: connection estab (setup, teardown commands)
            receive window: # bytes rcvr willing to accept
            checksum: Internet checksum (as in UDP)

          TCP seq #'s and ACKs

          Seq #'s: byte-stream "number" of the first byte in the segment's data
          ACKs:
            seq # of next byte expected from other side
            cumulative ACK
          Q: how does the receiver handle out-of-order segments?
          A: the TCP spec doesn't say; it is up to the implementer

          simple telnet scenario (time runs downward):
            Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
            Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
            Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')

          TCP round trip time and timeout

          Q: how to set the TCP timeout value?
            longer than RTT, but RTT varies
            too short: premature timeout, unnecessary retransmissions
            too long: slow reaction to segment loss

          Q: how to estimate RTT?
            SampleRTT: measured time from segment transmission until ACK receipt
              ignore retransmissions
            SampleRTT will vary; want estimated RTT "smoother"
              average several recent measurements, not just the current SampleRTT

          TCP round trip time and timeout

          EstimatedRTT = (1 - α) · EstimatedRTT + α · SampleRTT

          exponential weighted moving average
          influence of a past sample decreases exponentially fast
          typical value: α = 0.125
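To make "decreases exponentially fast" concrete: unrolling the recurrence shows that the sample taken k updates ago carries weight α(1 - α)^k in the current estimate. The Python sketch below computes those weights and applies the update; seeding the estimate with the first sample is an assumption made here, since the slides do not say how the estimate is initialized.

```python
def ewma_weights(alpha, count):
    """Weight of the sample taken k updates ago, for k = 0 .. count-1."""
    return [alpha * (1 - alpha) ** k for k in range(count)]


def estimated_rtt(samples, alpha=0.125):
    """Apply EstimatedRTT = (1 - a) * EstimatedRTT + a * SampleRTT in order."""
    estimate = samples[0]          # assumed initialization: first sample
    for sample in samples[1:]:
        estimate = (1 - alpha) * estimate + alpha * sample
    return estimate
```

For α = 0.125 the newest sample has weight 0.125, the one before it 0.109375, and each older sample counts for 7/8 of the one after it.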

          Example RTT estimation

          [Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT.]

          TCP round trip time and timeout

          Setting the timeout
            EstimatedRTT plus "safety margin"
              large variation in EstimatedRTT -> larger safety margin
            first, estimate how much SampleRTT deviates from EstimatedRTT:

              DevRTT = (1 - β) · DevRTT + β · |SampleRTT - EstimatedRTT|
              (typically, β = 0.25)

            then set the timeout interval:

              TimeoutInterval = EstimatedRTT + 4 · DevRTT
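The two update rules and the timeout formula fit together as in the Python sketch below. Two details are assumptions rather than something the slides state: DevRTT is updated before EstimatedRTT (so the deviation is measured against the previous estimate), and the initial values are the first sample and half the first sample. The `RttEstimator` class name is hypothetical.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample      # assumed initialization
        self.dev_rtt = first_sample / 2        # assumed initialization

    def update(self, sample_rtt):
        # Assumed order: deviation first, against the old estimate.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from a 100 ms sample, a single 200 ms sample moves EstimatedRTT only to 112.5 ms but raises DevRTT to 62.5 ms, so the timeout widens to 362.5 ms.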


          TCP reliable data transfer

          TCP creates an rdt service on top of IP's unreliable service
          pipelined segments
          cumulative ACKs
          TCP uses a single retransmission timer
          retransmissions are triggered by:
            timeout events
            duplicate ACKs
          initially consider a simplified TCP sender:
            ignore duplicate ACKs
            ignore flow control, congestion control

          TCP sender events

          data rcvd from app:
            create segment with seq #
            seq # is byte-stream number of first data byte in segment
            start timer if not already running (think of the timer as being for the oldest unACKed segment)
            expiration interval: TimeOutInterval
          timeout:
            retransmit segment that caused timeout
            restart timer
          ACK rcvd:
            if it acknowledges previously unACKed segments:
              update what is known to be ACKed
              start timer if there are outstanding segments

          TCP sender (simplified)

          NextSeqNum = InitialSeqNum
          SendBase = InitialSeqNum

          loop (forever) {
            switch(event)

            event: data received from application above
              create TCP segment with sequence number NextSeqNum
              if (timer currently not running)
                start timer
              pass segment to IP
              NextSeqNum = NextSeqNum + length(data)

            event: timer timeout
              retransmit not-yet-acknowledged segment with smallest sequence number
              start timer

            event: ACK received, with ACK field value of y
              if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                  start timer
              }
          } /* end of loop forever */

          Comment:
            • SendBase-1: last cumulatively ACKed byte
          Example:
            • SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
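The simplified sender can be turned into a small runnable Python sketch. It is an illustration only: the `SimpleTcpSender` class, the `wire` list (a log standing in for "pass segment to IP"), and the boolean `timer_running` flag are assumptions, and stopping the timer when nothing is outstanding is a detail added here.

```python
class SimpleTcpSender:
    """Simplified TCP sender: one timer, cumulative ACKs, no dup-ACK logic."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = []            # (seq, data) pairs in send order
        self.timer_running = False
        self.wire = []               # sequence numbers passed to IP

    def on_app_data(self, data):
        self.unacked.append((self.next_seq_num, data))
        if not self.timer_running:
            self.timer_running = True
        self.wire.append(self.next_seq_num)
        self.next_seq_num += len(data)      # seq #s count bytes

    def on_timeout(self):
        if self.unacked:
            # retransmit only the segment with the smallest sequence number
            self.wire.append(self.unacked[0][0])
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # keep only segments not fully covered by the cumulative ACK
            self.unacked = [(s, d) for s, d in self.unacked if s + len(d) > y]
            self.timer_running = bool(self.unacked)
```

Sending 8 bytes at Seq=92 and then 20 bytes at Seq=100 leaves `next_seq_num` at 120; a single ACK=120 acknowledges both segments at once.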

          TCP: retransmission scenarios

          lost ACK scenario (time runs downward)
            Host A -> Host B: Seq=92, 8 bytes data
            Host B -> Host A: ACK=100 (lost, X)
            Host A: timeout; resends Seq=92, 8 bytes data
            Host B -> Host A: ACK=100
            Host A: SendBase = 100

          premature timeout (time runs downward)
            Host A -> Host B: Seq=92, 8 bytes data
            Host A -> Host B: Seq=100, 20 bytes data
            Host B -> Host A: ACK=100
            Host B -> Host A: ACK=120
            Host A: Seq=92 timeout fires before the ACKs arrive; resends Seq=92, 8 bytes data
            Host A: SendBase = 100, then SendBase = 120
            Host B -> Host A: ACK=120
            Host A: SendBase = 120

          TCP retransmission scenarios (more)

          cumulative ACK scenario (time runs downward)
            Host A -> Host B: Seq=92, 8 bytes data
            Host A -> Host B: Seq=100, 20 bytes data
            Host B -> Host A: ACK=100 (lost, X)
            Host B -> Host A: ACK=120 (arrives before the timeout)
            Host A: SendBase = 120

          TCP ACK generation [RFC 1122, RFC 2581]

          Event at receiver -> TCP receiver action

          Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
            -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.
          Arrival of in-order segment with expected seq #; one other segment has ACK pending
            -> Immediately send single cumulative ACK, ACKing both in-order segments.
          Arrival of out-of-order segment with higher-than-expected seq #; gap detected
            -> Immediately send duplicate ACK, indicating seq # of next expected byte.
          Arrival of segment that partially or completely fills gap
            -> Immediately send ACK, provided that segment starts at lower end of gap.

          More on sender policies

          Doubling the timeout interval
            used by most TCP implementations
            if a timeout occurs, then after retransmission the timeout interval is doubled
            intervals grow exponentially with each consecutive timeout
            when the timer is restarted because of (i) new data from above or (ii) an ACK received, the timeout interval is reset as described previously, using EstimatedRTT and DevRTT
            a limited form of congestion control

          Fast retransmit

          Time-out period is often relatively long:
            long delay before resending a lost packet
          Detect lost segments via duplicate ACKs
            sender often sends many segments back-to-back
            if a segment is lost, there will likely be many duplicate ACKs
          If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
            fast retransmit: resend the segment before the timer expires

          Fast retransmit algorithm

          event: ACK received, with ACK field value of y
            if (y > SendBase) {
              SendBase = y
              if (there are currently not-yet-acknowledged segments)
                start timer
            }
            else {    /* a duplicate ACK for already ACKed segment */
              increment count of dup ACKs received for y
              if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y    /* fast retransmit */
              }
            }
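The duplicate-ACK branch can be sketched in Python as follows. The class name and the `retransmitted` list (a log standing in for actually resending the segment) are assumptions made for illustration; clearing the counts when new data is ACKed is also a detail added here, and timer handling is left out.

```python
from collections import Counter


class FastRetransmitTracker:
    """Counts duplicate ACKs and triggers a resend on the third one."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_counts = Counter()    # ACK value y -> duplicates seen
        self.retransmitted = []        # sequence numbers resent early

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y         # new data acknowledged
            self.dup_counts.clear()    # assumed: start counting afresh
            return
        self.dup_counts[y] += 1        # duplicate ACK for already ACKed data
        if self.dup_counts[y] == 3:
            self.retransmitted.append(y)   # fast retransmit of segment y
```

The third duplicate triggers exactly one early resend; a fourth duplicate for the same value does not trigger another.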

          TCP: GBN or Selective Repeat?

          Basic TCP looks a lot like GBN
          Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
            this looks a lot like Selective Repeat
          TCP is a hybrid


          TCP flow control

          Sender should not overwhelm the receiver's capacity to receive data
          If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
          Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

          TCP flow control

          flow control: sender won't overflow the receiver's buffer by transmitting too much, too fast
          receive side of a TCP connection has a receive buffer
            app process may be slow at reading from the buffer
          speed-matching service: matching the send rate to the receiving app's drain rate


          TCP flow control: how it works

          (Suppose the TCP receiver discards out-of-order segments)
          spare room in buffer
            = RcvWindow
            = RcvBuffer - [LastByteRcvd - LastByteRead]
          Rcvr advertises spare room by including the value of RcvWindow in segments
          Sender limits unACKed data to RcvWindow
            guarantees the receive buffer doesn't overflow
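The window arithmetic is small enough to check in Python. In the sketch below, `rcv_window` follows the formula above directly; `sender_may_send` and its parameter names (`last_byte_sent`, `last_byte_acked`) are hypothetical names introduced here to express "sender limits unACKed data to RcvWindow".

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, segment_len):
    """True if sending segment_len more bytes keeps unACKed data <= window."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + segment_len <= window
```

For example, a 4096-byte buffer holding 2000 unread bytes advertises a window of 2096 bytes; a sender with 1000 bytes in flight may send another 1000 bytes but not another 1100.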

          Technical issue

          Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
          Sender does not transmit new packets until it hears RcvWindow > 0
          Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
          DEADLOCK!

          Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.

          Note on UDP

          UDP has no flow control
          UDP appends packets to the receiving socket's buffer; if the buffer is full, packets are lost


          TCP Connection Management

          Three way handshakeStep 1 client end system sends

          TCP SYN control segment to server

          specifies client_isn the initial seq No application data

          Step 2 server end system receives SYN replies with SYNACK control segment

          ACKs received SYNallocates buffersReplies with client_isn+1 in ACK field to signal synchronizationSpecifies server_isnNo application data

          Recall TCP sender receiver establish ldquoconnectionrdquobefore exchanging data segmentsinitialize TCP variables

          seq sbuffers flow control info (eg RcvWindow)

          client connection initiatorSocket clientSocket = new Socket(hostnameport number)

          server contacted by clientSocket connectionSocket = welcomeSocketaccept()

          TCP Connection Management (cont.)

          Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
            allocates buffers
            can include application data

          SYN=0 signals that the connection is established; server_isn+1 signals that the client is synchronized.

          Message sequence:
            client -> server: connection request (SYN=1, seq=client_isn)
            server -> client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
            client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
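As a rough illustration of how the sequence and acknowledgment numbers line up across the three segments, here is a small Java sketch. The class, the Segment type, and the use of -1 for an unused ACK field are all assumptions made for this example; it models only the numbering, not real sockets.

```java
// Hypothetical model of the three handshake segments; not a real TCP implementation.
final class HandshakeSketch {
    // One control segment: SYN flag plus sequence and acknowledgment numbers.
    // ack = -1 means "ACK field not used" in this sketch.
    static final class Segment {
        final int syn;
        final long seq;
        final long ack;

        Segment(int syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }
    }

    // Returns {SYN, SYNACK, ACK} for the given initial sequence numbers.
    static Segment[] threeWayHandshake(long clientIsn, long serverIsn) {
        Segment syn = new Segment(1, clientIsn, -1);                  // step 1
        Segment synAck = new Segment(1, serverIsn, syn.seq + 1);      // step 2
        Segment ack = new Segment(0, syn.seq + 1, synAck.seq + 1);    // step 3
        return new Segment[] { syn, synAck, ack };
    }
}
```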

          TCP Connection Management (cont.)

          Closing a connection:

          client closes socket: clientSocket.close();

          Step 1: client end system sends TCP FIN control segment to server.

          Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

          [Figure: client/server timeline. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait before being closed.]

          TCP Connection Management (cont.)

          Step 3: client receives FIN, replies with ACK.
            Enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost).
            Closes down after timed wait.

          Step 4: server receives ACK. Connection closed.

          Note: with small modification, can handle simultaneous FINs.

          [Figure: client/server timeline of the FIN, ACK, FIN, ACK exchange, with the closing, timed wait, and closed states marked.]

          TCP Connection Management (cont.)

          [Figure: example TCP server lifecycle]

          [Figure: example TCP client lifecycle]

          A few special cases

          We have not discussed what happens if both client and server decide to close down the connection at the same time.

          It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


          Principles of Congestion Control

          Congestion:
            informally: "too many sources sending too much data too fast for network to handle"
            different from flow control!
            manifestations:
              lost packets (buffer overflow at routers)
              long delays (queuing in router buffers)
            a top-10 problem!

          Causes/costs of congestion: scenario 1

          two senders, two receivers
          one router, infinite buffers
          no retransmission
          send rate: 0 to C/2

          large delays when congested
          maximum achievable throughput

          Causes/costs of congestion: scenario 2

          one router, finite buffers
          sender retransmission of lost packet

          (a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput)
          (a) Magic transmission: only send when there's space in buffer
          (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
          (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

          Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.

          "costs" of congestion:
            (b) and (c): more work (retrans) for given "goodput"
            (c): unneeded retransmissions, link carries multiple copies of pkt

          [Figure: three throughput plots, panels (a), (b), (c)]

          Causes/costs of congestion: scenario 3

          four senders
          multihop paths
          timeout/retransmit

          Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

          Causes/costs of congestion: scenario 3 (cont.)

          Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted!

          Approaches towards congestion control

          Two broad approaches towards congestion control:

          End-end congestion control:
            no explicit feedback from network
            congestion inferred from end-system observed loss, delay
            approach taken by TCP

          Network-assisted congestion control:
            routers provide feedback to end systems
              single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
              explicit rate sender should send at

          Case study: ATM ABR congestion control

          ABR: available bit rate
            "elastic service"
            if sender's path "underloaded": sender should use available bandwidth
            if sender's path congested: sender throttled to minimum guaranteed rate

          RM (resource management) cells:
            sent by sender, interspersed with data cells
            bits in RM cell set by switches ("network-assisted")
              NI bit: no increase in rate (mild congestion)
              CI bit: severe congestion indicator
            RM cells returned to sender by receiver, with bits intact
              small exception: see below

          Case study: ATM ABR congestion control (cont.)

          two-byte ER (explicit rate) field in RM cell
            congested switch may lower ER value in cell
            sender's send rate is thus the minimum supportable rate on path

          EFCI bit in data cells: set to 1 by congested switch
            signals congestion
            if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


          TCP Congestion Control

          end-end control (no network assistance)
          transmission rate limited by congestion window size, CongWin, over segments
          CongWin dynamically modified to reflect perceived congestion

          w segments, each with MSS bytes, sent in one RTT:

            throughput = (w * MSS) / RTT  bytes/sec

          To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

          Tools are "similar" to flow control: sender limits transmission using

            LastByteSent - LastByteAcked <= CongWin

          How does sender perceive congestion?
            loss event = timeout or 3 duplicate ACKs
            TCP sender reduces rate (CongWin) after loss event

          Three mechanisms:
            AIMD = Additive Increase, Multiplicative Decrease
            slow start = CongWin set to 1 and then grows exponentially
            conservative after timeout events

          TCP AIMD

          multiplicative decrease: cut CongWin in half after loss event

          additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

          [Figure: congestion window (axis marks at 8, 16, 24 Kbytes) vs. time for a long-lived TCP connection]

          TCP Slow Start

          When connection begins, CongWin = 1 MSS
            Example: MSS = 500 bytes and RTT = 200 msec
            initial rate = MSS/RTT = 20 kbps

          Available bandwidth may be >> MSS/RTT
            desirable to quickly ramp up to respectable rate

          When connection begins, increase rate exponentially fast until first loss event.

          TCP Slow Start (more)

          When connection begins, increase rate exponentially until first loss event:
            double CongWin every RTT
            done by incrementing CongWin for every ACK received

          Summary: initial rate is slow but ramps up exponentially fast.

          [Figure: Host A / Host B timeline. A sends one segment in the first RTT, two segments in the second, four segments in the third.]

          So far:
            Slow-Start ramps up exponentially
            Followed by AIMD sawtooth pattern

          Reality (TCP Reno):
            Introduce new variable, threshold
            threshold initially very large
            Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
            Two different types of loss events:
              3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
              Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

          The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

          Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to Slow-Start with CongWin = 1 after a loss event.

          TCP Reno's skipping of Slow-Start for a 3-dup-ACK loss event is known as fast recovery.

          Summary: TCP Congestion Control

          When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

          When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

          When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

          When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


          The Big Picture

          TCP sender congestion control

          Event: ACK receipt for previously unacked data
          State: Slow Start (SS)
          TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
          Commentary: results in a doubling of CongWin every RTT

          Event: ACK receipt for previously unacked data
          State: Congestion Avoidance (CA)
          TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
          Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

          Event: loss event detected by triple duplicate ACK
          State: SS or CA
          TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
          Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

          Event: timeout
          State: SS or CA
          TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
          Commentary: enter slow start

          Event: duplicate ACK
          State: SS or CA
          TCP sender action: increment duplicate ACK count for segment being acked
          Commentary: CongWin and Threshold not changed
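The event table can be read as a small state machine. The Java sketch below is one possible rendering under stated assumptions: the class and method names are hypothetical, window sizes are kept in bytes as doubles, and duplicate-ACK counting is left out so that each method corresponds to one row of the table.

```java
// Hypothetical sketch of the sender's congestion-control state; not a full TCP sender.
final class RenoSketch {
    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    final double mss;        // maximum segment size, in bytes
    double congWin;          // congestion window, in bytes
    double threshold;        // slow-start threshold, in bytes
    State state = State.SLOW_START;

    RenoSketch(double mss, double initialThreshold) {
        this.mss = mss;
        this.congWin = mss;              // connection begins with CongWin = 1 MSS
        this.threshold = initialThreshold;
    }

    // ACK receipt for previously unacked data.
    void onNewAck() {
        if (state == State.SLOW_START) {
            congWin += mss;              // doubles CongWin every RTT
            if (congWin > threshold) {
                state = State.CONGESTION_AVOIDANCE;
            }
        } else {
            congWin += mss * (mss / congWin);   // about 1 MSS per RTT
        }
    }

    // Loss event detected by triple duplicate ACK.
    void onTripleDuplicateAck() {
        threshold = congWin / 2;
        congWin = Math.max(threshold, mss);     // never below 1 MSS
        state = State.CONGESTION_AVOIDANCE;
    }

    // Loss event detected by timeout.
    void onTimeout() {
        threshold = congWin / 2;
        congWin = mss;
        state = State.SLOW_START;
    }
}
```

Starting from MSS = 1000 and Threshold = 4000, four new ACKs take CongWin to 5000 and move the sender into congestion avoidance; a timeout then sets Threshold to 2500 and CongWin back to 1000.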

          TCP throughput

          What's the average throughput of TCP as a function of window size and RTT?
            Ignore slow start.
            Let W be the window size when loss occurs.
            When window is W, throughput is W/RTT.
            Just after loss, window drops to W/2, throughput to W/(2 RTT).
            Average throughput: 0.75 W/RTT

          TCP Futures

          Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
          Requires window size W = 83,333 in-flight segments
          Throughput in terms of loss rate L:

            $$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

          This gives L = 2 * 10^-10. Wow!
          New versions of TCP for high-speed needed!
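The two numbers in this example can be checked with a short calculation. The sketch below assumes throughput in bits per second and MSS in bits; the class and method names are hypothetical.

```java
// Hypothetical helper for the "TCP Futures" arithmetic.
final class TcpFuturesSketch {
    // In-flight segments needed to sustain a target rate: W = throughput * RTT / MSS.
    static double windowSegments(double throughputBps, double rttSec, double mssBits) {
        return throughputBps * rttSec / mssBits;
    }

    // Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to get the loss rate L.
    static double requiredLossRate(double throughputBps, double rttSec, double mssBits) {
        double root = 1.22 * mssBits / (rttSec * throughputBps);
        return root * root;
    }
}
```

With 1500-byte (12,000-bit) segments, a 100 ms RTT, and a 10 Gbps target, the window comes out just above 83,333 segments and the loss rate close to 2 * 10^-10.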

          TCP Fairness

          Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

          [Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

          Why is TCP fair?

          Two competing sessions:
            additive increase gives slope of 1 as throughput increases
            multiplicative decrease decreases throughput proportionally

          [Figure: Connection 2 throughput vs. Connection 1 throughput, both axes from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
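The convergence argument can be tried numerically. The sketch below is a deliberately simplified model and not TCP itself: it assumes both sessions see every loss at the same time, increase by 1 unit per round, and halve whenever their combined rate exceeds the capacity. All names are hypothetical.

```java
// Hypothetical two-flow AIMD model illustrating convergence to an equal share.
final class AimdFairnessSketch {
    // Returns the final throughputs {x1, x2} after the given number of rounds.
    static double[] simulate(double x1, double x2, double capacity, int rounds) {
        for (int i = 0; i < rounds; i++) {
            if (x1 + x2 > capacity) {
                // Loss: multiplicative decrease halves both rates, and so halves their gap.
                x1 /= 2;
                x2 /= 2;
            } else {
                // No loss: additive increase leaves the gap between the rates unchanged.
                x1 += 1;
                x2 += 1;
            }
        }
        return new double[] { x1, x2 };
    }
}
```

Because each loss halves the difference between the two rates while increases leave it unchanged, the pair drifts toward the equal-share line regardless of the starting point.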

          Fairness (more)

          Fairness and UDP
            Multimedia apps often do not use TCP
              do not want rate throttled by congestion control
            Instead use UDP:
              pump audio/video at constant rate, tolerate packet loss
            Current research area: how to keep UDP from congesting the Internet

          Fairness and parallel TCP connections
            nothing prevents app from opening parallel connections between 2 hosts
            Web browsers do this
            Example: link of rate R supporting 9 connections
              new app asks for 1 TCP, gets rate R/10
              new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
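The arithmetic behind the example is a simple ratio, sketched here under the assumption that the link rate is split equally among all connections. The class and method names are hypothetical.

```java
// Hypothetical helper: share of the link an application gets with parallel connections.
final class ParallelShareSketch {
    // Fraction of rate R obtained by an app that opens `mine` connections
    // while `others` connections already share the link (equal per-connection split assumed).
    static double shareOfR(int mine, int others) {
        return (double) mine / (mine + others);
    }
}
```

With 9 existing connections, 1 new connection gets 1/10 of R, while 11 new connections get 11/20 of R, slightly more than half.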

          TCP Latency Modeling

          Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

          Ignoring congestion, delay is influenced by:
            TCP connection establishment
            data transmission delay
            slow start

          Notation, assumptions:
            Assume one link between client and server, of rate R
            S: MSS (bits)
            O: object size (bits)
            no retransmissions (no loss, no corruption)

          Window size:
            First assume: fixed congestion window, W segments
            Then dynamic window, modeling slow start

          Fixed Congestion Window (W)

          Two cases:

          1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
               Latency = 2RTT + O/R

          2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
               Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

          Fixed congestion window (1)

          First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

            latency = 2RTT + O/R

          Fixed congestion window (2)

          Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

            latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
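The two cases can be folded into one function. This is a sketch under stated assumptions: the names are hypothetical, and K is taken to be the number of windows covering the object, computed as ceil(O / (W * S)), which the slides do not spell out for the fixed-window case.

```java
// Hypothetical helper for the fixed-congestion-window latency model.
final class FixedWindowLatencySketch {
    // o = object size (bits), s = MSS (bits), r = link rate (bits/sec),
    // rtt = round-trip time (sec), w = fixed window size (segments).
    static double latency(double o, double s, double r, double rtt, int w) {
        double base = 2 * rtt + o / r;
        double stall = s / r + rtt - w * s / r;    // idle time after each window
        if (stall <= 0) {
            return base;                            // case 1: ACK returns before the window is sent
        }
        long k = (long) Math.ceil(o / (w * s));     // assumed: windows covering the object
        return base + (k - 1) * stall;              // case 2: server stalls between windows
    }
}
```

For instance, with O = 8, S = 1, R = 1, RTT = 2 (arbitrary units), a window of 2 segments stalls for 1 time unit after each of the first 3 windows, giving a latency of 15, while a window of 4 segments never stalls and gives 12.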

          TCP Latency Modeling: Slow Start (1)

          Now suppose window grows according to slow start (with no threshold and no loss events).

          Will show that the delay for one object is:

            $$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

          where P is the number of times TCP idles at server:

            $$P = \min\{Q, K-1\}$$

          - where Q is the number of times the server idles if the object were of infinite size
          - and K is the number of windows that cover the object

          TCP Latency Modeling: Slow Start (2)

          Delay components:
            2 RTT for connection establishment and request
            O/R to transmit object
            time server idles due to slow start

          Server idles: P = min{K-1, Q} times

          Example:
            O/S = 15 segments
            K = 4 windows
            Q = 2
            P = min{K-1, Q} = 2
          Server idles P = 2 times.

          [Figure: client/server timeline. Initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

          TCP Latency Modeling (3)

          $\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

          $2^{k-1}\frac{S}{R}$ = time to transmit the kth window

          $\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

          $$
          \begin{aligned}
          \text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
                       &= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
                       &= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
          \end{aligned}
          $$

          [Figure: client/server timeline. Initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

          TCP Latency Modeling (4)

          Recall K = number of windows that cover object.

          How do we calculate K?

          $$
          \begin{aligned}
          K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
            &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
            &= \min\{k : 2^k - 1 \ge O/S\} \\
            &= \min\{k : k \ge \log_2(O/S + 1)\} \\
            &= \left\lceil \log_2(O/S + 1) \right\rceil
          \end{aligned}
          $$

          Calculation of Q, the number of idles for an infinite-size object, is similar.
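The quantities K, Q, and P and the slow-start latency formula can be put together in a short Java sketch. It is an illustration under assumptions: the names are hypothetical, K is found by a loop rather than a floating-point logarithm, Q is counted directly from the idle-time expression using a strict inequality, and the object size is assumed to be positive.

```java
// Hypothetical helper for the slow-start latency model (no threshold, no loss).
final class SlowStartLatencySketch {
    // K: smallest k with 2^k - 1 >= O/S, equivalent to ceil(log2(O/S + 1)).
    static int windowsCovering(double o, double s) {
        int k = 0;
        double covered = 0;    // segments covered so far: 2^k - 1
        double window = 1;     // size of the next window: 2^k segments
        while (covered < o / s) {
            covered += window;
            window *= 2;
            k++;
        }
        return k;
    }

    // Q: number of windows k >= 1 for which S/R + RTT - 2^(k-1) S/R > 0,
    // i.e. how often the server would idle for an infinitely large object.
    static int idlesForInfiniteObject(double s, double r, double rtt) {
        int q = 0;
        double window = 1;     // 2^(k-1)
        while (s / r + rtt - window * s / r > 0) {
            q++;
            window *= 2;
        }
        return q;
    }

    // Latency = 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R, with P = min{Q, K-1}.
    static double latency(double o, double s, double r, double rtt) {
        int p = Math.min(idlesForInfiniteObject(s, r, rtt), windowsCovering(o, s) - 1);
        return 2 * rtt + o / r + p * (rtt + s / r) - (Math.pow(2, p) - 1) * s / r;
    }
}
```

Taking O/S = 15 with S/R = 1 and RTT = 2 (arbitrary units) reproduces the example values K = 4, Q = 2, P = 2.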

          HTTP Modeling

          Assume Web page consists of:
            1 base HTML page (of size O bits)
            M images (each of size O bits)

          Non-persistent HTTP:
            M+1 TCP connections in series
            Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

          Persistent HTTP:
            2 RTT to request and receive base HTML file
            1 RTT to request and receive M images
            Response time = (M+1)O/R + 3RTT + sum of idle times

          Non-persistent HTTP with X parallel connections:
            Suppose M/X is an integer
            1 TCP connection for base file
            M/X sets of parallel connections for images
            Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
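The three response-time expressions can be compared side by side. The sketch below deliberately leaves out the "sum of idle times" term, so it gives only the transmission and round-trip parts; the class and method names are hypothetical.

```java
// Hypothetical helper for the three HTTP response-time models (idle times omitted).
final class HttpResponseTimeSketch {
    // m = number of images, o = object size (bits), r = link rate (bits/sec), rtt in sec.
    static double nonPersistent(int m, double o, double r, double rtt) {
        return (m + 1) * o / r + (m + 1) * 2 * rtt;
    }

    static double persistent(int m, double o, double r, double rtt) {
        return (m + 1) * o / r + 3 * rtt;
    }

    // x = number of parallel connections; m / x is assumed to be an integer.
    static double parallelNonPersistent(int m, int x, double o, double r, double rtt) {
        return (m + 1) * o / r + (m / (double) x + 1) * 2 * rtt;
    }
}
```

With M = 10 and X = 5, the round-trip terms are 22 RTT, 3 RTT, and 6 RTT respectively, which is why the gap between the schemes widens as RTT grows.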

          HTTP Response time (in seconds)

          RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

          [Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

          For low bandwidth, connection and response time are dominated by transmission time.
          Persistent connections only give minor improvement over parallel connections.

          HTTP Response time (in seconds)

          RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

          [Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

          For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.

          Chapter 3 Summary

          principles behind transport layer services:
            multiplexing, demultiplexing
            reliable data transfer
            flow control
            congestion control
          instantiation and implementation in the Internet:
            UDP
            TCP

          Next:
            leaving the network "edge" (application, transport layers)
            into the network "core"



            Multiplexing/demultiplexing

            Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)

            Demultiplexing at rcv host: delivering received segments to correct socket

            [Figure: host 1, host 2, and host 3, each with an application/transport/network/link/physical stack; processes P1 to P4 attached to sockets]

            Multiplexing/demultiplexing

            segment: unit of data exchanged between transport layer entities
              aka TPDU: transport protocol data unit

            Demultiplexing: delivering received segments to correct app layer processes

            [Figure: receiver with processes P1 to P4 above application/transport/network layers. Each segment is a segment header (Ht) plus application-layer data (M), carried under a network header (Hn).]

            How demultiplexing works

            host receives IP datagrams
              each datagram has source IP address, destination IP address
              each datagram carries 1 transport-layer segment
              each segment has source, destination port number (recall: well-known port numbers for specific applications)
            host uses IP addresses and port numbers to direct segment to appropriate socket

            TCP/UDP segment format (32 bits wide):
              source port # | dest port #
              other header fields
              application data (message)

            Connectionless demultiplexing

            Create sockets with port numbers:
              DatagramSocket mySocket1 = new DatagramSocket(9911);
              DatagramSocket mySocket2 = new DatagramSocket(9922);

            UDP socket identified by two-tuple:
              (dest IP address, dest port number)

            When host receives UDP segment:
              checks destination port number in segment
              directs UDP segment to socket with that port number

            IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket.

            Connectionless demux (cont.)

            DatagramSocket serverSocket = new DatagramSocket(6428);

            [Figure: two clients (IP A and IP B) exchange UDP segments with server IP C. Client-to-server segments carry SP 9157, DP 6428 and SP 5775, DP 6428; the replies carry SP 6428, DP 9157 and SP 6428, DP 5775.]

            SP provides "return address"

            Connection-oriented demux

            TCP socket identified by 4-tuple:
              source IP address
              source port number
              dest IP address
              dest port number
            recv host uses all four values to direct segment to appropriate socket

            Server host may support many simultaneous TCP sockets:
              each socket identified by its own 4-tuple
            Web servers have different sockets for each connecting client
              non-persistent HTTP will have different socket for each request
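The difference between the UDP two-tuple and the TCP 4-tuple can be illustrated with two lookup tables. This is a toy model only: the class, its methods, and the use of string keys and socket names are hypothetical and do not reflect how an operating system stores sockets.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy demultiplexer contrasting UDP and TCP socket lookup.
final class DemuxSketch {
    // UDP: socket looked up by (dest IP, dest port) only.
    private final Map<String, String> udpSockets = new HashMap<>();
    // TCP: socket looked up by (source IP, source port, dest IP, dest port).
    private final Map<String, String> tcpSockets = new HashMap<>();

    void bindUdp(String destIp, int destPort, String socketName) {
        udpSockets.put(destIp + ":" + destPort, socketName);
    }

    void addTcpConnection(String srcIp, int srcPort, String destIp, int destPort, String socketName) {
        tcpSockets.put(srcIp + ":" + srcPort + ">" + destIp + ":" + destPort, socketName);
    }

    // The source fields are ignored for the lookup; they only serve as a return address.
    String demuxUdp(String srcIp, int srcPort, String destIp, int destPort) {
        return udpSockets.get(destIp + ":" + destPort);
    }

    // All four values take part in the lookup.
    String demuxTcp(String srcIp, int srcPort, String destIp, int destPort) {
        return tcpSockets.get(srcIp + ":" + srcPort + ">" + destIp + ":" + destPort);
    }
}
```

In this model, UDP segments from two different clients to the same destination port reach the same socket, while TCP segments that share a source port but come from different source IP addresses reach different sockets.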

            Connection-oriented demux (cont.)

            [Figure: clients IP A and IP B connect to server IP C on port 80. Client-to-server segments carry SP 9157, DP 80 and SP 5775, DP 80; the replies carry SP 80, DP 9157 and SP 80, DP 5775. Processes P1 to P4 are shown.]

            Connection-oriented demux: Threaded Web Server

            [Figure: clients IP A and IP B connect to a threaded server at IP C. Segments shown: (SP 9157, DP 80, S-IP A, D-IP C), (SP 9157, DP 80, S-IP B, D-IP C), and (SP 5775, DP 80, S-IP B, D-IP C).]


            3 Transport Layer 16Comp 361 Spring 2005

            UDP User Datagram Protocol [RFC 768]

            ldquono frillsrdquo ldquobare bonesrdquoInternet transport protocolldquobest effortrdquo service UDP segments may be

            lostdelivered out of order to app

            connectionlessno handshaking between UDP sender receivereach UDP segment handled independently of others

            Why is there a UDPno connection establishment (which can add delay)simple no connection state at sender receiversmall segment header (8 Bytes)no congestion control UDP can blast away as fast as desired

            UDP: more

            - often used for streaming multimedia apps
              - loss tolerant
              - rate sensitive
            - other UDP uses (why?):
              - DNS: small delay
              - SNMP: stressful conditions
            - reliable transfer over UDP: add reliability at the application layer
              - application-specific error recovery

            UDP segment format (each row is 32 bits wide):
              source port #   | dest port #
              length          | checksum
              application data (message)

            Length: in bytes, of the UDP segment, including the header.
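The segment layout above can be sketched with Python's standard `struct` module. This is only an illustration of the four 16-bit header fields. The helper names `build_udp_segment` and `parse_udp_segment` are hypothetical, and the checksum is left as a caller-supplied value.

```python
import struct

# Four unsigned 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port: int, dst_port: int, payload: bytes,
                      checksum: int = 0) -> bytes:
    """Prepend an 8-byte UDP-style header to the payload (hypothetical helper)."""
    length = UDP_HEADER.size + len(payload)  # length counts header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment: bytes):
    """Split a segment back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    payload = segment[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload


segment = build_udp_segment(9157, 53, b"hello")
print(len(segment), parse_udp_segment(segment))
```

The header is 8 bytes regardless of payload size, which is the "small segment header" point made on the earlier slide.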

            UDP checksum

            Goal: detect "errors" (e.g., flipped bits) in the transmitted segment

            Sender:
            - treat segment contents as a sequence of 16-bit integers
            - checksum: addition (1's complement sum) of segment contents
            - sender puts the checksum value into the UDP checksum field

            Receiver:
            - compute the checksum of the received segment
            - check whether the computed checksum equals the checksum field value:
              - NO: error detected
              - YES: no error detected. But maybe errors nonetheless? More later ...
            - the receiver may choose to discard the segment or send a warning to the app in case of error
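A minimal sketch of the sender and receiver steps above, assuming the RFC 1071 convention in which the stored checksum is the one's complement of the 16-bit one's complement sum. The pseudo-header that real UDP also covers is omitted here, and the function names are illustrative.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian words, wrapping carries around."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """Sender side: complement of the one's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def looks_intact(data: bytes, checksum_field: int) -> bool:
    """Receiver side: recompute and compare with the checksum field."""
    return internet_checksum(data) == checksum_field


payload = b"\x00\x01\xf2\x03\xf4\xf5\xf6\xf7"
field = internet_checksum(payload)
corrupted = b"\x00\x01\xf2\x03\xf4\xf5\xf6\xf6"  # last bit flipped
print(hex(field), looks_intact(payload, field), looks_intact(corrupted, field))
```

A matching checksum only means no error was detected: two flips that cancel in the sum would pass unnoticed.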


            Principles of Reliable Data Transfer

            - important in application, transport, and link layers
            - on the top-10 list of important networking topics
            - characteristics of the unreliable channel will determine the complexity of the reliable data transfer protocol (rdt)

            Reliable data transfer: getting started

            Send side:
            - rdt_send(): called from above (e.g., by the app); passed data to deliver to the receiver's upper layer
            - udt_send(): called by rdt to transfer a packet over the unreliable channel to the receiver

            Receive side:
            - rdt_rcv(): called when a packet arrives on the rcv-side of the channel
            - deliver_data(): called by rdt to deliver data to the upper layer

            Reliable data transfer: getting started

            We'll:
            - incrementally develop the sender and receiver sides of the reliable data transfer protocol (rdt)
            - consider only unidirectional data transfer, but control info will flow in both directions
            - use finite state machines (FSMs) to specify sender and receiver

            FSM notation: each arrow from state 1 to state 2 is labeled with the event causing the state transition and the actions taken on the state transition (event / actions). When in a given state, the next state is uniquely determined by the next event.

            Incremental Improvements

            - rdt1.0: assumes every packet sent arrives and no errors are introduced in transmission
            - rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet; introduces the concept of ACK and NAK
            - rdt2.1: deals with corrupted ACKs/NAKs
            - rdt2.2: like rdt2.1 but does not need NAKs
            - rdt3.0: allows packets to be lost

            rdt1.0: reliable transfer over a reliable channel

            - underlying channel is perfectly reliable: no bit errors, no loss of packets
            - separate FSMs for sender and receiver: the sender sends data into the underlying channel; the receiver reads data from the underlying channel

            Sender (single state "Wait for call from above"):
              rdt_send(data)
                -> packet = make_pkt(data); udt_send(packet)

            Receiver (single state "Wait for call from below"):
              rdt_rcv(packet)
                -> extract(packet, data); deliver_data(data)

            rdt2.0: channel with bit errors

            - underlying channel may flip bits in a packet; recall the UDP checksum to detect bit errors
            - the question: how to recover from errors?
              - acknowledgements (ACKs): receiver explicitly tells sender that pkt was received OK
              - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
              - sender retransmits pkt on receipt of NAK
              - human scenarios using ACKs, NAKs?
            - new mechanisms in rdt2.0 (beyond rdt1.0):
              - error detection
              - receiver feedback: control msgs (ACK, NAK) rcvr -> sender

            rdt2.0: FSM specification

            Sender
              State "Wait for call from above":
                rdt_send(data)
                  -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK"
              State "Wait for ACK or NAK":
                rdt_rcv(rcvpkt) && isNAK(rcvpkt)
                  -> udt_send(sndpkt); stay
                rdt_rcv(rcvpkt) && isACK(rcvpkt)
                  -> Λ (no action); go to "Wait for call from above"

            Receiver
              State "Wait for call from below":
                rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                  -> udt_send(NAK)
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
                  -> extract(rcvpkt, data); deliver_data(data); udt_send(ACK)

            rdt2.0: operation with no errors

            [Figure: the rdt2.0 sender and receiver FSMs, repeated from the specification slide.]

            rdt2.0: error scenario

            [Figure: the rdt2.0 sender and receiver FSMs, repeated from the specification slide.]

            rdt2.0 has a fatal flaw!

            What happens if the ACK/NAK is corrupted?
            - sender doesn't know what happened at the receiver!
            - can't just retransmit: possible duplicate. But the receiver is waiting!

            What to do?
            - sender ACKs/NAKs the receiver's ACK/NAK? What if the sender's ACK/NAK is corrupted?
            - retransmit, but this might cause retransmission of a correctly received pkt; the receiver won't know about the duplication!

            Handling duplicates:
            - sender adds a sequence number (0, 1) to each pkt
            - sender retransmits the current pkt if the ACK/NAK is garbled
            - receiver discards (doesn't deliver up) a duplicate pkt; a duplicate packet is one with the same sequence number as the previous packet

            Stop and wait: the sender sends one packet, then waits for the receiver's response.

            Sender: whenever the sender receives a control message, it sends a packet to the receiver.
            - A valid ACK: sends the next packet (if one exists) with a new sequence number
            - A NAK or corrupt response: resends the old packet

            Receiver: sends an ACK/NAK to the sender.
            - If the received packet is corrupt: send NAK
            - If the received packet is valid and has a different sequence number from the previous packet: send ACK and deliver the new data up
            - If the received packet is valid and has the same sequence number as the previous packet, i.e., it is a retransmitted duplicate: send ACK

            Note: ACKs/NAKs do not contain a sequence number.
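The receiver rules above can be sketched as a small Python class. This is an illustrative model only: the class name `Rdt21Receiver` and the `corrupt` flag (standing in for a checksum test) are assumptions, not part of the slides.

```python
class Rdt21Receiver:
    """Alternating-bit receiver: deliver new data once, re-ACK duplicates."""

    def __init__(self):
        self.expected_seq = 0   # sequence number of the next new packet
        self.delivered = []     # data handed to the upper layer

    def on_packet(self, seq: int, data: str, corrupt: bool) -> str:
        if corrupt:
            return "NAK"                      # ask for a retransmission
        if seq == self.expected_seq:
            self.delivered.append(data)       # new data: deliver it up
            self.expected_seq ^= 1            # flip between 0 and 1
        # A valid packet with the old sequence number is a duplicate:
        # it is acknowledged again but not delivered a second time.
        return "ACK"


rcv = Rdt21Receiver()
replies = [
    rcv.on_packet(0, "a", corrupt=False),  # new packet
    rcv.on_packet(0, "a", corrupt=False),  # duplicate (sender saw a garbled ACK)
    rcv.on_packet(1, "b", corrupt=True),   # damaged in transit
    rcv.on_packet(1, "b", corrupt=False),  # retransmission arrives intact
]
print(replies, rcv.delivered)
```

One bit of sequence number is enough here because stop-and-wait never has more than one unacknowledged packet outstanding.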

            rdt2.1: sender, handles garbled ACK/NAKs

              State "Wait for call 0 from above":
                rdt_send(data)
                  -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 0"
              State "Wait for ACK or NAK 0":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
                  -> udt_send(sndpkt); stay
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
                  -> Λ; go to "Wait for call 1 from above"
              State "Wait for call 1 from above":
                rdt_send(data)
                  -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 1"
              State "Wait for ACK or NAK 1":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
                  -> udt_send(sndpkt); stay
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
                  -> Λ; go to "Wait for call 0 from above"

            rdt2.1: receiver, handles garbled ACK/NAKs

              State "Wait for 0 from below":
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
                  -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 1 from below"
                rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                  -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); stay
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                  -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); stay
              State "Wait for 1 from below":
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                  -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 0 from below"
                rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                  -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); stay
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
                  -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); stay

            rdt2.1: discussion

            Sender:
            - seq # added to pkt
            - two seq #'s (0, 1) will suffice. Why?
            - must check if the received ACK/NAK is corrupted
            - twice as many states: the state must "remember" whether the "current" pkt has seq # 0 or 1

            Receiver:
            - must check if the received packet is a duplicate; the state indicates whether 0 or 1 is the expected pkt seq #
            - note: the receiver cannot know if its last ACK/NAK was received OK at the sender

            rdt2.2: a NAK-free protocol

            - same functionality as rdt2.1, using ACKs only
            - instead of a NAK, the receiver sends an ACK for the last pkt received OK
            - the receiver must explicitly include the seq # of the pkt being ACKed (in rdt2.1, seq #s are included in data packets but not in ACKs/NAKs)
            - a duplicate ACK at the sender results in the same action as a NAK: retransmit the current pkt

            rdt2.2: sender, receiver fragments

            Sender FSM fragment
              State "Wait for call 0 from above":
                rdt_send(data)
                  -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK 0"
              State "Wait for ACK 0":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
                  -> udt_send(sndpkt); stay
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
                  -> Λ; leave "Wait for ACK 0"

            Receiver FSM fragment
              Transition into state "Wait for 0 from below":
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                  -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
              State "Wait for 0 from below":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
                  -> udt_send(sndpkt); stay

            rdt3.0: channels with errors and loss

            New assumption: the underlying channel can also lose packets (data or ACKs)
            - checksum, seq #, ACKs, retransmissions will be of help, but not enough

            Q: how to deal with loss?
            - sender waits until it is certain that data or ACK was lost, then retransmits
            - yuck: drawbacks?

            Approach: the sender waits a "reasonable" amount of time for the ACK
            - retransmits if no ACK is received in this time (retransmissions are triggered only by timeouts)
            - if the pkt (or ACK) is just delayed (not lost): the retransmission will be a duplicate, but use of seq #'s already handles this
            - the receiver must specify the seq # of the pkt being ACKed
            - requires a countdown timer

            rdt3.0 sender

              State "Wait for call 0 from above":
                rdt_send(data)
                  -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK0"
                rdt_rcv(rcvpkt)
                  -> Λ; stay
              State "Wait for ACK0":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
                  -> Λ; stay
                timeout
                  -> udt_send(sndpkt); start_timer; stay
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
                  -> stop_timer; go to "Wait for call 1 from above"
              State "Wait for call 1 from above":
                rdt_send(data)
                  -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK1"
                rdt_rcv(rcvpkt)
                  -> Λ; stay
              State "Wait for ACK1":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
                  -> Λ; stay
                timeout
                  -> udt_send(sndpkt); start_timer; stay
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
                  -> stop_timer; go to "Wait for call 0 from above"
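To see the timeout-driven retransmissions at work, here is a deterministic toy simulation of the rdt3.0 exchange. It is a sketch under simplifying assumptions: the channel drops exactly the transmissions listed in `drop_data` and `drop_ack` (hypothetical parameters), nothing is corrupted or delayed, and a timeout is modeled as simply looping back to resend.

```python
def stop_and_wait(messages, drop_data=(), drop_ack=()):
    """Simulate rdt3.0 over a channel that loses selected transmissions.

    drop_data / drop_ack hold 0-based indices of the data packets / ACKs
    that the channel loses. Returns (delivered data, data transmissions).
    """
    delivered = []
    expected = 0      # receiver: sequence number of the next new packet
    seq = 0           # sender: sequence number of the current packet
    data_tx = 0       # data packets put on the channel so far
    ack_tx = 0        # ACKs put on the channel so far
    for msg in messages:
        while True:
            # Sender: udt_send(sndpkt); start_timer
            data_lost = data_tx in drop_data
            data_tx += 1
            if data_lost:
                continue                  # timeout: retransmit
            # Receiver: deliver only new data, but ACK every valid packet.
            if seq == expected:
                delivered.append(msg)
                expected ^= 1
            ack_lost = ack_tx in drop_ack
            ack_tx += 1
            if ack_lost:
                continue                  # timeout: retransmit (a duplicate)
            break                         # ACK for seq arrived: stop_timer
        seq ^= 1
    return delivered, data_tx


print(stop_and_wait(["a", "b", "c"]))
print(stop_and_wait(["a", "b", "c"], drop_data={1}, drop_ack={0}))
```

In the second run the first ACK and the first retransmission are both lost, so packet "a" is sent three times yet delivered only once.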

            rdt3.0 in action

            [Two slides of sender/receiver timeline figures; the images are not preserved in this transcript.]

            Performance of rdt3.0

            rdt3.0 works, but performance stinks.
            Example: 1 Gbps link, 15 ms end-to-end propagation delay (so RTT = 30 ms), 1 KB packet:

            $$T_{transmit} = \frac{L}{R} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \mu\text{s}$$

            where L is the packet length in bits and R is the transmission rate in bps.

            $$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

            - U_sender: utilization, the fraction of time the sender is busy sending
            - 1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
            - the network protocol limits use of the physical resources!

            rdt3.0: stop-and-wait operation

            [Figure: sender/receiver timeline. First packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, receiver sends ACK; ACK arrives and the sender sends the next packet at t = RTT + L/R.]

            Pipelined protocols

            Pipelining: the sender allows multiple "in-flight", yet-to-be-acknowledged pkts
            - range of sequence numbers must be increased
            - buffering at sender and/or receiver

            Advantage: much better bandwidth utilization than stop-and-wait.

            Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data.

            Two generic approaches to solving this:
            - Go-Back-N protocols
            - selective repeat protocols

            Note: TCP is not exactly either.

            Pipelining: increased utilization

            [Figure: sender/receiver timeline with three packets in flight. First packet bit transmitted at t = 0; last bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L/R.]

            $$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

            Utilization increases by a factor of 3.
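Both utilization figures (stop-and-wait and the three-packet pipeline) can be reproduced with a few lines of Python. The function name and the `window` parameter are illustrative. The formula is the one on the slides, capped at 1 because a sender cannot be busy more than all of the time.

```python
def sender_utilization(packet_bits: float, rate_bps: float,
                       rtt_s: float, window: int = 1) -> float:
    """Fraction of time the sender is busy with `window` packets per RTT."""
    t_transmit = packet_bits / rate_bps              # L / R, in seconds
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


# 1 KB packet (8000 bits), 1 Gbps link, 30 ms round-trip time.
u_stop_and_wait = sender_utilization(8000, 1e9, 0.030)
u_pipelined = sender_utilization(8000, 1e9, 0.030, window=3)
print(f"{u_stop_and_wait:.5f} {u_pipelined:.5f}")
```

With these numbers the window would have to hold thousands of packets before the link is kept busy, which is why the sequence-number range and buffering must grow with pipelining.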

            Go-Back-N

            Sender:
            - k-bit seq # in pkt header
            - "window" of up to N consecutive unACKed pkts allowed
            - ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK"); may receive duplicate ACKs (see receiver)
            - only one timer, for the oldest unacknowledged pkt
            - timeout(n): retransmit pkt n and all higher seq # pkts in the window
            - called a sliding-window protocol

            GBN Sender

            - rdt_send() called: checks whether the window is full. No: send out the packet. Yes: return the data to the application level.
            - Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates the window accordingly and restarts the timer.
            - Timeout: resends ALL packets that have been sent but not yet acknowledged. This is the only event that triggers a resend.

            GBN: sender extended FSM

            Initially: base = 1; nextseqnum = 1. Single state: "Wait".

            rdt_send(data):
                if (nextseqnum < base+N) {
                    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
                    udt_send(sndpkt[nextseqnum])
                    if (base == nextseqnum)
                        start_timer
                    nextseqnum++
                }
                else
                    refuse_data(data)

            timeout:
                start_timer
                udt_send(sndpkt[base])
                udt_send(sndpkt[base+1])
                ...
                udt_send(sndpkt[nextseqnum-1])

            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
                base = getacknum(rcvpkt)+1
                if (base == nextseqnum)
                    stop_timer
                else
                    start_timer

            rdt_rcv(rcvpkt) && corrupt(rcvpkt):
                Λ
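The extended FSM above translates almost line for line into Python. This is a sketch, not a network implementation: `udt_send` is replaced by appending to a log, the timer is a boolean, and, as a stated deviation from the FSM, ACKs older than `base` are ignored rather than restarting the timer.

```python
class GbnSender:
    """Go-Back-N sender state; variable names follow the FSM."""

    def __init__(self, window_size: int):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq -> data awaiting acknowledgement
        self.timer_running = False
        self.sent = []              # sequence numbers handed to the channel

    def rdt_send(self, data) -> bool:
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum: int) -> None:
        if acknum < self.base:
            return                  # assumption: stale or duplicate ACK ignored
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)   # cumulative ACK frees the buffer
        self.base = acknum + 1
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self) -> None:
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)   # resend every unacknowledged packet


s = GbnSender(window_size=4)
accepted = [s.rdt_send(x) for x in "abcde"]   # fifth call hits the window limit
s.on_ack(2)                                   # packets 1 and 2 acknowledged
s.on_timeout()                                # go back: resend 3 and 4
print(accepted, s.sent, s.base, s.nextseqnum)
```

Note how a single timeout resends the whole outstanding window, which is the cost discussed on the later "GBN in action" slide.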

            GBN: receiver extended FSM

            Initially: expectedseqnum = 1; sndpkt = make_pkt(0, ACK, chksum). Single state: "Wait".

            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
                extract(rcvpkt, data)
                deliver_data(data)
                sndpkt = make_pkt(expectedseqnum, ACK, chksum)
                udt_send(sndpkt)
                expectedseqnum++

            default:
                udt_send(sndpkt)

            - if the expected packet is received: send ACK and deliver the packet upstairs
            - if an out-of-order packet is received: discard it (don't buffer) -> no receiver buffering! Re-ACK the pkt with the highest in-order seq #; this may generate duplicate ACKs

            More on receiver

            - the receiver always sends an ACK for the correctly received packet with the highest in-order seq #
            - the receiver only sends ACKs (no NAKs)
            - can generate duplicate ACKs
            - need only remember expectedseqnum

            GBN in action

            GBN is easy to code but might have performance problems. In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of a huge amount of data.

            The Selective Repeat protocol allows the receiver to buffer data and forces retransmission only of the required packets.

            Selective Repeat

            - the receiver individually acknowledges all correctly received pkts; it buffers pkts as needed for eventual in-order delivery to the upper layer
            - the sender only resends pkts for which an ACK was not received
            - the sender keeps a timer for each unACKed pkt (compare to GBN, which only had a timer for the base packet)
            - sender window: N consecutive seq #'s; again limits the seq #s of sent, unACKed pkts
            - Important: window size < seq # range

            Selective repeat: sender, receiver windows

            [Figure: sender and receiver views of the sequence-number space; the image is not preserved in this transcript.]

            Selective repeat

            Sender
            - data from above: if the next available seq # is in the window, send pkt
            - timeout(n): resend pkt n, restart timer
            - ACK(n) in [sendbase, sendbase+N]: mark pkt n as received; if n is the smallest unACKed pkt, advance the window base to the next unACKed seq #

            Receiver
            - pkt n in [rcvbase, rcvbase+N-1]: send ACK(n); out-of-order: buffer; in-order: deliver (also deliver buffered, in-order pkts), advance the window to the next not-yet-received pkt
            - pkt n in [rcvbase-N, rcvbase-1]: ACK(n) (note: this is a re-ACK)
            - otherwise: ignore
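The receiver's three cases can be sketched as follows. For clarity this toy version assumes sequence numbers that never wrap around. The class name `SrReceiver` and the convention of returning the ACK number (or `None` for "ignore") are illustrative choices.

```python
class SrReceiver:
    """Selective Repeat receiver with a window of N sequence numbers."""

    def __init__(self, window_size: int):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}        # out-of-order packets waiting for a gap to fill
        self.delivered = []     # data passed up, always in order

    def on_packet(self, n: int, data):
        """Return the ACK number to send, or None to ignore the packet."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, if any.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n            # already delivered: re-ACK so the sender moves on
        return None


r = SrReceiver(window_size=4)
acks = [r.on_packet(0, "a"), r.on_packet(2, "c"), r.on_packet(3, "d"),
        r.on_packet(1, "b"), r.on_packet(0, "a"), r.on_packet(9, "z")]
print(acks, r.delivered, r.rcvbase)
```

Packets 2 and 3 are buffered until packet 1 arrives, at which point all three are delivered together and the window slides to 4.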

            Selective repeat in action

            [Figure not preserved in this transcript.]

            Selective repeat: dilemma

            Example:
            - seq #'s: 0, 1, 2, 3
            - window size = 3
            - the receiver sees no difference between the two scenarios!
            - it incorrectly passes duplicate data as new in (a)

            Q: what is the relationship between seq # size and window size?
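One way to explore the question is to compare the sequence numbers of one window with those of the next window once numbers wrap around. The sketch below relies on the standard result for selective repeat, namely that the window must be no larger than half the sequence-number space. The function names are hypothetical.

```python
def ambiguous_numbers(window_size: int, seq_space: int):
    """Seq #s that could mean either an old retransmission or new data."""
    old_window = {i % seq_space for i in range(window_size)}
    next_window = {i % seq_space for i in range(window_size, 2 * window_size)}
    return sorted(old_window & next_window)


def sr_window_is_safe(window_size: int, seq_space: int) -> bool:
    """Selective repeat needs window_size <= seq_space / 2."""
    return 2 * window_size <= seq_space


print(ambiguous_numbers(3, 4), sr_window_is_safe(3, 4))
print(ambiguous_numbers(2, 4), sr_window_is_safe(2, 4))
```

With four sequence numbers and a window of 3, numbers 0 and 1 appear in both windows, which is exactly the confusion in the slide's example; a window of 2 leaves no overlap.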


            TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

            - point-to-point: one sender, one receiver
            - reliable, in-order byte stream: no "message boundaries"
            - pipelined: TCP congestion and flow control set the window size
            - send and receive buffers
            - full duplex data: bi-directional data flow in the same connection; MSS: maximum segment size
            - connection-oriented: handshaking (exchange of control msgs) initializes sender and receiver state before data exchange
            - flow controlled: the sender will not overwhelm the receiver

            [Figure: the application writes data through the socket door into the TCP send buffer; segments travel to the TCP receive buffer, from which the receiving application reads data through its socket door.]

            More TCP Details

            Maximum Segment Size (MSS)
            - depends upon the implementation (can often be set)
            - the maximum amount of application-layer data in a segment
            - application data + TCP header = TCP segment

            Three-way handshake
            - the client sends a special TCP segment to the server requesting a connection; no payload (application data) in this segment
            - the server responds with a second special TCP segment (again no payload)
            - the client responds with a third special segment; this one can contain payload

            Even More TCP Details

            A TCP connection between client and server creates, in both client and server,
              (i) buffers,
              (ii) variables, and
              (iii) a socket connection to the process.

            TCP exists only in the two end machines: no buffers or variables are allocated to the connection in any of the network elements between the host and server.

            TCP segment structure

            Header fields, in 32-bit rows:
              source port #                        | dest port #
              sequence number
              acknowledgement number
              head len | not used | U A P R S F    | receive window
              checksum                             | urg data pointer
              options (variable length)
              application data (variable length)

            - sequence and acknowledgement numbers: counting by bytes of data (not segments!)
            - URG: urgent data (generally not used)
            - ACK: ACK # valid
            - PSH: push data now (generally not used)
            - RST, SYN, FIN: connection establishment (setup, teardown commands)
            - receive window: # bytes the rcvr is willing to accept
            - checksum: Internet checksum (as in UDP)

            TCP seq. #'s and ACKs

            Seq. #'s:
            - byte-stream "number" of the first byte in the segment's data

            ACKs:
            - seq # of the next byte expected from the other side
            - cumulative ACK

            Q: how does the receiver handle out-of-order segments?
            A: the TCP spec doesn't say; it is up to the implementer

            Simple telnet scenario (in time order):
              1. The user at Host A types 'C'. A -> B: Seq=42, ACK=79, data = 'C'
              2. Host B ACKs receipt of 'C' and echoes back 'C'. B -> A: Seq=79, ACK=43, data = 'C'
              3. Host A ACKs receipt of the echoed 'C'. A -> B: Seq=43, ACK=80

            TCP Round Trip Time and Timeout

            Q: how to set the TCP timeout value?
            - longer than RTT, but RTT varies
            - too short: premature timeout, unnecessary retransmissions
            - too long: slow reaction to segment loss

            Q: how to estimate RTT?
            - SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions
            - SampleRTT will vary; we want the estimated RTT "smoother": average several recent measurements, not just the current SampleRTT

            $$EstimatedRTT = (1 - \alpha) \cdot EstimatedRTT + \alpha \cdot SampleRTT$$

            - exponential weighted moving average
            - the influence of a past sample decreases exponentially fast
            - typical value: α = 0.125

            Example RTT estimation

            [Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT and EstimatedRTT.]

            TCP Round Trip Time and Timeout

            Setting the timeout:
            - EstimatedRTT plus a "safety margin"; large variation in EstimatedRTT -> larger safety margin
            - first, estimate how much SampleRTT deviates from EstimatedRTT:

            $$DevRTT = (1 - \beta) \cdot DevRTT + \beta \cdot |SampleRTT - EstimatedRTT|$$

            (typically, β = 0.25)

            Then set the timeout interval:

            $$TimeoutInterval = EstimatedRTT + 4 \cdot DevRTT$$
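The three formulas (EstimatedRTT, DevRTT, TimeoutInterval) fit in a small estimator class. This is a sketch with the typical α = 0.125 and β = 0.25. One assumption to flag: the deviation is updated before the estimate, using the old EstimatedRTT, which is the order RFC 6298 specifies; the slides do not state an order. The class and attribute names are illustrative.

```python
class RttEstimator:
    """Smoothed RTT, RTT deviation, and the resulting timeout interval."""

    ALPHA = 0.125   # weight of the newest sample in EstimatedRTT
    BETA = 0.25     # weight of the newest deviation in DevRTT

    def __init__(self, estimated_rtt: float, dev_rtt: float = 0.0):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt

    @property
    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt

    def update(self, sample_rtt: float) -> float:
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation first, against the previous estimate.
        self.dev_rtt = ((1 - self.BETA) * self.dev_rtt
                        + self.BETA * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.ALPHA) * self.estimated_rtt
                              + self.ALPHA * sample_rtt)
        return self.timeout_interval


est = RttEstimator(estimated_rtt=100.0, dev_rtt=10.0)   # milliseconds
timeout = est.update(140.0)
print(est.estimated_rtt, est.dev_rtt, timeout)
```

A single sample 40 ms above the estimate moves EstimatedRTT by only 5 ms but widens the safety margin considerably, which is the intended behavior when RTT becomes variable.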


            TCP reliable data transfer

            - TCP creates an rdt service on top of IP's unreliable service
            - pipelined segments
            - cumulative ACKs
            - TCP uses a single retransmission timer
            - retransmissions are triggered by timeout events and by duplicate ACKs
            - initially consider a simplified TCP sender: ignore duplicate ACKs; ignore flow control and congestion control

            TCP sender events

            Data rcvd from app:
            - create a segment with a seq #; the seq # is the byte-stream number of the first data byte in the segment
            - start the timer if it is not already running (think of the timer as being for the oldest unACKed segment)
            - expiration interval: TimeOutInterval

            Timeout:
            - retransmit the segment that caused the timeout
            - restart the timer

            ACK rcvd:
            - if it acknowledges previously unACKed segments: update what is known to be ACKed, and start the timer if there are outstanding segments

            TCP sender (simplified)

            NextSeqNum = InitialSeqNum
            SendBase = InitialSeqNum

            loop (forever) {
                switch(event)

                event: data received from application above
                    create TCP segment with sequence number NextSeqNum
                    if (timer currently not running)
                        start timer
                    pass segment to IP
                    NextSeqNum = NextSeqNum + length(data)

                event: timer timeout
                    retransmit not-yet-acknowledged segment with smallest sequence number
                    start timer

                event: ACK received, with ACK field value of y
                    if (y > SendBase) {
                        SendBase = y
                        if (there are currently not-yet-acknowledged segments)
                            start timer
                    }
            } /* end of loop forever */

            Comment:
            - SendBase-1: last cumulatively ACKed byte
            Example:
            - SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
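The simplified sender translates into a small event-driven Python class. This is a sketch of the three events only: the timer is a boolean, "pass segment to IP" is modeled by appending to a list called `wire`, and the class and attribute names are illustrative.

```python
class SimpleTcpSender:
    """Simplified TCP sender: no duplicate-ACK handling, no flow/congestion control."""

    def __init__(self, initial_seq: int):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # first-byte seq # -> segment data
        self.timer_running = False
        self.wire = []              # (seq, data) pairs passed to IP

    def on_app_data(self, data: bytes) -> None:
        seq = self.next_seq
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((seq, data))
        self.next_seq += len(data)          # seq #s count bytes, not segments

    def on_timeout(self) -> None:
        if self.unacked:
            seq = min(self.unacked)         # oldest not-yet-acknowledged segment
            self.wire.append((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, y: int) -> None:
        if y > self.send_base:              # cumulative ACK covers new data
            self.send_base = y
            fully_acked = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in fully_acked:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


snd = SimpleTcpSender(initial_seq=92)
snd.on_app_data(b"x" * 8)       # Seq=92, 8 bytes
snd.on_app_data(b"y" * 20)      # Seq=100, 20 bytes
snd.on_timeout()                # timer fires: only Seq=92 is resent
snd.on_ack(100)
snd.on_ack(120)
print([seq for seq, _ in snd.wire], snd.send_base, snd.timer_running)
```

Unlike Go-Back-N, a timeout here resends only the oldest outstanding segment.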

            TCP: retransmission scenarios

            Lost ACK scenario:
              1. Host A sends Seq=92, 8 bytes data
              2. Host B replies ACK=100, which is lost (X)
              3. A's timer for Seq=92 expires; A resends Seq=92, 8 bytes data
              4. B replies ACK=100; SendBase = 100

            Premature timeout:
              1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
              2. Host B replies ACK=100 and ACK=120, but A's timer for Seq=92 expires before they arrive
              3. A resends Seq=92, 8 bytes data
              4. The ACKs arrive: SendBase = 100, then SendBase = 120
              5. B answers the duplicate with ACK=120; SendBase stays at 120

            TCP retransmission scenarios (more)

            Cumulative ACK scenario:
              1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
              2. Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout
              3. SendBase = 120; nothing is retransmitted

            TCP ACK generation [RFC 1122, RFC 2581]

              1. Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed.
                 TCP receiver action: delayed ACK. Wait up to 500 ms for the next segment. If no next segment, send ACK.

              2. Event at receiver: arrival of in-order segment with expected seq #; one other segment has an ACK pending.
                 TCP receiver action: immediately send a single cumulative ACK, ACKing both in-order segments.

              3. Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected.
                 TCP receiver action: immediately send a duplicate ACK, indicating the seq # of the next expected byte.

              4. Event at receiver: arrival of a segment that partially or completely fills the gap.
                 TCP receiver action: immediately send an ACK, provided that the segment starts at the lower end of the gap.

            More on Sender Policies

            Doubling the timeout interval:
            - used by most TCP implementations
            - if a timeout occurs, then after retransmission the timeout interval is doubled
            - intervals grow exponentially with each consecutive timeout
            - when the timer is restarted because of (i) new data from above or (ii) an ACK received, the timeout interval is reset as described previously, using EstimatedRTT and DevRTT
            - a limited form of congestion control

            Fast Retransmit

            - the time-out period is often relatively long: long delay before resending a lost packet
            - detect lost segments via duplicate ACKs:
              - the sender often sends many segments back-to-back
              - if a segment is lost, there will likely be many duplicate ACKs
            - if the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
              - fast retransmit: resend the segment before the timer expires

            Fast retransmit algorithm

            event: ACK received, with ACK field value of y
                if (y > SendBase) {
                    SendBase = y
                    if (there are currently not-yet-acknowledged segments)
                        start timer
                }
                else {    /* a duplicate ACK for already ACKed segment */
                    increment count of dup ACKs received for y
                    if (count of dup ACKs received for y == 3) {
                        resend segment with sequence number y    /* fast retransmit */
                    }
                }
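A minimal Java sketch of the ACK-handling event above. The class and method names are hypothetical, timer handling is left out, and ACKs older than SendBase are simply ignored, which is a simplification made for this example.

```java
/** Sketch of the ACK-handling logic for fast retransmit (class and method names are hypothetical). */
final class FastRetransmitSender {
    private long sendBase;   // oldest unacknowledged byte
    private int dupAckCount; // duplicate ACKs seen for sendBase

    FastRetransmitSender(long initialSendBase) {
        this.sendBase = initialSendBase;
    }

    /**
     * Processes an ACK whose ACK field is y.
     * Returns true when this ACK is the third duplicate, i.e. when the
     * segment starting at y should be resent before the timer expires.
     */
    boolean onAck(long y) {
        if (y > sendBase) {      // new data acknowledged
            sendBase = y;
            dupAckCount = 0;
            return false;        // a real sender would restart its timer here
        }
        if (y < sendBase) {
            return false;        // stale ACK, ignored in this sketch
        }
        dupAckCount++;           // duplicate ACK for already ACKed data
        return dupAckCount == 3; // fast retransmit on the third duplicate
    }

    long sendBase() {
        return sendBase;
    }

    public static void main(String[] args) {
        FastRetransmitSender sender = new FastRetransmitSender(100);
        long[] acks = {200, 200, 200, 200};
        for (long y : acks) {
            System.out.println("ACK " + y + " -> fast retransmit? " + sender.onAck(y));
        }
    }
}
```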

            TCP: GBN or Selective Repeat?

            Basic TCP looks a lot like GBN

            Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

            This looks a lot like Selective Repeat

            TCP is a hybrid


            TCP Flow Control

            Sender should not overwhelm the receiver's capacity to receive data.
            If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate.
            Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

            TCP Flow Control

            Flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast.

            The receive side of a TCP connection has a receive buffer.
            The app process may be slow at reading from the buffer.
            Speed-matching service: matching the send rate to the receiving app's drain rate.

            TCP segment structure

            [Figure: TCP segment layout, 32 bits wide]
            Header fields, in order:
              source port #, dest port #
              sequence number (counting by bytes of data, not segments)
              acknowledgement number
              head len, not used, flag bits U A P R S F, Receive window (# bytes rcvr willing to accept)
              checksum (Internet checksum, as in UDP), Urg data pointer
              Options (variable length)
            followed by application data (variable length).

            Flag bits:
              URG: urgent data (generally not used)
              ACK: ACK # valid
              PSH: push data now (generally not used)
              RST, SYN, FIN: connection estab (setup, teardown commands)

            TCP Flow control: how it works

            (Suppose TCP receiver discards out-of-order segments.)

            Spare room in buffer
              = RcvWindow
              = RcvBuffer - [LastByteRcvd - LastByteRead]

            Rcvr advertises spare room by including the value of RcvWindow in segments.
            Sender limits unACKed data to RcvWindow:
              guarantees receive buffer doesn't overflow.
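The two rules on this slide translate directly into arithmetic. The Java sketch below is illustrative only: the class name and the `senderMaySend` helper are made up for this example, while the variable names follow the slide.

```java
/** Sketch of the flow-control arithmetic (helper names are hypothetical). */
final class FlowControl {
    /** Receiver side: RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]. */
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Sender side: unACKed data, including the next segment, must fit in RcvWindow. */
    static boolean senderMaySend(long lastByteSent, long lastByteAcked,
                                 long segmentBytes, long rcvWindow) {
        long unacked = lastByteSent - lastByteAcked;
        return unacked + segmentBytes <= rcvWindow;
    }

    public static void main(String[] args) {
        long window = rcvWindow(4096, 3000, 1000);
        System.out.println("advertised RcvWindow = " + window);
        System.out.println("may send 1000 bytes? " + senderMaySend(5000, 4000, 1000, window));
        System.out.println("may send 1500 bytes? " + senderMaySend(5000, 4000, 1500, window));
    }
}
```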

            Technical Issue

            Suppose RcvWindow = 0 and the receiver has already ACK'd ALL packets in its buffer.
            Sender does not transmit new packets until it hears RcvWindow > 0.
            Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
            DEADLOCK!

            Solution: TCP specs require the sender to continue sending segments with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.

            Note on UDP

            UDP has no flow control.

            UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


            TCP Connection Management

            Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
            Initialize TCP variables:
              seq. #s
              buffers, flow control info (e.g. RcvWindow)
            Client: connection initiator
              Socket clientSocket = new Socket(hostname, portNumber);
            Server: contacted by client
              Socket connectionSocket = welcomeSocket.accept();

            Three-way handshake

            Step 1: client end system sends TCP SYN control segment to server.
              Specifies client_isn, the initial seq #.
              No application data.

            Step 2: server end system receives SYN, replies with SYNACK control segment.
              ACKs received SYN.
              Allocates buffers.
              Replies with client_isn+1 in ACK field to signal synchronization.
              Specifies server_isn.
              No application data.

            TCP Connection Management (cont)

            Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field.
              Allocates buffers.
              Can include application data.
              SYN=0 signals that the connection is established.
              server_isn+1 signals synchronization.

            [Figure: three-way handshake]
              client -> server: Connection request (SYN=1, seq=client_isn)
              server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
              client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
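The sequence and acknowledgement numbers exchanged in the three steps can be traced with a small Java sketch. The `Handshake` class, its `Segment` type and the method names are hypothetical, an unused ACK field is represented as -1, and 32-bit sequence-number wraparound is ignored.

```java
/** Sketch of the three handshake segments (all names here are illustrative). */
final class Handshake {
    /** One control segment: SYN flag, sequence number, ACK field (-1 when unused). */
    static final class Segment {
        final boolean syn;
        final long seq;
        final long ack;

        Segment(boolean syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " seq=" + seq + " ack=" + ack;
        }
    }

    /** Step 1: client sends SYN carrying client_isn. */
    static Segment connectionRequest(long clientIsn) {
        return new Segment(true, clientIsn, -1);
    }

    /** Step 2: server replies with SYNACK carrying server_isn and ack = client_isn + 1. */
    static Segment connectionGranted(Segment request, long serverIsn) {
        return new Segment(true, serverIsn, request.seq + 1);
    }

    /** Step 3: client replies with SYN=0, seq = client_isn + 1, ack = server_isn + 1. */
    static Segment finalAck(Segment granted) {
        return new Segment(false, granted.ack, granted.seq + 1);
    }

    public static void main(String[] args) {
        Segment request = connectionRequest(1000);
        Segment granted = connectionGranted(request, 5000);
        Segment ack = finalAck(granted);
        System.out.println(request);
        System.out.println(granted);
        System.out.println(ack);
    }
}
```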

            TCP Connection Management (cont)

            Closing a connection:

            client closes socket: clientSocket.close();

            Step 1: client end system sends TCP FIN control segment to server.

            Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

            [Figure: connection teardown. Client closes and sends FIN; server replies with ACK, closes and sends its own FIN; client replies with ACK and enters timed wait; connection closed]

            TCP Connection Management (cont)

            Step 3: client receives FIN, replies with ACK.
              Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost).
              Closes down after timed wait.

            Step 4: server receives ACK. Connection closed.

            Note: with small modification, can handle simultaneous FINs.

            [Figure: connection teardown. Client (closing) sends FIN; server replies with ACK, then (closing) sends FIN; client replies with ACK and enters timed wait; server closed on receiving the ACK; client closed after timed wait]

            TCP Connection Management (cont)

            [Figure: example TCP server lifecycle]

            [Figure: example TCP client lifecycle]


            A few special cases

            Have not discussed what happens if both client and server decide to close down connection at same time

            It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


            Principles of Congestion Control

            Congestion:
              informally: "too many sources sending too much data too fast for the network to handle"
              different from flow control!
              manifestations:
                lost packets (buffer overflow at routers)
                long delays (queuing in router buffers)
              a top-10 problem!

            Causes/costs of congestion: scenario 1

            two senders, two receivers
            one router, infinite buffers
            no retransmission
            send rate: 0 to C/2

            large delays when congested
            maximum achievable throughput

            Causes/costs of congestion: scenario 2

            one router, finite buffers
            sender retransmission of lost packet

            (λin: rate of original data; λ'in: rate of original data plus retransmissions; λout: goodput)

            (a), (b) & (c): always λin = λout (goodput)
            (a) Magic transmission: only send when there's space in the buffer
            (b) "Perfect" retransmission, only when loss: λ'in > λout
            (c) Retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

            "Costs" of congestion:
              (b) and (c): more work (retrans) for given "goodput"
              (c): unneeded retransmissions: link carries multiple copies of pkt

            [Figure: three plots (a), (b), (c) of λout versus λin]

            Causes/costs of congestion: scenario 3

            four senders
            multihop paths
            timeout/retransmit

            Q: what happens as λin and λ'in increase?

            Causes/costs of congestion: scenario 3

            Another "cost" of congestion:
              when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

            Approaches towards congestion control

            Two broad approaches towards congestion control:

            End-end congestion control:
              no explicit feedback from network
              congestion inferred from end-system observed loss, delay
              approach taken by TCP

            Network-assisted congestion control:
              routers provide feedback to end systems
                single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
                explicit rate sender should send at

            Case study: ATM ABR congestion control

            ABR: available bit rate:
              "elastic service"
              if sender's path "underloaded": sender should use available bandwidth
              if sender's path congested: sender throttled to minimum guaranteed rate

            RM (resource management) cells:
              sent by sender, interspersed with data cells
              bits in RM cell set by switches ("network-assisted")
                NI bit: no increase in rate (mild congestion)
                CI bit: severe congestion indicator
              RM cells returned to sender by receiver, with bits intact
                (small exception: see next page)

            Case study: ATM ABR congestion control

            two-byte ER (explicit rate) field in RM cell
              congested switch may lower ER value in cell
              sender's send rate is thus the minimum of the rates supportable by the switches on the path

            EFCI bit in data cells: set to 1 by congested switch
              signals congestion
              if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


            TCP Congestion Control

            end-end control (no network assistance)
            transmission rate limited by congestion window size, CongWin, over segments
            CongWin dynamically modified to reflect perceived congestion

            [Figure: a window of CongWin segments]

            w segments, each with MSS bytes, sent in one RTT:

              throughput = (w * MSS) / RTT  bytes/sec

            To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

            Tools are "similar" to flow control: sender limits transmission using
              LastByteSent - LastByteAcked <= CongWin

            How does sender perceive congestion?
              loss event = timeout or 3 duplicate ACKs
              TCP sender reduces rate (CongWin) after loss event

            Three mechanisms:
              AIMD = Additive Increase, Multiplicative Decrease
              slow start = CongWin set to 1 and then grows exponentially
              conservative after timeout events

            TCP AIMD

            multiplicative decrease: cut CongWin in half after loss event

            additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

            [Figure: congestion window of a long-lived TCP connection versus time, a sawtooth pattern; vertical axis marked at 8, 16 and 24 Kbytes]

            TCP Slow Start

            When connection begins, CongWin = 1 MSS
              Example: MSS = 500 bytes & RTT = 200 msec
              initial rate = 20 kbps

            available bandwidth may be >> MSS/RTT
              desirable to quickly ramp up to respectable rate

            When connection begins, increase rate exponentially fast until first loss event

            TCP Slow Start (more)

            When connection begins, increase rate exponentially until first loss event:
              double CongWin every RTT
              done by incrementing CongWin by 1 MSS for every ACK received

            Summary: initial rate is slow but ramps up exponentially fast.

            [Figure: Host A sends one segment to Host B; one RTT later it sends two segments; one RTT after that, four segments]
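The two slow-start facts above, the slow initial rate and the per-RTT doubling, can be checked with a short Java sketch. It assumes no loss and that every segment sent in one RTT is ACKed before the next; the class and method names are hypothetical.

```java
/** Sketch of slow-start growth, measured in MSS units (names are illustrative). */
final class SlowStart {
    /** Initial sending rate in bits per second: one MSS per RTT. */
    static double initialRateBps(int mssBytes, double rttSeconds) {
        return mssBytes * 8 / rttSeconds;
    }

    /** CongWin (in MSS) at the start of each of the first `rtts` round trips, with no loss. */
    static int[] windowPerRtt(int rtts) {
        int[] windows = new int[rtts];
        int congWin = 1;               // connection begins with CongWin = 1 MSS
        for (int i = 0; i < rtts; i++) {
            windows[i] = congWin;
            int acks = congWin;        // every segment sent this RTT is ACKed
            congWin += acks;           // +1 MSS per ACK doubles the window
        }
        return windows;
    }

    public static void main(String[] args) {
        System.out.println("initial rate (bps): " + initialRateBps(500, 0.2));
        for (int w : windowPerRtt(5)) {
            System.out.println("CongWin = " + w + " MSS");
        }
    }
}
```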

            So far:
              Slow-Start ramps up exponentially
              Followed by AIMD (sawtooth pattern)

            Reality (TCP Reno):
              Introduce new variable: threshold
              threshold initially very large
              Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
              Two different types of loss events:
                • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
                • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

            The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

            Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

            TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

            Summary: TCP Congestion Control

            When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

            When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

            When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

            When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

            The Big Picture

            [Figure: the big picture of TCP congestion control]

            TCP sender congestion control

            Event: ACK receipt for previously unACKed data
            State: Slow Start (SS)
            TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
            Commentary: Resulting in a doubling of CongWin every RTT

            Event: ACK receipt for previously unACKed data
            State: Congestion Avoidance (CA)
            TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
            Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

            Event: Loss event detected by triple duplicate ACK
            State: SS or CA
            TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
            Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

            Event: Timeout
            State: SS or CA
            TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
            Commentary: Enter slow start

            Event: Duplicate ACK
            State: SS or CA
            TCP Sender Action: Increment duplicate ACK count for segment being ACKed
            Commentary: CongWin and Threshold not changed
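The event table maps naturally onto a small state machine. The Java sketch below follows the table row by row; it is a teaching sketch rather than a real TCP implementation, the class and method names are hypothetical, and CongWin and Threshold are kept in bytes as doubles for simplicity.

```java
/** Sketch of the TCP Reno sender rules from the table above (names are illustrative). */
final class RenoSender {
    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    private final double mss;
    private double congWin;
    private double threshold;
    private State state = State.SLOW_START;
    private int dupAcks;

    RenoSender(double mssBytes, double initialThresholdBytes) {
        this.mss = mssBytes;
        this.congWin = mssBytes;           // connection begins with CongWin = 1 MSS
        this.threshold = initialThresholdBytes;
    }

    /** ACK receipt for previously unACKed data. */
    void onNewAck() {
        dupAcks = 0;
        if (state == State.SLOW_START) {
            congWin += mss;                // doubles CongWin every RTT
            if (congWin > threshold) {
                state = State.CONGESTION_AVOIDANCE;
            }
        } else {
            congWin += mss * (mss / congWin); // about +1 MSS per RTT
        }
    }

    /** Duplicate ACK: only count it; the third one is treated as a loss event. */
    void onDuplicateAck() {
        dupAcks++;
        if (dupAcks == 3) {
            threshold = congWin / 2;
            congWin = Math.max(threshold, mss); // CongWin will not drop below 1 MSS
            state = State.CONGESTION_AVOIDANCE; // fast recovery: skip slow start
        }
    }

    /** Timeout: back to slow start with CongWin = 1 MSS. */
    void onTimeout() {
        dupAcks = 0;
        threshold = congWin / 2;
        congWin = mss;
        state = State.SLOW_START;
    }

    double congWin() { return congWin; }
    double threshold() { return threshold; }
    State state() { return state; }

    public static void main(String[] args) {
        RenoSender sender = new RenoSender(1000, 8000);
        for (int i = 0; i < 3; i++) sender.onNewAck();
        System.out.println(sender.state() + " CongWin=" + sender.congWin());
        sender.onTimeout();
        System.out.println(sender.state() + " CongWin=" + sender.congWin()
                + " Threshold=" + sender.threshold());
    }
}
```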

            TCP throughput

            What's the average throughput of TCP as a function of window size and RTT?
              Ignore slow start.
              Let W be the window size when loss occurs.
              When window is W, throughput is W/RTT.
              Just after loss, window drops to W/2, throughput to W/(2 RTT).
              Average throughput: 0.75 W/RTT

            TCP Futures

            Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
            Requires window size W = 83,333 in-flight segments
            Throughput in terms of loss rate L:

            $$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

            Requires L = 2 * 10^-10. Wow!
            New versions of TCP for high speed needed!
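The numbers in this example can be reproduced with a few lines of Java. The sketch assumes the loss-rate relation throughput = 1.22 MSS / (RTT sqrt(L)) and converts the MSS to bits; the class and method names are made up for illustration.

```java
/** Sketch reproducing the high-speed TCP example (names are illustrative). */
final class TcpThroughput {
    /** In-flight segments needed to sustain targetBps: W = throughput * RTT / MSS. */
    static double requiredWindowSegments(double targetBps, double rttSeconds, int mssBytes) {
        return targetBps * rttSeconds / (mssBytes * 8.0);
    }

    /** Loss rate L solving throughput = 1.22 * MSS / (RTT * sqrt(L)). */
    static double requiredLossRate(double targetBps, double rttSeconds, int mssBytes) {
        double sqrtL = 1.22 * mssBytes * 8.0 / (rttSeconds * targetBps);
        return sqrtL * sqrtL;
    }

    public static void main(String[] args) {
        double target = 10e9; // 10 Gbps
        double rtt = 0.1;     // 100 ms
        int mss = 1500;       // bytes
        System.out.println("W = " + Math.floor(requiredWindowSegments(target, rtt, mss)));
        System.out.println("L = " + requiredLossRate(target, rtt, mss));
    }
}
```

The required loss rate comes out at roughly 2 * 10^-10, about one lost segment in five billion, which is the point of the slide.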

            TCP Fairness

            Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

            [Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

            Why is TCP fair?

            Two competing sessions:
              Additive increase gives slope of 1, as throughput increases
              multiplicative decrease decreases throughput proportionally

            [Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal-share line]
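The convergence argument can be simulated. The Java sketch below is an idealized model, not TCP itself: both sessions see a loss in the same round whenever their combined rate exceeds the capacity, and the names are hypothetical. Additive increase keeps the gap between the two rates unchanged, while each multiplicative decrease halves it.

```java
/** Idealized AIMD model for two sessions sharing one bottleneck (names are illustrative). */
final class AimdFairness {
    /** Returns the two rates after the given number of rounds. */
    static double[] simulate(double x1, double x2, double capacity, int rounds) {
        for (int i = 0; i < rounds; i++) {
            if (x1 + x2 > capacity) {
                x1 /= 2;   // loss: both sessions cut their rate in half
                x2 /= 2;
            } else {
                x1 += 1;   // congestion avoidance: both add one unit
                x2 += 1;
            }
        }
        return new double[] {x1, x2};
    }

    public static void main(String[] args) {
        double[] rates = simulate(10, 70, 100, 1000);
        System.out.println("session 1: " + rates[0]);
        System.out.println("session 2: " + rates[1]);
        System.out.println("gap: " + Math.abs(rates[0] - rates[1]));
    }
}
```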

            Fairness (more)

            Fairness and UDP
              Multimedia apps often do not use TCP
                do not want rate throttled by congestion control
              Instead use UDP:
                pump audio/video at constant rate, tolerate packet loss
              Current research area: how to keep UDP from congesting the Internet

            Fairness and parallel TCP connections
              nothing prevents an app from opening parallel connections between 2 hosts
              Web browsers do this
              Example: link of rate R supporting 9 connections
                new app asks for 1 TCP, gets rate R/10
                new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)!
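The parallel-connection example is simple arithmetic, sketched here in Java under the assumption that the link is shared equally per connection. The class and method names are hypothetical.

```java
/** Sketch of per-application bandwidth share with parallel connections (names are illustrative). */
final class ParallelConnections {
    /** Fraction of the link rate R obtained by an app that opens `requested` connections. */
    static double share(int existingConnections, int requested) {
        return (double) requested / (existingConnections + requested);
    }

    public static void main(String[] args) {
        System.out.println("1 connection:   " + share(9, 1) + " of R");
        System.out.println("11 connections: " + share(9, 11) + " of R");
    }
}
```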

            TCP Latency Modeling

            Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

            Ignoring congestion, delay is influenced by:
              TCP connection establishment
              data transmission delay
              slow start

            Notation, assumptions:
              Assume one link between client and server, of rate R
              S: MSS (bits)
              O: object size (bits)
              no retransmissions (no loss, no corruption)

            Window size:
              First assume: fixed congestion window, W segments
              Then dynamic window, modeling slow start

            Fixed Congestion Window (W)

            Two cases:

            1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
                 Latency = 2 RTT + O/R

            2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
                 Latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]

            Fixed congestion window (1)

            First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

              latency = 2 RTT + O/R

            Fixed congestion window (2)

            Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

              latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
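Both fixed-window cases fit in one Java function. In this sketch K is taken to be the number of windows that cover the object, computed as the ceiling of O/(WS); that definition, like the class and method names, is an assumption made for the example.

```java
/** Sketch of the fixed-congestion-window latency model (names are illustrative). */
final class FixedWindowLatency {
    /**
     * o = object size (bits), s = MSS (bits), r = link rate (bits/sec),
     * rtt = round-trip time (sec), w = window size (segments).
     */
    static double latency(double o, double s, double r, double rtt, int w) {
        double base = 2 * rtt + o / r;                  // handshake + request, then transmission
        double stallPerWindow = s / r + rtt - w * s / r;
        if (stallPerWindow <= 0) {
            return base;                                // case 1: ACKs return before the window is sent
        }
        long k = (long) Math.ceil(o / (w * s));         // assumed: K = windows covering the object
        return base + (k - 1) * stallPerWindow;         // case 2: server stalls after each window but the last
    }

    public static void main(String[] args) {
        // Example values chosen for easy arithmetic: S/R = 1 s, RTT = 2 s, O = 8 segments.
        System.out.println("W = 2: " + latency(8000, 1000, 1000, 2, 2) + " s");
        System.out.println("W = 4: " + latency(8000, 1000, 1000, 2, 4) + " s");
    }
}
```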

            TCP Latency Modeling: Slow Start (1)

            Now suppose the window grows according to slow start (with no threshold and no loss events).

            Will show that the delay for one object is:

            $$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

            where P is the number of times TCP idles at the server:

            $$P = \min\{Q,\ K-1\}$$

            - where Q is the number of times the server idles if the object were of infinite size

            - and K is the number of windows that cover the object

            TCP Latency Modeling: Slow Start (2)

            Delay components:
              • 2 RTT for connection estab and request
              • O/R to transmit object
              • time server idles due to slow start

            Server idles P = min{K-1, Q} times

            Example:
              • O/S = 15 segments
              • K = 4 windows
              • Q = 2
              • P = min{K-1, Q} = 2
            Server idles P = 2 times

            [Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

            TCP Latency Modeling (3)

            $\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

            $2^{k-1}\frac{S}{R}$ = time to transmit the kth window

            $\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

            $$\begin{aligned}
            \text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
            &= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
            &= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
            \end{aligned}$$

            [Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

            TCP Latency Modeling (4)

            Recall K = number of windows that cover the object.

            How do we calculate K?

            $$\begin{aligned}
            K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
              &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
              &= \min\{k : 2^k - 1 \ge O/S\} \\
              &= \min\{k : k \ge \log_2(O/S + 1)\} \\
              &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
            \end{aligned}$$

            Calculation of Q, the number of idles for an infinite-size object, is similar.
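The slow-start latency model can be checked numerically. In the Java sketch below, K is found by summing window sizes until they cover the object, and Q by counting the windows whose idle time S/R + RTT - 2^(k-1) S/R is still positive; computing Q this way is an assumption derived from the idle-time expression, and all names are hypothetical. The object size is assumed to be positive.

```java
/** Sketch of the slow-start latency model (names are illustrative). */
final class SlowStartLatency {
    /** K: smallest k with S + 2S + ... + 2^(k-1) S >= O (assumes o > 0). */
    static int windowsCoveringObject(double o, double s) {
        int k = 0;
        double covered = 0;
        while (covered < o) {
            covered += Math.pow(2, k) * s;
            k++;
        }
        return k;
    }

    /** Q: number of windows after which the server would idle for an infinite object. */
    static int idlesForInfiniteObject(double s, double r, double rtt) {
        int q = 0;
        while (s / r + rtt - Math.pow(2, q) * s / r > 0) {
            q++;
        }
        return q;
    }

    /** Latency = 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R, with P = min{Q, K-1}. */
    static double latency(double o, double s, double r, double rtt) {
        int k = windowsCoveringObject(o, s);
        int q = idlesForInfiniteObject(s, r, rtt);
        int p = Math.min(q, k - 1);
        return 2 * rtt + o / r + p * (rtt + s / r) - (Math.pow(2, p) - 1) * s / r;
    }

    public static void main(String[] args) {
        // O/S = 15 segments, S/R = 1 s, RTT = 2 s, chosen so that K = 4 and Q = 2.
        System.out.println("K = " + windowsCoveringObject(15000, 1000));
        System.out.println("Q = " + idlesForInfiniteObject(1000, 1000, 2));
        System.out.println("latency = " + latency(15000, 1000, 1000, 2) + " s");
    }
}
```

With these values the server idles twice (for 2 s and 1 s), which accounts for the 3 s added to 2 RTT + O/R = 19 s.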

            HTTP Modeling

            Assume Web page consists of:
              1 base HTML page (of size O bits)
              M images (each of size O bits)

            Non-persistent HTTP:
              M+1 TCP connections in series
              Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

            Persistent HTTP:
              2 RTT to request and receive base HTML file
              1 RTT to request and receive M images
              Response time = (M+1)O/R + 3RTT + sum of idle times

            Non-persistent HTTP with X parallel connections:
              Suppose M/X integer
              1 TCP connection for base file
              M/X sets of parallel connections for images
              Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
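The three response-time formulas can be compared side by side in Java. This sketch deliberately leaves out the "sum of idle times" term, so it gives a lower bound on each response time; the names are hypothetical, and M/X is assumed to be an integer as on the slide.

```java
/** Sketch of the HTTP response-time formulas, ignoring slow-start idle times (names are illustrative). */
final class HttpResponseTime {
    /** Non-persistent: M+1 connections in series, 2 RTT each. */
    static double nonPersistent(int m, double o, double r, double rtt) {
        return (m + 1) * o / r + (m + 1) * 2 * rtt;
    }

    /** Persistent: 2 RTT for the base file, 1 RTT for the M images. */
    static double persistent(int m, double o, double r, double rtt) {
        return (m + 1) * o / r + 3 * rtt;
    }

    /** Non-persistent with X parallel connections (assumes M/X is an integer). */
    static double parallel(int m, int x, double o, double r, double rtt) {
        return (m + 1) * o / r + (m / x + 1) * 2 * rtt;
    }

    public static void main(String[] args) {
        int m = 10, x = 5;
        double o = 5000 * 8;   // 5 Kbytes taken as 5000 bytes, in bits
        double r = 1e6;        // 1 Mbps
        double rtt = 0.1;      // 100 msec
        System.out.println("non-persistent: " + nonPersistent(m, o, r, rtt) + " s");
        System.out.println("persistent:     " + persistent(m, o, r, rtt) + " s");
        System.out.println("parallel:       " + parallel(m, x, o, r, rtt) + " s");
    }
}
```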

            HTTP Response time (in seconds)

            RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

            [Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

            For low bandwidth, connection & response time are dominated by transmission time.
            Persistent connections only give minor improvement over parallel connections.

            HTTP Response time (in seconds)

            RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

            [Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

            For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.

            Chapter 3: Summary

            principles behind transport layer services:
              multiplexing, demultiplexing
              reliable data transfer
              flow control
              congestion control
            instantiation and implementation in the Internet:
              UDP
              TCP

            Next:
              leaving the network "edge" (application, transport layers)
              into the network "core"


              Multiplexing/demultiplexing

              Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)

              Demultiplexing at rcv host: delivering received segments to correct socket

              [Figure: three hosts (host 1, host 2, host 3), each with application, transport, network, link and physical layers; processes P1 to P4 attached to sockets. Legend: socket, process]

              Multiplexing/demultiplexing

              segment: unit of data exchanged between transport layer entities
                aka TPDU: transport protocol data unit

              Demultiplexing: delivering received segments to correct app layer processes

              [Figure: a receiver delivering application-layer data M, carried in segments with segment header Ht inside network header Hn, to processes P1 to P4]

              How demultiplexing works

              host receives IP datagrams
                each datagram has source IP address, destination IP address
                each datagram carries 1 transport-layer segment
                each segment has source, destination port number (recall: well-known port numbers for specific applications)

              host uses IP addresses & port numbers to direct segment to appropriate socket

              [Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #; other header fields; application data (message)]
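
The segment format above puts the two port numbers in the first 32 bits of the header. A minimal sketch of reading them, assuming the segment is available as a raw byte array in network (big-endian) byte order; the class and method names are hypothetical and only meant to illustrate the layout:

```java
import java.nio.ByteBuffer;

final class SegmentPorts {
    // Source port: first 16 bits of the transport header, read as unsigned.
    static int sourcePort(byte[] segment) {
        return ByteBuffer.wrap(segment).getShort(0) & 0xFFFF;
    }

    // Destination port: the next 16 bits, read as unsigned.
    static int destPort(byte[] segment) {
        return ByteBuffer.wrap(segment).getShort(2) & 0xFFFF;
    }

    public static void main(String[] args) {
        // Hypothetical header bytes: source port 9157, destination port 6428.
        byte[] segment = {0x23, (byte) 0xC5, 0x19, 0x1C, 0, 0, 0, 0};
        System.out.println(sourcePort(segment) + " -> " + destPort(segment));
    }
}
```

The mask with `0xFFFF` is needed because Java's `short` is signed while port numbers range up to 65535.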

              Connectionless demultiplexing

              Create sockets with port numbers:

              DatagramSocket mySocket1 = new DatagramSocket(9111);

              DatagramSocket mySocket2 = new DatagramSocket(9222);

              UDP socket identified by two-tuple:

              (dest IP address, dest port number)

              When host receives UDP segment:
                checks destination port number in segment
                directs UDP segment to socket with that port number

              IP datagrams with different source IP addresses and/or source port numbers directed to same socket

              Connectionless demux (cont)

              DatagramSocket serverSocket = new DatagramSocket(6428);

              [Figure: client IP: A, client IP: B and server IP: C, with processes P1 and P3. Segments sent to the server: SP 9157, DP 6428 and SP 5775, DP 6428. Replies from the server: SP 6428, DP 9157 and SP 6428, DP 5775]

              SP provides "return address"

              Connection-oriented demux

              TCP socket identified by 4-tuple:
                source IP address
                source port number
                dest IP address
                dest port number

              recv host uses all four values to direct segment to appropriate socket

              Server host may support many simultaneous TCP sockets:
                each socket identified by its own 4-tuple

              Web servers have different sockets for each connecting client
                non-persistent HTTP will have different socket for each request
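
The difference between the two lookup rules can be sketched with two tables. This is only an illustration, not how an operating system implements it: the class `DemuxSketch`, its method names and the string-encoded 4-tuple key are all hypothetical, and the UDP table is keyed by destination port alone on the assumption that the destination IP is this host.

```java
import java.util.HashMap;
import java.util.Map;

final class DemuxSketch {
    // UDP: socket looked up by destination port (destination IP is assumed to be this host).
    private final Map<Integer, String> udpSockets = new HashMap<>();
    // TCP: socket looked up by the full 4-tuple, encoded here as a string key.
    private final Map<String, String> tcpSockets = new HashMap<>();

    static String fourTuple(String srcIp, int srcPort, String dstIp, int dstPort) {
        return srcIp + ":" + srcPort + ">" + dstIp + ":" + dstPort;
    }

    void bindUdp(int port, String socketName) {
        udpSockets.put(port, socketName);
    }

    void addTcpConnection(String srcIp, int srcPort, String dstIp, int dstPort, String socketName) {
        tcpSockets.put(fourTuple(srcIp, srcPort, dstIp, dstPort), socketName);
    }

    // Source IP and source port play no role in choosing the UDP socket.
    String demuxUdp(String srcIp, int srcPort, int dstPort) {
        return udpSockets.get(dstPort);
    }

    // All four values are needed to choose the TCP socket.
    String demuxTcp(String srcIp, int srcPort, String dstIp, int dstPort) {
        return tcpSockets.get(fourTuple(srcIp, srcPort, dstIp, dstPort));
    }
}
```

With this sketch, UDP segments from clients A and B addressed to port 6428 reach the same socket, while TCP segments from A and B addressed to port 80 reach different sockets even when both use source port 9157.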

              Connection-oriented demux (cont)

              [Figure: client IP: A, client IP: B and server IP: C, with processes P1, P3 and P4. Segments sent to the server: SP 9157, DP 80 and SP 5775, DP 80. Replies from the server: SP 80, DP 9157 and SP 80, DP 5775]

              Connection-oriented demux: Threaded Web Server

              [Figure: client IP: A, client IP: B and server IP: C, with processes P1, P2, P3 and P4. Segments sent to server port 80:
                SP 9157, DP 80, S-IP: A, D-IP: C
                SP 9157, DP 80, S-IP: B, D-IP: C
                SP 5775, DP 80, S-IP: B, D-IP: C]


              UDP: User Datagram Protocol [RFC 768]

              "no frills," "bare bones" Internet transport protocol
              "best effort" service, UDP segments may be:
                lost
                delivered out of order to app
              connectionless:
                no handshaking between UDP sender, receiver
                each UDP segment handled independently of others

              Why is there a UDP?
                no connection establishment (which can add delay)
                simple: no connection state at sender, receiver
                small segment header (8 bytes)
                no congestion control: UDP can blast away as fast as desired

              UDP: more

              often used for streaming multimedia apps
                loss tolerant
                rate sensitive
              other UDP uses (why?):
                DNS: small delay
                SNMP: stressful cond.
              reliable transfer over UDP: add reliability at application layer
                application-specific error recovery

              [Figure: UDP segment format, 32 bits wide: source port #, dest port #; length, checksum; application data (message). Length is in bytes of UDP segment, including header]

              UDP checksum

              Goal: detect "errors" (e.g., flipped bits) in transmitted segment

              Sender:
                treat segment contents as sequence of 16-bit integers
                checksum: 1's complement of the 1's complement sum of segment contents
                sender puts checksum value into UDP checksum field

              Receiver:
                compute checksum of received segment
                check if computed checksum equals checksum field value:
                  NO - error detected
                  YES - no error detected. But maybe errors nonetheless? More later...

              Receiver may choose to discard segment or send a warning to app in case of error
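
A minimal sketch of the checksum computation described above. It follows RFC 768 in taking the one's complement of the one's complement sum of 16-bit words, but it is a simplification: the pseudo-header that real UDP includes in the sum is left out, and the class and method names are hypothetical.

```java
final class InternetChecksum {
    // One's complement sum of the data taken as big-endian 16-bit words.
    static int onesComplementSum(byte[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i += 2) {
            int hi = data[i] & 0xFF;
            int lo = (i + 1 < data.length) ? (data[i + 1] & 0xFF) : 0; // pad odd length with zero
            sum += (hi << 8) | lo;
            sum = (sum & 0xFFFF) + (sum >>> 16); // wrap the carry around
        }
        return sum;
    }

    // Sender side: the checksum field is the one's complement of the sum.
    static int checksum(byte[] data) {
        return ~onesComplementSum(data) & 0xFFFF;
    }

    // Receiver side: recompute over the received data and compare with the field.
    static boolean verify(byte[] data, int checksumField) {
        return checksum(data) == checksumField;
    }
}
```

A single flipped bit changes the sum and is therefore detected, but two changes that cancel in the sum are not, which is why a matching checksum only means "no error detected".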


              Principles of Reliable data transfer

              important in app., transport, link layers
              top-10 list of important networking topics!

              characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

              Reliable data transfer: getting started

              send side:
                rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
                udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

              receive side:
                rdt_rcv(): called when packet arrives on rcv-side of channel
                deliver_data(): called by rdt to deliver data to upper layer

              Reliable data transfer: getting started

              We'll:
                incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
                consider only unidirectional data transfer
                  but control info will flow on both directions!
                use finite state machines (FSM) to specify sender, receiver

              state: when in this "state" next state uniquely determined by next event

              [Figure: FSM notation. An arrow from state 1 to state 2 is labelled with the event causing the state transition (above the line) and the actions taken on the state transition (below the line): event / actions]

              Incremental Improvements

              rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission

              rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK

              rdt2.1: deals with corrupted ACKs/NAKs

              rdt2.2: like rdt2.1 but does not need NAKs

              rdt3.0: allows packets to be lost

              rdt1.0: reliable transfer over a reliable channel

              underlying channel perfectly reliable
                no bit errors
                no loss of packets

              separate FSMs for sender, receiver:
                sender sends data into underlying channel
                receiver reads data from underlying channel

              sender, state "Wait for call from above":
                rdt_send(data) / packet = make_pkt(data); udt_send(packet)

              receiver, state "Wait for call from below":
                rdt_rcv(packet) / extract(packet, data); deliver_data(data)

              rdt2.0: channel with bit errors

              underlying channel may flip bits in packet
                recall: UDP checksum to detect bit errors

              the question: how to recover from errors:
                acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
                negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
                sender retransmits pkt on receipt of NAK
                human scenarios using ACKs, NAKs?

              new mechanisms in rdt2.0 (beyond rdt1.0):
                error detection
                receiver feedback: control msgs (ACK, NAK) rcvr->sender

              rdt2.0: FSM specification

              sender, state "Wait for call from above":
                rdt_send(data) / sndpkt = make_pkt(data, checksum); udt_send(sndpkt) -> "Wait for ACK or NAK"

              sender, state "Wait for ACK or NAK":
                rdt_rcv(rcvpkt) && isNAK(rcvpkt) / udt_send(sndpkt)
                rdt_rcv(rcvpkt) && isACK(rcvpkt) / Λ -> "Wait for call from above"

              receiver, state "Wait for call from below":
                rdt_rcv(rcvpkt) && corrupt(rcvpkt) / udt_send(NAK)
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) / extract(rcvpkt, data); deliver_data(data); udt_send(ACK)

              rdt2.0: operation with no errors

              [Figure: the rdt2.0 FSM repeated for this scenario. The sender builds sndpkt with make_pkt(data, checksum) and sends it; the receiver sees rdt_rcv(rcvpkt) && notcorrupt(rcvpkt), delivers the data and sends ACK; the sender sees rdt_rcv(rcvpkt) && isACK(rcvpkt) and returns to "Wait for call from above"]

              rdt2.0: error scenario

              [Figure: the rdt2.0 FSM repeated for this scenario. The receiver sees rdt_rcv(rcvpkt) && corrupt(rcvpkt) and sends NAK; the sender sees rdt_rcv(rcvpkt) && isNAK(rcvpkt), resends sndpkt and stays in "Wait for ACK or NAK"]

              rdt2.0 has a fatal flaw!

              What happens if ACK/NAK corrupted?
                sender doesn't know what happened at receiver!
                can't just retransmit: possible duplicate. But receiver waiting!

              What to do?
                sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
                retransmit, but this might cause retransmission of correctly received pkt!
                Receiver won't know about duplication!

              Handling duplicates:
                sender adds sequence number (0,1) to each pkt
                sender retransmits current pkt if ACK/NAK garbled
                receiver discards (doesn't deliver up) duplicate pkt
                Duplicate packet is one with same sequence # as previous packet

              stop and wait
                Sender sends one packet, then waits for receiver response

              Sender: whenever sender receives control message it sends a packet to receiver
                A valid ACK: sends next packet (if exists) with new sequence #
                A NAK or corrupt response: resends old packet

              Receiver: sends ACK/NAK to sender
                If received packet is corrupt: send NAK
                If received packet is valid and has different sequence # from prev packet: send ACK and deliver new data up
                If received packet is valid and has same sequence # as prev packet, i.e., is a retransmission or duplicate: send ACK

              Note: ACK/NAK do not contain sequence #
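
The receiver rules above can be sketched as a small class that tracks the expected sequence number. This is an illustration under simplifying assumptions: corruption is passed in as a flag rather than detected with a checksum, packets carry string data, and the class name `Rdt21Receiver` is hypothetical.

```java
final class Rdt21Receiver {
    private int expectedSeq = 0;                        // sequence # of the next new packet (0 or 1)
    private final StringBuilder delivered = new StringBuilder();

    // Handles one arriving packet and returns the control message sent back.
    String onPacket(int seq, String data, boolean corrupt) {
        if (corrupt) {
            return "NAK";                               // corrupt packet: ask for a resend
        }
        if (seq == expectedSeq) {
            delivered.append(data);                     // new data: deliver up
            expectedSeq = 1 - expectedSeq;              // alternate 0 <-> 1
        }
        return "ACK";                                   // duplicates are ACKed but not delivered again
    }

    String delivered() {
        return delivered.toString();
    }
}
```

A retransmitted packet arrives with the sequence number the receiver has already moved past, so it is acknowledged again without being delivered twice.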

              rdt2.1: sender, handles garbled ACK/NAKs

              state "Wait for call 0 from above":
                rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> "Wait for ACK or NAK 0"

              state "Wait for ACK or NAK 0":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) / udt_send(sndpkt)
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> "Wait for call 1 from above"

              state "Wait for call 1 from above":
                rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt) -> "Wait for ACK or NAK 1"

              state "Wait for ACK or NAK 1":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) / udt_send(sndpkt)
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> "Wait for call 0 from above"

              rdt2.1: receiver, handles garbled ACK/NAKs

              state "Wait for 0 from below":
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> "Wait for 1 from below"
                rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

              state "Wait for 1 from below":
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> "Wait for 0 from below"
                rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

              rdt2.1: discussion

              Sender:
                seq # added to pkt
                two seq. #'s (0,1) will suffice. Why?
                must check if received ACK/NAK corrupted
                twice as many states
                  state must "remember" whether "current" pkt has 0 or 1 seq. #

              Receiver:
                must check if received packet is duplicate
                  state indicates whether 0 or 1 is expected pkt seq #
                note: receiver can not know if its last ACK/NAK received OK at sender

              rdt2.2: a NAK-free protocol

              same functionality as rdt2.1, using ACKs only
              instead of NAK, receiver sends ACK for last pkt received OK
                receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s included in data packets but not in ACKs/NAKs)
              duplicate ACK at sender results in same action as NAK: retransmit current pkt

              rdt2.2: sender, receiver fragments

              sender FSM fragment

              state "Wait for call 0 from above":
                rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> "Wait for ACK 0"

              state "Wait for ACK 0":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) / udt_send(sndpkt)
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) / Λ

              receiver FSM fragment

              transition into state "Wait for 0 from below":
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

              state "Wait for 0 from below":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) / udt_send(sndpkt)

              rdt3.0: channels with errors and loss

              New assumption: underlying channel can also lose packets (data or ACKs)
                checksum, seq. #, ACKs, retransmissions will be of help, but not enough

              Q: how to deal with loss?
                sender waits until certain data or ACK lost, then retransmits
                yuck: drawbacks?

              Approach: sender waits "reasonable" amount of time for ACK
                retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
                if pkt (or ACK) just delayed (not lost):
                  retransmission will be duplicate, but use of seq. #'s already handles this
                  receiver must specify seq # of pkt being ACKed
                requires countdown timer

              rdt3.0 sender

              state "Wait for call 0 from above":
                rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer -> "Wait for ACK0"
                rdt_rcv(rcvpkt) / Λ

              state "Wait for ACK0":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) / Λ
                timeout / udt_send(sndpkt); start_timer
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) / stop_timer -> "Wait for call 1 from above"

              state "Wait for call 1 from above":
                rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer -> "Wait for ACK1"
                rdt_rcv(rcvpkt) / Λ

              state "Wait for ACK1":
                rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)) / Λ
                timeout / udt_send(sndpkt); start_timer
                rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1) / stop_timer -> "Wait for call 0 from above"
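
The four-state sender can be sketched compactly by keeping the current sequence bit and a "waiting for ACK" flag. This is a hedged illustration rather than a full implementation: the timer is modelled as a boolean, timeouts are delivered by calling `onTimeout()`, and the class `Rdt30Sender` with its `sent` log is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class Rdt30Sender {
    private int seq = 0;                            // sequence # of the current packet (0 or 1)
    private boolean waitingForAck = false;
    private String current;                         // payload kept for retransmission
    final List<String> sent = new ArrayList<>();    // log of udt_send calls, e.g. "0:hello"
    boolean timerRunning = false;

    // rdt_send(data): accepted only when no packet is outstanding.
    boolean rdtSend(String data) {
        if (waitingForAck) return false;
        current = data;
        transmit();
        waitingForAck = true;
        return true;
    }

    // rdt_rcv(rcvpkt): corrupt or wrong-numbered ACKs are ignored; the timer recovers.
    void onAck(int ackSeq, boolean corrupt) {
        if (!waitingForAck || corrupt || ackSeq != seq) return;
        timerRunning = false;                       // stop_timer
        waitingForAck = false;
        seq = 1 - seq;
    }

    // timeout: resend the outstanding packet and restart the timer.
    void onTimeout() {
        if (waitingForAck) transmit();
    }

    private void transmit() {
        sent.add(seq + ":" + current);
        timerRunning = true;                        // start_timer
    }
}
```

Unlike rdt2.1, a bad or unexpected ACK does not trigger an immediate resend here; only the timeout does.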

              rdt3.0 in action

              [Figures on two slides]

              Performance of rdt3.0

              rdt3.0 works, but performance stinks
              example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

              $$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}$$

              $$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

              U sender: utilization - fraction of time sender busy sending
              1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
              network protocol limits use of physical resources!

              rdt3.0: stop-and-wait operation

              [Figure: sender/receiver timeline. First packet bit transmitted, t = 0; last packet bit transmitted, t = L / R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet, t = RTT + L / R]

              Pipelined protocols

              Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
                range of sequence numbers must be increased
                buffering at sender and/or receiver

              Pipelined protocols

              Advantage: much better bandwidth utilization than stop-and-wait

              Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out of order data

              Two generic approaches to solving this:
                • go-Back-N protocols
                • selective repeat protocols

              Note: TCP is not exactly either

              Pipelining: increased utilization

              [Figure: sender/receiver timeline. First packet bit transmitted, t = 0; last bit transmitted, t = L / R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet, t = RTT + L / R]

              $$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

              Increase utilization by a factor of 3!
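
The two utilization figures (stop-and-wait and three packets in flight) come from the same formula with a different number of packets per round. A small sketch for checking the arithmetic; the class and parameter names are hypothetical, and the values are the ones from the example (1 KB = 8000 bits, 1 Gbps, RTT = 30 ms):

```java
final class Utilization {
    // Fraction of time the sender is busy when n packets are sent every RTT + L/R.
    static double senderUtilization(int n, double packetBits, double rateBps, double rttSeconds) {
        double transmit = packetBits / rateBps;         // L / R
        return n * transmit / (rttSeconds + transmit);
    }

    public static void main(String[] args) {
        double l = 8000, r = 1e9, rtt = 0.030;          // 1 KB packet, 1 Gbps link, 30 ms RTT
        System.out.printf("stop-and-wait: %.5f%n", senderUtilization(1, l, r, rtt));
        System.out.printf("3 in flight:   %.5f%n", senderUtilization(3, l, r, rtt));
    }
}
```

The formula is only meaningful while n * L/R stays below RTT + L/R; beyond that point the sender is busy all the time.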

              Go-Back-N

              Sender:
                k-bit seq # in pkt header
                "window" of up to N, consecutive unack'ed pkts allowed
                ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
                  may receive duplicate ACKs (see receiver)
                Only one timer: for oldest unacknowledged pkt
                timeout(n): retransmit pkt n and all higher seq # pkts in window
                Called a sliding-window protocol

              GBN: Sender

              rdt_send() called: checks to see if window is full
                No: send out packet
                Yes: return data to application level

              Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

              Timeout: resends ALL packets that have been sent but not yet acknowledged
                This is only event that triggers resend

              GBN: sender extended FSM

              initially: base = 1; nextseqnum = 1

              state "Wait":

              rdt_send(data) /
                if (nextseqnum < base+N) {
                  sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
                  udt_send(sndpkt[nextseqnum])
                  if (base == nextseqnum)
                    start_timer
                  nextseqnum++
                }
                else
                  refuse_data(data)

              timeout /
                start_timer
                udt_send(sndpkt[base])
                udt_send(sndpkt[base+1])
                ...
                udt_send(sndpkt[nextseqnum-1])

              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) /
                base = getacknum(rcvpkt)+1
                if (base == nextseqnum)
                  stop_timer
                else
                  start_timer

              rdt_rcv(rcvpkt) && corrupt(rcvpkt) / Λ
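
The extended FSM maps fairly directly onto a class with three event handlers. The sketch below is an illustration, not a complete protocol: sequence numbers are unbounded integers rather than k-bit values, the timer is a boolean, corrupt ACKs are assumed to be dropped before `onAck` is called, and the names `GbnSender` and `transmitted` are hypothetical. It also adds one small guard not on the slide: an old ACK never moves `base` backwards.

```java
import java.util.ArrayList;
import java.util.List;

final class GbnSender {
    private final int windowSize;                           // N
    private int base = 1;                                   // oldest unacknowledged seq #
    private int nextSeqNum = 1;                             // next seq # to use
    private final List<String> buffer = new ArrayList<>();  // buffer.get(k - 1) holds pkt k
    final List<String> transmitted = new ArrayList<>();     // log of udt_send calls, "seq:data"
    boolean timerRunning = false;

    GbnSender(int windowSize) {
        this.windowSize = windowSize;
    }

    // rdt_send(data): send if the window has room, otherwise refuse the data.
    boolean rdtSend(String data) {
        if (nextSeqNum >= base + windowSize) return false;  // refuse_data
        buffer.add(data);
        transmitted.add(nextSeqNum + ":" + data);
        if (base == nextSeqNum) timerRunning = true;        // start_timer
        nextSeqNum++;
        return true;
    }

    // Uncorrupted ACK(n): cumulative, so everything up to and including n is acknowledged.
    void onAck(int n) {
        base = Math.max(base, n + 1);                       // guard: never slide backwards
        timerRunning = (base != nextSeqNum);                // stop if nothing outstanding, else restart
    }

    // timeout: resend every packet from base to nextSeqNum - 1.
    void onTimeout() {
        timerRunning = true;
        for (int seq = base; seq < nextSeqNum; seq++) {
            transmitted.add(seq + ":" + buffer.get(seq - 1));
        }
    }
}
```

A single timeout resends the whole outstanding window, which is the cost of keeping only one timer and no per-packet state.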

              GBN: receiver extended FSM

              initially: expectedseqnum = 1; sndpkt = make_pkt(0, ACK, chksum)

              state "Wait":

              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum) /
                extract(rcvpkt, data)
                deliver_data(data)
                sndpkt = make_pkt(expectedseqnum, ACK, chksum)
                udt_send(sndpkt)
                expectedseqnum++

              default / udt_send(sndpkt)

              If expected packet received: send ACK and deliver packet upstairs

              If out-of-order packet received:
                discard (don't buffer) -> no receiver buffering!
                Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

              More on receiver

              The receiver always sends ACK for last correctly received packet with highest in-order seq #
                Receiver only sends ACKs (no NAKs)
                Can generate duplicate ACKs
                need only remember expectedseqnum
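
Because the receiver only needs to remember `expectedseqnum`, it can be sketched in a few lines. As before this is a simplified illustration: corruption is passed in as a flag, data is a string, and the class name `GbnReceiver` is hypothetical.

```java
final class GbnReceiver {
    private int expectedSeqNum = 1;
    private final StringBuilder delivered = new StringBuilder();

    // Handles one arriving packet and returns the seq # carried in the ACK sent back.
    int onPacket(int seq, String data, boolean corrupt) {
        if (!corrupt && seq == expectedSeqNum) {
            delivered.append(data);                 // deliver_data
            expectedSeqNum++;
        }
        // Expected packet: ACK it. Anything else: re-ACK the last in-order packet.
        return expectedSeqNum - 1;
    }

    String delivered() {
        return delivered.toString();
    }
}
```

Out-of-order packets are simply dropped, so a gap produces a run of duplicate ACKs for the last in-order packet until the sender times out and resends.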

              GBN in action

              [Figure]

              GBN is easy to code but might have performance problems

              In particular, if many packets are in pipeline at one time (bandwidth-delay product large) then one error can force retransmission of huge amounts of data!

              Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets

              Selective Repeat

              receiver individually acknowledges all correctly received pkts
                buffers pkts, as needed, for eventual in-order delivery to upper layer

              sender only resends pkts for which ACK not received
                sender timer for each unACKed pkt
                Compare to GBN, which only had timer for base packet

              sender window
                N consecutive seq #'s
                again limits seq #s of sent, unACKed pkts
                Important: window size < seq # range

              Selective repeat: sender, receiver windows

              [Figure]

              Selective repeat

              sender

              data from above:
                if next available seq # in window, send pkt

              timeout(n):
                resend pkt n, restart timer

              ACK(n) in [sendbase, sendbase+N-1]:
                mark pkt n as received
                if n smallest unACKed pkt, advance window base to next unACKed seq #

              receiver

              pkt n in [rcvbase, rcvbase+N-1]:
                send ACK(n)
                out-of-order: buffer
                in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

              pkt n in [rcvbase-N, rcvbase-1]:
                ACK(n) (note: this is a reACK)

              otherwise:
                ignore
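
The receiver rules can be sketched as follows. This is a simplified illustration: sequence numbers are unbounded integers (no wrap-around), packets are assumed to arrive uncorrupted, and the names `SrReceiver`, `rcvBase` and `delivered` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SrReceiver {
    private final int windowSize;                           // N
    private int rcvBase = 0;                                // smallest seq # not yet delivered
    private final Map<Integer, String> buffered = new HashMap<>();
    final List<String> delivered = new ArrayList<>();

    SrReceiver(int windowSize) {
        this.windowSize = windowSize;
    }

    // Handles packet n and returns the seq # ACKed, or -1 when the packet is ignored.
    int onPacket(int n, String data) {
        if (n >= rcvBase && n <= rcvBase + windowSize - 1) {
            buffered.put(n, data);                          // buffer (covers the in-order case too)
            while (buffered.containsKey(rcvBase)) {         // deliver any in-order run
                delivered.add(buffered.remove(rcvBase));
                rcvBase++;                                  // advance window
            }
            return n;
        }
        if (n >= rcvBase - windowSize && n <= rcvBase - 1) {
            return n;                                       // already delivered: re-ACK
        }
        return -1;                                          // otherwise: ignore
    }
}
```

The re-ACK branch matters: if the receiver stayed silent about packets below its window, a sender whose ACK was lost could never advance.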

              Selective repeat in action

              [Figure]

              Selective repeat: dilemma

              Example:
                seq #'s: 0, 1, 2, 3
                window size = 3

              receiver sees no difference in two scenarios!
              incorrectly passes duplicate data as new in (a)

              Q: what is relationship between seq # size and window size?
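
One way to explore the question is to check the worst case directly: the receiver has accepted a full window of N packets and moved on, while every ACK was lost, so the sender retransmits the old window. The sketch below tests whether any retransmitted sequence number collides with one the receiver now treats as new. The class and method names are hypothetical; the conclusion it points to is the standard one for selective repeat, that the window may be at most half the sequence number space.

```java
final class SrWindowCheck {
    // True when a retransmitted old packet can be mistaken for a new one.
    // Old window: packets 0..N-1. Receiver's new window: packets N..2N-1.
    // Both are compared after reduction modulo the sequence number space.
    static boolean ambiguous(int seqSpace, int windowSize) {
        for (int old = 0; old < windowSize; old++) {
            for (int fresh = windowSize; fresh < 2 * windowSize; fresh++) {
                if (fresh % seqSpace == old % seqSpace) return true;
            }
        }
        return false;
    }
}
```

For the example on the slide, `ambiguous(4, 3)` reports a collision, while a window of 2 with the same four sequence numbers does not collide.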


              TCP: Overview    RFCs: 793, 1122, 1323, 2018, 2581

              point-to-point:
                one sender, one receiver

              reliable, in-order byte stream:
                no "message boundaries"

              pipelined:
                TCP congestion and flow control set window size

              send & receive buffers

              full duplex data:
                bi-directional data flow in same connection
                MSS: maximum segment size

              connection-oriented:
                handshaking (exchange of control msgs) init's sender, receiver state before data exchange

              flow controlled:
                sender will not overwhelm receiver

              [Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

              More TCP Details

              Maximum Segment Size (MSS)
                Depends upon implementation (can often be set)
                The max amount of application-layer data in segment
                Application Data + TCP Header = TCP Segment

              Three-way Handshake
                Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
                Server responds with second special TCP segment (again no payload)
                Client responds with third special segment
                  This can contain payload

              Even More TCP Details

              A TCP connection between client and server creates, in both client and server:
                (i) buffers
                (ii) variables and
                (iii) a socket connection to process

              TCP only exists in the two end machines
                No buffers and variables allocated to the connection in any of the network elements between the host and server

              TCP segment structure

              [Figure: TCP segment, 32 bits wide]
                source port #, dest port #
                sequence number
                acknowledgement number
                head len, not used, flag bits U A P R S F, Receive window
                checksum, Urg data pnter
                Options (variable length)
                application data (variable length)

              Annotations:
                URG: urgent data (generally not used)
                ACK: ACK # valid
                PSH: push data now (generally not used)
                RST, SYN, FIN: connection estab (setup, teardown commands)
                Receive window: # bytes rcvr willing to accept
                checksum: Internet checksum (as in UDP)
                sequence number, acknowledgement number: counting by bytes of data (not segments!)

              TCP seq. #'s and ACKs

              Seq. #'s:
                byte stream "number" of first byte in segment's data

              ACKs:
                seq # of next byte expected from other side
                cumulative ACK

              Q: how receiver handles out-of-order segments
                A: TCP spec doesn't say - up to implementer

              [Figure: simple telnet scenario between Host A and Host B, time running downward]
                Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
                Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
                Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
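
The numbers in the telnet scenario follow from one rule: the ACK number is the sequence number of the segment being acknowledged plus the number of data bytes it carried. A small sketch that reproduces them; the class and method names are hypothetical:

```java
final class SeqAckExample {
    // ACK number = sequence number of the next byte expected from the other side.
    static int ackFor(int seq, int dataLength) {
        return seq + dataLength;
    }

    public static void main(String[] args) {
        int seqA = 42, seqB = 79;                   // starting values from the telnet scenario
        // A -> B: Seq=42, ACK=79, one data byte 'C'
        int ackFromB = ackFor(seqA, 1);             // B now expects byte 43 from A
        // B -> A: Seq=79, ACK=43, echoes one data byte 'C'
        int ackFromA = ackFor(seqB, 1);             // A now expects byte 80 from B
        // A -> B: Seq=43, ACK=80, no data
        System.out.println("B ACKs " + ackFromB + ", A ACKs " + ackFromA);
    }
}
```

Because counting is by bytes, a segment carrying 1000 data bytes starting at sequence number 42 would be acknowledged with 1042, not 43.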

              TCP Round Trip Time and Timeout

              Q: how to estimate RTT?
                SampleRTT: measured time from segment transmission until ACK receipt
                  ignore retransmissions
                SampleRTT will vary, want estimated RTT "smoother"
                  average several recent measurements, not just current SampleRTT

              Q: how to set TCP timeout value?
                longer than RTT
                  but RTT varies
                too short: premature timeout
                  unnecessary retransmissions
                too long: slow reaction to segment loss

              TCP Round Trip Time and Timeout

              EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

              Exponential weighted moving average
              influence of past sample decreases exponentially fast
              typical value: α = 0.125

              Example RTT estimation

              [Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT, Estimated RTT]

TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
  • large variation in EstimatedRTT -> larger safety margin
• first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β)·DevRTT + β·|SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4·DevRTT
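The three formulas above can be tried out with a small sketch. This is only an illustration, not an actual TCP implementation: the class name `RttEstimator` is made up, and two details the slides do not state are assumptions borrowed from RFC 6298 (the first sample seeds EstimatedRTT, with DevRTT set to half of it, and DevRTT is updated using the EstimatedRTT value from before the current sample).

```python
class RttEstimator:
    """Hypothetical helper applying the EstimatedRTT / DevRTT / TimeoutInterval formulas."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption (as in RFC 6298): seed with the first sample, DevRTT = sample / 2.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        """Fold one SampleRTT (not taken from a retransmission) into the estimates."""
        # Assumption: DevRTT uses the EstimatedRTT value from before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        """TimeoutInterval = EstimatedRTT + 4 * DevRTT."""
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
est.update(200.0)                        # one unusually slow sample
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval())
```

A single slow sample moves EstimatedRTT only a little (weight α), but it widens the safety margin noticeably through DevRTT.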


TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative ACKs
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  • timeout events
  • duplicate ACKs
• Initially consider simplified TCP sender:
  • ignore duplicate ACKs
  • ignore flow control, congestion control

TCP sender events

data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unACKed segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

ACK rcvd:
• If it acknowledges previously unACKed segments:
  • update what is known to be ACKed
  • start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ACKed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
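The simplified sender can be mimicked in a few lines of Python to check how SendBase and NextSeqNum evolve. This is a sketch under assumptions: the class name `SimplifiedSender`, the `sent` log and the boolean `timer_running` are illustrative stand-ins (there is no real timer or network), and the three methods are called by hand in place of the event loop.

```python
class SimplifiedSender:
    """Hypothetical model of the simplified TCP sender (no dup ACKs, no flow/congestion control)."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq      # NextSeqNum
        self.send_base = initial_seq     # SendBase
        self.unacked = {}                # seq number -> payload still awaiting an ACK
        self.timer_running = False       # stands in for the single retransmission timer
        self.sent = []                   # log of (seq, length) handed to IP

    def on_app_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True    # start timer
        self.sent.append((seq, len(data)))   # pass segment to IP
        self.next_seq += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)      # not-yet-acknowledged segment with smallest seq
            self.sent.append((seq, len(self.unacked[seq])))
            self.timer_running = True    # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment that lies entirely below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


s = SimplifiedSender(initial_seq=92)
s.on_app_data(b"a" * 8)    # Seq=92, 8 bytes
s.on_app_data(b"b" * 20)   # Seq=100, 20 bytes
s.on_ack(120)              # ACK=100 lost, cumulative ACK=120 arrives
print(s.send_base, s.unacked, s.timer_running)
```

The usage lines replay a case where the first ACK is lost but a later cumulative ACK covers both segments, so nothing needs to be retransmitted.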

TCP retransmission scenarios

Figure (lost ACK scenario): Host A sends Seq=92, 8 bytes data. Host B replies ACK=100, which is lost (X). A's timer for Seq=92 expires and A resends Seq=92, 8 bytes data. B replies ACK=100 again. SendBase = 100.

Figure (premature timeout): Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B replies ACK=100 and ACK=120. A's timer for Seq=92 expires before the ACKs arrive, so A resends Seq=92, 8 bytes data, and B replies ACK=120. SendBase advances from 100 to 120.

TCP retransmission scenarios (more)

Figure (cumulative ACK scenario): Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted. SendBase = 120.

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap.
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.

More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If timeout occurs then, after retransmission, Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When timer is restarted because of (i) new data from above or (ii) ACK received, Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of congestion control

Fast Retransmit

• Time-out period often relatively long:
  • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
  • Sender often sends many segments back-to-back
  • If segment is lost, there will likely be many duplicate ACKs
• If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  • fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
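The duplicate-ACK rule can be checked with a short sketch. It only counts ACKs and reports which sequence numbers would be resent; the function name `process_acks` and the idea of feeding it a plain list of ACK values are assumptions made for illustration.

```python
def process_acks(send_base, acks):
    """Apply the fast retransmit rule to a list of ACK field values.

    Returns (final SendBase, list of sequence numbers that were fast-retransmitted).
    """
    dup_count = {}          # ACK value y -> number of duplicate ACKs seen for y
    retransmitted = []
    for y in acks:
        if y > send_base:
            send_base = y   # new data acknowledged
        else:
            # a duplicate ACK for an already ACKed segment
            dup_count[y] = dup_count.get(y, 0) + 1
            if dup_count[y] == 3:
                retransmitted.append(y)   # resend segment with sequence number y
    return send_base, retransmitted


# One new ACK for 100, then three duplicates of it, then the hole is filled.
print(process_acks(92, [100, 100, 100, 100, 120]))
```

Note that the first ACK=100 counts as new data; only the three that follow are duplicates, so the retransmission is triggered by the fourth ACK carrying the same value.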

TCP: GBN or Selective Repeat?

              Basic TCP looks a lot like GBN

              Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

              This looks a lot like Selective Repeat

              TCP is a hybrid


TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down transmission rate to accommodate receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

Figure: TCP segment layout, 32 bits wide.
• source port #, dest port #
• sequence number, acknowledgement number: counting by bytes of data (not segments)
• head len, not used, flag bits U A P R S F, receive window (# bytes rcvr willing to accept)
• checksum (Internet checksum, as in UDP), urg data pointer
• options (variable length)
• application data (variable length)

Flag bits:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn't overflow
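As a worked example of the two rules above, the following sketch computes the advertised window on the receiver side and applies the sender-side limit. The function names and the byte counts in the usage lines are made up for illustration; the sender-side variables LastByteSent and LastByteAcked follow the notation used later for the congestion window.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """Sender side: unACKed data, including the new bytes, must fit in RcvWindow."""
    return (last_byte_sent + nbytes) - last_byte_acked <= advertised_window


# 4096-byte buffer, 3000 bytes received so far, application has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                         # spare room in bytes
print(sender_may_send(5000, 4000, window, 1000))      # 2000 bytes in flight fits
print(sender_may_send(5000, 4000, window, 1200))      # 2200 bytes would overflow
```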

Technical Issue

• Suppose RcvWindow = 0 and that receiver has already ACKed ALL packets in buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0 since it has no new ACKs to send to sender
• DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver (B). At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender.

Note on UDP

UDP has no flow control.

UDP appends packets to receiving socket's buffer. If buffer is full then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
• initialize TCP variables:
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
  Socket clientSocket = new Socket("hostname", "port number");
• server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
• allocates buffers
• can include application data
• SYN=0 signals that connection is established
• server_isn+1 signals that the client is synchronized

Figure: three-way handshake between client and server.
• client -> server: connection request (SYN=1, seq=client_isn)
• server -> client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
• client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
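The sequence and acknowledgement numbers exchanged in the three steps can be written out as a tiny sketch. Nothing is sent on a network here; the function name `three_way_handshake` and the dictionary layout are assumptions chosen for illustration, and the ISN values in the usage line are arbitrary.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three control segments of the handshake as plain dictionaries."""
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1.
    synack = {"from": "server", "SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1.
    ack = {"from": "client", "SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```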

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

Figure: connection teardown. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait before it is closed.

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
• Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
• Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

Figure: connection teardown with states. Client sends FIN (closing); server replies with ACK, then sends FIN (closing); client replies with ACK and enters timed wait; server is closed on receiving the ACK; client is closed after the timed wait.

TCP Connection Management (cont)

Figure: example TCP client lifecycle.

Figure: example TCP server lifecycle.


              A few special cases

              Have not discussed what happens if both client and server decide to close down connection at same time

              It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control
• manifestations:
  • lost packets (buffer overflow at routers)
  • long delays (queuing in router buffers)
• a top-10 problem

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

Notation: λ_in is the original sending rate, λ'_in the offered load including retransmissions, λ_out the delivered rate.

• (a), (b) & (c): always λ_in = λ_out (goodput)
• (a) Magic transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: λ'_in > λ_out
• (c) retransmission of delayed (not lost) packet makes λ'_in larger (than perfect case) for same λ_out

"costs" of congestion:
• (b) and (c): more work (retrans) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

Figure: three panels, (a), (b) and (c).

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λ_in and λ'_in (the offered load including retransmissions) increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when packet dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded":
  • sender should use available bandwidth
• if sender's path congested:
  • sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next page

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI = 1, destination sets CI bit = 1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion
• w segments, each with MSS bytes, sent in one RTT:

throughput = (w · MSS) / RTT bytes/sec

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative after timeout events

TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) against time for a long-lived TCP connection.

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes and RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin by 1 MSS for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

Figure: Host A sends one segment to Host B; one RTT later it sends two segments, then four segments.

So far
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

• The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

• Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

• TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


              The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unACKed data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unACKed data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS · (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being ACKed
Commentary: CongWin and Threshold not changed
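The event table translates almost line by line into a small state machine. The sketch below is an illustration under assumptions, not real TCP code: the class name `RenoSender`, the default values and the choice to reset the duplicate-ACK counter on every new ACK are not taken from the slides, and windows are kept as byte counts.

```python
class RenoSender:
    """Hypothetical model of the TCP Reno sender states from the event table."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.cong_win = mss          # CongWin starts at 1 MSS
        self.threshold = threshold
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unACKed data."""
        self.dup_acks = 0            # assumption: counter restarts on new data
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)   # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win)       # left slow start once CongWin exceeded Threshold
for _ in range(3):
    s.on_dup_ack()
print(s.state, s.cong_win, s.threshold)
```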

TCP throughput

• What's the average throughput of TCP as a function of window size and RTT?
  • Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2·RTT)
• Average throughput: 0.75 W/RTT

TCP Futures

• Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This gives L = 2·10^-10. Wow
• New versions of TCP for high-speed needed
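The numbers in this example, and the 0.75 W/RTT average from the throughput slide, can be reproduced with a few lines of arithmetic. The function names are made up for this sketch; rates are in bits per second, MSS in bytes and RTT in seconds.

```python
def average_throughput_bps(w_segments, mss_bytes, rtt_s):
    """Average throughput 0.75 * W / RTT, with W in segments of MSS bytes."""
    return 0.75 * w_segments * mss_bytes * 8 / rtt_s


def required_window_segments(target_bps, mss_bytes, rtt_s):
    """Window W (in segments) such that W * MSS / RTT equals the target rate."""
    return target_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(target_bps, mss_bytes, rtt_s):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * target_bps)) ** 2


w = required_window_segments(target_bps=10e9, mss_bytes=1500, rtt_s=0.1)
loss = required_loss_rate(target_bps=10e9, mss_bytes=1500, rtt_s=0.1)
print(int(w))      # in-flight segments needed
print(loss)        # tolerable loss rate, on the order of 2e-10
```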

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally

Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".
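The convergence argument can be checked numerically with a toy model. This is a simplification and an assumption, not how real TCP flows interact: both sessions are taken to see every loss at the same instant, each round adds 1 unit to both rates, and whenever the combined rate exceeds the capacity R both rates are halved. The function name `aimd_two_flows` is illustrative.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Run synchronized AIMD for two flows and return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # loss: both windows decrease by a factor of 2
            x1, x2 = x1 / 2, x2 / 2
        else:
            # congestion avoidance: additive increase for both
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = aimd_two_flows(x1=1.0, x2=80.0, capacity=100.0, rounds=1000)
print(a, b, abs(a - b))
```

Additive increase leaves the gap between the two rates unchanged, while each multiplicative decrease halves it, so the rates drift toward the equal-share line even from a very unequal start.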

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections:
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets R/2

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. W·S/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. W·S/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - W·S/R]

Fixed congestion window (1)

First case: W·S/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: W·S/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - W·S/R]
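Both cases fit in one small function. One assumption is needed, since these slides do not define K for the fixed-window case: here K is taken to be the number of windows that cover the object, K = ceil(O / (W·S)). The function name and the sample values are illustrative.

```python
import math


def fixed_window_latency(obj_bits, seg_bits, rate_bps, window, rtt_s):
    """Latency for a fixed congestion window of `window` segments.

    obj_bits = O, seg_bits = S, rate_bps = R, window = W.
    """
    # Assumption: K = number of windows that cover the object.
    k = math.ceil(obj_bits / (window * seg_bits))
    # Time the server waits for an ACK after sending one full window.
    stall = seg_bits / rate_bps + rtt_s - window * seg_bits / rate_bps
    latency = 2 * rtt_s + obj_bits / rate_bps
    if stall > 0:
        # Second case: the server idles after each of the first K-1 windows.
        latency += (k - 1) * stall
    return latency


# O = 100 kbit, S = 10 kbit, R = 1 Mbps, RTT = 100 ms
print(fixed_window_latency(100_000, 10_000, 1e6, window=20, rtt_s=0.1))  # first case
print(fixed_window_latency(100_000, 10_000, 1e6, window=2, rtt_s=0.1))   # second case
```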

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

• where Q is the number of times the server idles if the object were of infinite size
• and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.

TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.

TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.
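The slow-start latency formula can be evaluated directly once K, Q and P are known. In the sketch below K is found by searching for the smallest k with (2^k - 1)·S ≥ O, and Q is obtained by counting how many windows k have a positive idle time S/R + RTT - 2^(k-1)·S/R, which is one way of reading "number of idles for infinite-size object". The function name and the sample link parameters are assumptions chosen so that the result matches the O/S = 15 example.

```python
def slow_start_latency(obj_bits, seg_bits, rate_bps, rtt_s):
    """Return (K, Q, P, latency) for the slow-start latency model.

    obj_bits = O, seg_bits = S, rate_bps = R.
    """
    seg_time = seg_bits / rate_bps                 # S/R

    # K: smallest k such that windows of 1, 2, ..., 2^(k-1) segments cover the object.
    k_windows = 1
    while (2 ** k_windows - 1) * seg_bits < obj_bits:
        k_windows += 1

    # Q: number of windows after which the server would idle for an infinite object,
    # i.e. the number of k >= 1 with S/R + RTT - 2^(k-1) * S/R > 0.
    q_idles = 0
    while seg_time + rtt_s - (2 ** q_idles) * seg_time > 0:
        q_idles += 1

    p_idles = min(q_idles, k_windows - 1)
    latency = (2 * rtt_s + obj_bits / rate_bps
               + p_idles * (rtt_s + seg_time)
               - (2 ** p_idles - 1) * seg_time)
    return k_windows, q_idles, p_idles, latency


# O/S = 15 segments of 1000 bits on a 1 Mbps link with RTT = 2 ms
print(slow_start_latency(obj_bits=15_000, seg_bits=1_000, rate_bps=1e6, rtt_s=0.002))
```

With these values the sketch reports K = 4, Q = 2 and P = 2, the same counts as in the example with O/S = 15.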

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
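The three response-time formulas are easy to compare side by side. In this sketch the slow-start idle times are not modeled; they enter only as an optional `idle_s` argument that defaults to zero, which is a simplifying assumption. The function names and the 1 Mbps link in the usage lines are illustrative.

```python
def nonpersistent_time(m_images, obj_bits, rate_bps, rtt_s, idle_s=0.0):
    """(M+1)O/R + (M+1)2RTT + sum of idle times."""
    return (m_images + 1) * obj_bits / rate_bps + (m_images + 1) * 2 * rtt_s + idle_s


def persistent_time(m_images, obj_bits, rate_bps, rtt_s, idle_s=0.0):
    """(M+1)O/R + 3RTT + sum of idle times."""
    return (m_images + 1) * obj_bits / rate_bps + 3 * rtt_s + idle_s


def parallel_nonpersistent_time(m_images, x_parallel, obj_bits, rate_bps, rtt_s, idle_s=0.0):
    """(M+1)O/R + (M/X + 1)2RTT + sum of idle times, assuming M/X is an integer."""
    return ((m_images + 1) * obj_bits / rate_bps
            + (m_images / x_parallel + 1) * 2 * rtt_s + idle_s)


# M = 10 images, X = 5, O = 5 Kbytes = 40,000 bits, R = 1 Mbps, RTT = 100 msec
args = dict(obj_bits=40_000, rate_bps=1e6, rtt_s=0.1)
print(nonpersistent_time(10, **args))
print(persistent_time(10, **args))
print(parallel_nonpersistent_time(10, 5, **args))
```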

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

Figure: bar chart of response time (axis from 0 to 20 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps.

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections only give minor improvement over parallel connections.

              HTTP Response time (in seconds)

              RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

              [Figure: bar chart of response time in seconds (y-axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

              For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.

              Chapter 3: Summary

              principles behind transport layer services:
              multiplexing, demultiplexing
              reliable data transfer
              flow control
              congestion control

              instantiation and implementation in the Internet:
              UDP
              TCP

              Next:
              leaving the network "edge" (application, transport layers)
              into the network "core"


                Multiplexing/demultiplexing

                segment: unit of data exchanged between transport layer entities
                aka TPDU: transport protocol data unit

                Demultiplexing: delivering received segments to correct app layer processes

                [Figure: three hosts, one marked "receiver", each with application, transport and network layers. Processes P1 to P4 exchange messages M; each message travels in a segment with headers Ht and Hn. Labels: segment header, application-layer data.]

                How demultiplexing works

                host receives IP datagrams
                each datagram has source IP address, destination IP address
                each datagram carries 1 transport-layer segment
                each segment has source, destination port number (recall: well-known port numbers for specific applications)

                host uses IP addresses & port numbers to direct segment to appropriate socket

                [Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #, other header fields, application data (message).]

                Connectionless demultiplexing

                Create sockets with port numbers (a port number must lie in the range 0 to 65535):
                DatagramSocket mySocket1 = new DatagramSocket(9911);
                DatagramSocket mySocket2 = new DatagramSocket(9922);

                UDP socket identified by two-tuple:
                (dest IP address, dest port number)

                When host receives UDP segment:
                checks destination port number in segment
                directs UDP segment to socket with that port number

                IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket

                Connectionless demux (cont)

                DatagramSocket serverSocket = new DatagramSocket(6428);

                [Figure: clients at IP A and IP B exchange UDP segments with a server at IP C. Client-to-server segments carry SP 9157, DP 6428 and SP 5775, DP 6428; the replies carry SP 6428, DP 9157 and SP 6428, DP 5775.]

                SP provides "return address"

                Connection-oriented demux

                TCP socket identified by 4-tuple:
                source IP address
                source port number
                dest IP address
                dest port number

                recv host uses all four values to direct segment to appropriate socket

                Server host may support many simultaneous TCP sockets:
                each socket identified by its own 4-tuple

                Web servers have different sockets for each connecting client
                non-persistent HTTP will have different socket for each request
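The difference between the two lookups can be sketched with two in-memory tables. This is only an illustrative model with hypothetical names; a real host performs the lookup inside the operating system, and socket names here are plain strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of socket lookup; not an actual OS or java.net API.
class DemuxTables {
    // UDP: keyed by the two-tuple (dest IP, dest port).
    private final Map<String, String> udpSockets = new HashMap<>();
    // TCP: keyed by the 4-tuple (source IP, source port, dest IP, dest port).
    private final Map<String, String> tcpSockets = new HashMap<>();

    private static String twoTuple(String dstIp, int dstPort) {
        return dstIp + ":" + dstPort;
    }

    private static String fourTuple(String srcIp, int srcPort, String dstIp, int dstPort) {
        return srcIp + ":" + srcPort + "->" + dstIp + ":" + dstPort;
    }

    void bindUdp(String dstIp, int dstPort, String socketName) {
        udpSockets.put(twoTuple(dstIp, dstPort), socketName);
    }

    void addTcpConnection(String srcIp, int srcPort, String dstIp, int dstPort, String socketName) {
        tcpSockets.put(fourTuple(srcIp, srcPort, dstIp, dstPort), socketName);
    }

    /** The source address is ignored: every sender reaches the same UDP socket. */
    String demuxUdp(String srcIp, int srcPort, String dstIp, int dstPort) {
        return udpSockets.get(twoTuple(dstIp, dstPort));
    }

    /** All four values take part in the lookup. */
    String demuxTcp(String srcIp, int srcPort, String dstIp, int dstPort) {
        return tcpSockets.get(fourTuple(srcIp, srcPort, dstIp, dstPort));
    }

    public static void main(String[] args) {
        DemuxTables t = new DemuxTables();
        t.bindUdp("C", 6428, "udp-server");
        t.addTcpConnection("A", 9157, "C", 80, "tcp-A");
        t.addTcpConnection("B", 9157, "C", 80, "tcp-B");
        System.out.println(t.demuxUdp("A", 9157, "C", 6428));
        System.out.println(t.demuxTcp("B", 9157, "C", 80));
    }
}
```

Two clients that happen to pick the same source port still reach different TCP sockets because their source IP addresses differ, while both would reach the same UDP socket.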

                Connection-oriented demux (cont)

                [Figure: clients at IP A and IP B exchange TCP segments with a Web server at IP C (processes P1, P3, P4). Client-to-server segments carry SP 9157, DP 80 and SP 5775, DP 80; the replies carry SP 80, DP 9157 and SP 80, DP 5775.]

                Connection-oriented demux: Threaded Web Server

                [Figure: clients at IP A and IP B send segments to a threaded Web server at IP C. Segments shown: SP 9157, DP 80, S-IP A, D-IP C; SP 9157, DP 80, S-IP B, D-IP C; SP 5775, DP 80, S-IP B, D-IP C.]

                Chapter 3 outline

                3.1 Transport-layer services
                3.2 Multiplexing and demultiplexing
                3.3 Connectionless transport: UDP
                3.4 Principles of reliable data transfer
                3.5 Connection-oriented transport: TCP
                  segment structure
                  reliable data transfer
                  flow control
                  connection management
                3.6 Principles of congestion control
                3.7 TCP congestion control

                UDP: User Datagram Protocol [RFC 768]

                "no frills," "bare bones" Internet transport protocol
                "best effort" service: UDP segments may be
                lost
                delivered out of order to app

                connectionless:
                no handshaking between UDP sender, receiver
                each UDP segment handled independently of others

                Why is there a UDP?
                no connection establishment (which can add delay)
                simple: no connection state at sender, receiver
                small segment header (8 bytes)
                no congestion control: UDP can blast away as fast as desired

                UDP: more

                often used for streaming multimedia apps
                loss tolerant
                rate sensitive

                other UDP uses (why?):
                DNS: small delay
                SNMP: stressful conditions

                reliable transfer over UDP: add reliability at application layer
                application-specific error recovery!

                [Figure: UDP segment format, 32 bits wide: source port #, dest port #, length, checksum, application data (message). Length is in bytes of the UDP segment, including header.]

                UDP checksum

                Goal: detect "errors" (e.g., flipped bits) in transmitted segment

                Sender:
                treat segment contents as sequence of 16-bit integers
                checksum: 1's complement of the 1's complement sum of segment contents
                sender puts checksum value into UDP checksum field

                Receiver:
                compute checksum of received segment
                check if computed checksum equals checksum field value:
                NO: error detected
                YES: no error detected. But maybe errors nonetheless? More later ...

                Receiver may choose to discard segment or send a warning to app in case of error
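A minimal sketch of the sender and receiver computations follows. It assumes the checksum is taken over the given bytes only (the pseudo-header that real UDP also covers is left out), that 16-bit words are big-endian, and that an odd-length input is padded with a zero byte; the class and method names are hypothetical.

```java
// Hypothetical helper; covers only the bytes passed in, not the UDP pseudo-header.
class InternetChecksum {
    /** One's complement of the one's complement sum of the data, read as 16-bit big-endian words. */
    static int checksum(byte[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i += 2) {
            int hi = data[i] & 0xFF;
            int lo = (i + 1 < data.length) ? (data[i + 1] & 0xFF) : 0; // zero-pad odd length
            sum += (hi << 8) | lo;
        }
        // Fold any carry out of the low 16 bits back in (end-around carry).
        while ((sum >> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >> 16);
        }
        return (int) (~sum & 0xFFFF);
    }

    /** Receiver side: recompute and compare with the value carried in the checksum field. */
    static boolean verify(byte[] data, int checksumField) {
        return checksum(data) == checksumField;
    }

    public static void main(String[] args) {
        byte[] segment = {0x00, 0x01, (byte) 0xF2, 0x03, (byte) 0xF4, (byte) 0xF5, (byte) 0xF6, (byte) 0xF7};
        int field = checksum(segment);
        System.out.printf("checksum = 0x%04X%n", field);
        segment[0] ^= 0x01; // flip one bit in transit
        System.out.println("still valid? " + verify(segment, field));
    }
}
```

A single flipped bit always changes the sum, but two flips that cancel each other can go unnoticed, which is the "maybe errors nonetheless" case.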

                Chapter 3 outline

                3.1 Transport-layer services
                3.2 Multiplexing and demultiplexing
                3.3 Connectionless transport: UDP
                3.4 Principles of reliable data transfer
                3.5 Connection-oriented transport: TCP
                  segment structure
                  reliable data transfer
                  flow control
                  connection management
                3.6 Principles of congestion control
                3.7 TCP congestion control

                Principles of Reliable data transfer

                important in app., transport, link layers
                top-10 list of important networking topics!

                characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

                Reliable data transfer: getting started

                send side:
                rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
                udt_send(): called by rdt to transfer packet over unreliable channel to receiver

                receive side:
                rdt_rcv(): called when packet arrives on rcv-side of channel
                deliver_data(): called by rdt to deliver data to upper layer

                Reliable data transfer: getting started

                We'll:
                incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
                consider only unidirectional data transfer
                but control info will flow in both directions!
                use finite state machines (FSM) to specify sender, receiver

                state: when in this "state", next state is uniquely determined by next event

                [Figure: FSM notation. An arrow from state 1 to state 2 is labeled with the event causing the state transition and, below it, the actions taken on the state transition.]

                Incremental Improvements

                rdt1.0: assumes every packet sent arrives, and no errors are introduced in transmission
                rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK
                rdt2.1: deals with corrupted ACKs/NAKs
                rdt2.2: like rdt2.1 but does not need NAKs
                rdt3.0: allows packets to be lost

                Rdt1.0: reliable transfer over a reliable channel

                underlying channel perfectly reliable
                no bit errors
                no loss of packets

                separate FSMs for sender, receiver:
                sender sends data into underlying channel
                receiver reads data from underlying channel

                sender
                state "Wait for call from above":
                  event: rdt_send(data)
                  actions: packet = make_pkt(data); udt_send(packet)

                receiver
                state "Wait for call from below":
                  event: rdt_rcv(packet)
                  actions: extract(packet, data); deliver_data(data)

                Rdt2.0: channel with bit errors

                underlying channel may flip bits in packet
                recall: UDP checksum to detect bit errors

                the question: how to recover from errors?
                acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
                negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
                sender retransmits pkt on receipt of NAK
                human scenarios using ACKs, NAKs?

                new mechanisms in rdt2.0 (beyond rdt1.0):
                error detection
                receiver feedback: control msgs (ACK, NAK) rcvr->sender

                rdt2.0: FSM specification

                sender
                state "Wait for call from above":
                  event: rdt_send(data)
                  actions: sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK"
                state "Wait for ACK or NAK":
                  event: rdt_rcv(rcvpkt) && isNAK(rcvpkt)
                  actions: udt_send(sndpkt)
                  event: rdt_rcv(rcvpkt) && isACK(rcvpkt)
                  actions: Λ (none); go to "Wait for call from above"

                receiver
                state "Wait for call from below":
                  event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                  actions: udt_send(NAK)
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
                  actions: extract(rcvpkt, data); deliver_data(data); udt_send(ACK)

                rdt2.0: operation with no errors

                [Figure: the rdt2.0 sender and receiver FSMs from the specification slide, tracing the error-free path: rdt_send(data) at the sender, rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) at the receiver, then rdt_rcv(rcvpkt) && isACK(rcvpkt) at the sender.]

                rdt2.0: error scenario

                [Figure: the same FSMs, tracing the error path: rdt_rcv(rcvpkt) && corrupt(rcvpkt) at the receiver triggers udt_send(NAK), and rdt_rcv(rcvpkt) && isNAK(rcvpkt) at the sender triggers udt_send(sndpkt).]

                rdt2.0 has a fatal flaw!

                What happens if ACK/NAK corrupted?
                sender doesn't know what happened at receiver!
                can't just retransmit: possible duplicate. But receiver waiting!

                What to do?
                sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
                retransmit, but this might cause retransmission of correctly received pkt! Receiver won't know about duplication!

                Handling duplicates:
                sender adds sequence number (0,1) to each pkt
                sender retransmits current pkt if ACK/NAK garbled
                receiver discards (doesn't deliver up) duplicate pkt
                Duplicate packet is one with same sequence # as previous packet

                stop and wait:
                Sender sends one packet, then waits for receiver response

                Sender: whenever sender receives a control message it sends a packet to receiver
                A valid ACK: sends next packet (if one exists) with new sequence #
                A NAK or corrupt response: resends old packet

                Receiver: sends ACK/NAK to sender
                If received packet is corrupt: send NAK
                If received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
                If received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmitted duplicate: send ACK

                Note: ACK/NAK do not contain sequence #
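The receiver rules listed above can be sketched as a tiny state machine. The packet is reduced to a sequence bit, a corruption flag and a payload string, which is an assumption made to keep the example short; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the receiver rules above; names are illustrative only.
class Rdt21Receiver {
    private int expectedSeq = 0;                         // alternates between 0 and 1
    private final List<String> delivered = new ArrayList<>();

    /** Processes one arriving packet and returns the control message sent back. */
    String onPacket(int seq, boolean corrupt, String data) {
        if (corrupt) {
            return "NAK";                                // corrupt packet: ask for a resend
        }
        if (seq == expectedSeq) {
            delivered.add(data);                         // new data: deliver up
            expectedSeq = 1 - expectedSeq;               // now expect the other sequence number
        }
        // Valid packet, new or duplicate: ACK it. A duplicate is not delivered again.
        return "ACK";
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        Rdt21Receiver r = new Rdt21Receiver();
        System.out.println(r.onPacket(0, false, "a")); // new packet
        System.out.println(r.onPacket(0, false, "a")); // duplicate after a garbled ACK
        System.out.println(r.onPacket(1, true, "b"));  // corrupted in transit
        System.out.println(r.onPacket(1, false, "b")); // retransmission arrives intact
        System.out.println(r.delivered());
    }
}
```

One bit of sequence number is enough here because stop-and-wait never has more than one unacknowledged packet outstanding.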

                rdt2.1: sender, handles garbled ACK/NAKs

                state "Wait for call 0 from above":
                  event: rdt_send(data)
                  actions: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 0"
                state "Wait for ACK or NAK 0":
                  event: rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
                  actions: udt_send(sndpkt)
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
                  actions: Λ; go to "Wait for call 1 from above"
                state "Wait for call 1 from above":
                  event: rdt_send(data)
                  actions: sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 1"
                state "Wait for ACK or NAK 1":
                  event: rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
                  actions: udt_send(sndpkt)
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
                  actions: Λ; go to "Wait for call 0 from above"

                rdt2.1: receiver, handles garbled ACK/NAKs

                state "Wait for 0 from below":
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
                  actions: extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 1 from below"
                  event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                  actions: sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                  actions: sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
                state "Wait for 1 from below":
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                  actions: extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 0 from below"
                  event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                  actions: sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
                  actions: sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

                rdt2.1: discussion

                Sender:
                seq # added to pkt
                two seq #'s (0,1) will suffice. Why?
                must check if received ACK/NAK corrupted
                twice as many states
                state must "remember" whether "current" pkt has 0 or 1 seq #

                Receiver:
                must check if received packet is duplicate
                state indicates whether 0 or 1 is expected pkt seq #
                note: receiver cannot know if its last ACK/NAK was received OK at sender

                rdt2.2: a NAK-free protocol

                same functionality as rdt2.1, using ACKs only
                instead of NAK, receiver sends ACK for last pkt received OK
                receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s are included in data packets but not in ACKs/NAKs)
                duplicate ACK at sender results in same action as NAK: retransmit current pkt

                rdt2.2: sender, receiver fragments

                sender FSM fragment
                state "Wait for call 0 from above":
                  event: rdt_send(data)
                  actions: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK 0"
                state "Wait for ACK 0":
                  event: rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
                  actions: udt_send(sndpkt)
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
                  actions: Λ; leave "Wait for ACK 0"

                receiver FSM fragment
                transition into "Wait for 0 from below":
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                  actions: extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
                state "Wait for 0 from below":
                  event: rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
                  actions: udt_send(sndpkt)

                rdt3.0: channels with errors and loss

                New assumption: underlying channel can also lose packets (data or ACKs)
                checksum, seq #, ACKs, retransmissions will be of help, but not enough

                Q: how to deal with loss?
                sender waits until certain data or ACK lost, then retransmits
                yuck: drawbacks?

                Approach: sender waits "reasonable" amount of time for ACK
                retransmits if no ACK received in this time (retransmissions are only triggered by timeouts)
                if pkt (or ACK) just delayed (not lost):
                retransmission will be duplicate, but use of seq #'s already handles this
                receiver must specify seq # of pkt being ACKed
                requires countdown timer

                rdt3.0 sender

                state "Wait for call 0 from above":
                  event: rdt_send(data)
                  actions: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK0"
                  event: rdt_rcv(rcvpkt)
                  actions: Λ
                state "Wait for ACK0":
                  event: rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
                  actions: Λ
                  event: timeout
                  actions: udt_send(sndpkt); start_timer
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
                  actions: stop_timer; go to "Wait for call 1 from above"
                state "Wait for call 1 from above":
                  event: rdt_send(data)
                  actions: sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK1"
                  event: rdt_rcv(rcvpkt)
                  actions: Λ
                state "Wait for ACK1":
                  event: rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
                  actions: Λ
                  event: timeout
                  actions: udt_send(sndpkt); start_timer
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
                  actions: stop_timer; go to "Wait for call 0 from above"

                rdt3.0 in action

                rdt3.0 in action

                Performance of rdt3.0

                rdt3.0 works, but performance stinks
                example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

                $$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \text{ kb/pkt}}{10^9 \text{ b/sec}} = 8 \text{ microsec}$$

                $$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

                U_sender: utilization, the fraction of time the sender is busy sending
                1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
                network protocol limits use of physical resources!

                rdt3.0: stop-and-wait operation

                [Figure: sender/receiver timing diagram. First packet bit transmitted, t = 0; last packet bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R.]

                $$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

                Pipelined protocols

                Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
                range of sequence numbers must be increased
                buffering at sender and/or receiver

                Pipelined protocols

                Advantage: much better bandwidth utilization than stop-and-wait
                Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

                Two generic approaches to solving this:
                • Go-Back-N protocols
                • selective repeat protocols

                Note: TCP is not exactly either

                Pipelining: increased utilization

                [Figure: sender/receiver timing diagram. First packet bit transmitted, t = 0; last bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R.]

                $$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

                Increase utilization by a factor of 3!
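The utilization figures can be reproduced with a few lines of arithmetic. The sketch below assumes the sender keeps N packets in flight per round trip and caps the result at 1; the class and method names are hypothetical.

```java
// Hypothetical helper reproducing the utilization arithmetic; N = 1 is stop-and-wait.
class SenderUtilization {
    /** U = N * (L/R) / (RTT + L/R), capped at 1.0. */
    static double utilization(int windowPackets, double packetBits, double rateBps, double rttSeconds) {
        double transmit = packetBits / rateBps; // L/R in seconds
        return Math.min(1.0, windowPackets * transmit / (rttSeconds + transmit));
    }

    public static void main(String[] args) {
        double l = 8_000;  // 1 KB packet, taken as 8000 bits
        double r = 1e9;    // 1 Gbps link
        double rtt = 0.030; // 2 x 15 ms propagation delay
        System.out.println(utilization(1, l, r, rtt)); // stop-and-wait
        System.out.println(utilization(3, l, r, rtt)); // three packets in flight
    }
}
```

Raising N increases utilization linearly until the pipe is full, at which point the cap applies.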

                Go-Back-N

                Sender:
                k-bit seq # in pkt header
                "window" of up to N consecutive unack'ed pkts allowed

                ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
                may receive duplicate ACKs (see receiver)
                Only one timer, for the oldest unacknowledged pkt
                timeout(n): retransmit pkt n and all higher seq # pkts in window
                Called a sliding-window protocol

                GBN Sender

                rdt_send() called: checks to see if window is full
                No: send out packet
                Yes: return data to application level

                Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

                Timeout: resends ALL packets that have been sent but not yet acknowledged
                This is the only event that triggers a resend
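The three sender events can be sketched as follows. The sketch tracks only sequence numbers, models the timer as a boolean flag, and lets sequence numbers grow without wrapping; these simplifications and all names are assumptions, not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Go-Back-N sender model: sequence numbers only, no real timer or channel.
class GbnSender {
    private final int windowSize;      // N
    private int base = 1;              // oldest unacknowledged sequence number
    private int nextSeqNum = 1;        // next sequence number to use
    private boolean timerRunning = false;

    GbnSender(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Data from above: returns false (refuses the data) when the window is full. */
    boolean send() {
        if (nextSeqNum >= base + windowSize) {
            return false;
        }
        if (base == nextSeqNum) {
            timerRunning = true;       // first packet in flight starts the timer
        }
        nextSeqNum++;
        return true;
    }

    /** Cumulative ACK(n): everything up to and including n is acknowledged. */
    void onAck(int n) {
        if (n < base || n >= nextSeqNum) {
            return;                    // duplicate or out-of-range ACK: window unchanged
        }
        base = n + 1;
        timerRunning = (base != nextSeqNum); // stop if nothing is outstanding, else restart
    }

    /** Timeout: returns the sequence numbers that are retransmitted. */
    List<Integer> onTimeout() {
        List<Integer> resent = new ArrayList<>();
        for (int seq = base; seq < nextSeqNum; seq++) {
            resent.add(seq);
        }
        timerRunning = !resent.isEmpty();
        return resent;
    }

    int base() { return base; }

    boolean timerRunning() { return timerRunning; }

    public static void main(String[] args) {
        GbnSender s = new GbnSender(4);
        for (int i = 0; i < 5; i++) {
            System.out.println("send accepted: " + s.send());
        }
        s.onAck(2);
        System.out.println("resent on timeout: " + s.onTimeout());
    }
}
```

Keeping a single timer makes the sender simple, at the cost of resending packets that may already have arrived.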

                GBN: sender extended FSM

                initially: base = 1; nextseqnum = 1

                state "Wait":
                  event: rdt_send(data)
                  actions:
                    if (nextseqnum < base+N) {
                      sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
                      udt_send(sndpkt[nextseqnum])
                      if (base == nextseqnum)
                        start_timer
                      nextseqnum++
                    }
                    else
                      refuse_data(data)

                  event: timeout
                  actions:
                    start_timer
                    udt_send(sndpkt[base])
                    udt_send(sndpkt[base+1])
                    ...
                    udt_send(sndpkt[nextseqnum-1])

                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
                  actions:
                    base = getacknum(rcvpkt)+1
                    if (base == nextseqnum)
                      stop_timer
                    else
                      start_timer

                  event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                  actions: Λ

                GBN: receiver extended FSM

                initially: expectedseqnum = 1; sndpkt = make_pkt(0, ACK, chksum)

                state "Wait":
                  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
                  actions:
                    extract(rcvpkt, data)
                    deliver_data(data)
                    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
                    udt_send(sndpkt)
                    expectedseqnum++

                  event: default
                  actions: udt_send(sndpkt)

                If expected packet received:
                Send ACK and deliver packet upstairs

                If out-of-order packet received:
                discard (don't buffer) -> no receiver buffering!
                Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

                More on receiver

                The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
                Receiver only sends ACKs (no NAKs)
                Can generate duplicate ACKs
                need only remember expectedseqnum

                GBN in action

                GBN is easy to code but might have performance problems.

                In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data!

                Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets

                3 Transport Layer 51Comp 361 Spring 2005

                3 Transport Layer 52Comp 361 Spring 2005

                Selective Repeat

• receiver individually acknowledges all correctly received pkts
    • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
    • sender timer for each unACKed pkt
    • compare to GBN, which only had a timer for the base packet
• sender window
    • N consecutive seq #'s
    • again limits seq #s of sent, unACKed pkts
    • Important: window size < seq # range

Selective repeat: sender, receiver windows

                Selective repeat

receiver

pkt n in [rcvbase, rcvbase+N-1]
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]
• ACK(n) (note: this is a re-ACK)

otherwise
• ignore

sender

data from above
• if next available seq # is in window, send pkt

timeout(n)
• resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]
• mark pkt n as received
• if n is the smallest unACKed pkt, advance window base to next unACKed seq #
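The receiver side of these rules can be sketched as follows. This is a minimal illustration, assuming sequence numbers that never wrap; `SrReceiver`, `buffer` and `delivered` are hypothetical names chosen for the example.

```python
class SrReceiver:
    """Selective Repeat receiver sketch with window size n (no wrap-around)."""

    def __init__(self, n):
        self.n = n
        self.rcv_base = 0
        self.buffer = {}      # seq -> data for packets waiting on a gap
        self.delivered = []

    def on_packet(self, seq, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcv_base <= seq <= self.rcv_base + self.n - 1:
            self.buffer[seq] = data
            # Deliver the in-order run starting at rcv_base, if any,
            # and slide the window past it.
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1
            return seq
        if self.rcv_base - self.n <= seq <= self.rcv_base - 1:
            return seq        # already delivered: re-ACK
        return None           # otherwise: ignore


rx = SrReceiver(n=4)
print([rx.on_packet(s, f"pkt{s}") for s in (0, 2, 3, 1, 0, 9)])
# [0, 2, 3, 1, 0, None]
print(rx.delivered)  # ['pkt0', 'pkt1', 'pkt2', 'pkt3']
```

Unlike GBN, packets 2 and 3 are buffered while 1 is missing and are delivered as soon as the gap is filled.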

Selective repeat in action

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3

• receiver sees no difference in the two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
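One way to explore the question is to check the worst case directly: the receiver has accepted a full window and moved on, but every ACK was lost, so the sender retransmits the old window. The sketch below (the function name `sr_is_ambiguous` is hypothetical) tests whether those old sequence numbers fall inside the receiver's new window. It reflects the standard result that the window size must be at most half the sequence-number space.

```python
def sr_is_ambiguous(seq_space, window):
    """True if a retransmitted old packet can land in the receiver's new window.

    Worst case: the receiver got packets 0..window-1 and advanced its window,
    but all ACKs were lost, so the sender still resends 0..window-1.
    """
    old_sender_window = {s % seq_space for s in range(0, window)}
    new_receiver_window = {s % seq_space for s in range(window, 2 * window)}
    return bool(old_sender_window & new_receiver_window)


print(sr_is_ambiguous(seq_space=4, window=3))  # True: the example above
print(sr_is_ambiguous(seq_space=4, window=2))  # False
```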


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point: one sender, one receiver
• reliable, in-order byte stream: no "message boundaries"
• pipelined: TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
    • bi-directional data flow in same connection
    • MSS: maximum segment size
• connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled: sender will not overwhelm receiver

(Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data)

More TCP Details

Maximum Segment Size (MSS)
• Depends upon implementation (can often be set)
• The max amount of application-layer data in a segment
• Application Data + TCP Header = TCP Segment

Three-way handshake
• Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment
• Server responds with a second special TCP segment (again, no payload)
• Client responds with a third special segment. This can contain payload

                Even More TCP Details

A TCP connection between client and server creates, in both client and server:
(i) buffers,
(ii) variables, and
(iii) a socket connection to the process

TCP only exists in the two end machines
• No buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

Segment layout (each row is 32 bits wide):
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data

ACKs:
• seq # of next byte expected from other side
• cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (Host A and Host B, time runs downward):
1. User types 'C'. A -> B: Seq=42, ACK=79, data = 'C'
2. Host B ACKs receipt of 'C', echoes back 'C'. B -> A: Seq=79, ACK=43, data = 'C'
3. Host A ACKs receipt of echoed 'C'. A -> B: Seq=43, ACK=80

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
    • ignore retransmissions
• SampleRTT will vary, want estimated RTT "smoother"
    • average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
• longer than RTT
    • but RTT varies
• too short: premature timeout
    • unnecessary retransmissions
• too long: slow reaction to segment loss

                TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

• Exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125

Example RTT estimation

(Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), ticks from 100 to 350; curves: SampleRTT and EstimatedRTT.)

                TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
    • large variation in EstimatedRTT -> larger safety margin
• first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4*DevRTT
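The three formulas can be combined into a small estimator. This is a sketch under assumptions the slides leave open: the initial values (first sample, and half of it for the deviation) and the update order (EstimatedRTT first, then DevRTT against the new estimate) are choices made for this example, and `RttEstimator` is a hypothetical name.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed values are not given in the slides.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        """Fold in one SampleRTT (never one measured on a retransmission)."""
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
est.update(180.0)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval)  # 110.0 55.0 330.0
```

A single slow sample of 180 ms moves the estimate only from 100 to 110 ms, but it widens the safety margin noticeably, which is the intended behaviour.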


                TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer
• Retransmissions are triggered by:
    • timeout events
    • duplicate acks
• Initially consider simplified TCP sender:
    • ignore duplicate acks
    • ignore flow control, congestion control

TCP sender events

data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

Ack rcvd:
• If it acknowledges previously unacked segments
    • update what is known to be acked
    • start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch(event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with
                smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
    } /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
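The same three events can be written as a small Python class. This is a hedged sketch, not an implementation of real TCP: the timer is reduced to a boolean, "pass segment to IP" is modelled by appending to a `sent` log, and the names `SimpleTcpSender`, `unacked` and `sent` are hypothetical.

```python
class SimpleTcpSender:
    """Simplified TCP sender: one timer, cumulative ACKs, no dup-ACK handling."""

    def __init__(self, initial_seq=0):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # seq of first byte -> payload
        self.timer_running = False
        self.sent = []              # log of (seq, payload) handed to IP

    def on_app_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)  # oldest not-yet-acknowledged segment
            self.sent.append((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Drop every segment fully covered by the cumulative ACK.
            covered = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in covered:
                del self.unacked[s]
            self.timer_running = bool(self.unacked)


tx = SimpleTcpSender(initial_seq=92)
tx.on_app_data(b"A" * 8)     # Seq=92, 8 bytes
tx.on_app_data(b"B" * 20)    # Seq=100, 20 bytes
tx.on_ack(120)               # one cumulative ACK covers both segments
print(tx.send_base, tx.unacked, tx.timer_running)  # 120 {} False
```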


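TCP: retransmission scenarios

Lost ACK scenario (Host A and Host B, time runs downward):
1. A sends Seq=92, 8 bytes data
2. B replies ACK=100, but the ACK is lost (X)
3. Seq=92 timeout at A: A resends Seq=92, 8 bytes data
4. B replies ACK=100 again; A sets SendBase = 100

Premature timeout scenario (Host A and Host B, time runs downward):
1. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
2. B replies ACK=100, then ACK=120
3. Seq=92 timeout at A fires before the ACKs arrive: A resends Seq=92, 8 bytes data
4. ACK=100 arrives (SendBase = 100), then ACK=120 arrives (SendBase = 120)
5. B answers the duplicate segment with ACK=120; SendBase stays at 120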

TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A and Host B, time runs downward):
1. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
2. B replies ACK=100, which is lost (X), then ACK=120
3. ACK=120 arrives at A before the timeout; A sets SendBase = 120 and retransmits nothing

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver -> TCP Receiver action

• Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
  -> Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
• Arrival of in-order segment with expected seq #. One other segment has ACK pending
  -> Immediately send single cumulative ACK, ACKing both in-order segments
• Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
  -> Immediately send duplicate ACK, indicating seq # of next expected byte
• Arrival of segment that partially or completely fills gap
  -> Immediately send ACK, provided that segment starts at lower end of gap

                More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of congestion control

                Fast Retransmit

• Time-out period often relatively long:
    • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
    • Sender often sends many segments back-to-back
    • If a segment is lost, there will likely be many duplicate ACKs
• If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    • fast retransmit: resend segment before timer expires

Fast retransmit algorithm:

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {   /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y == 3) {
                resend segment with sequence number y   /* fast retransmit */
            }
        }
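The ACK handler above translates almost line for line into Python. The sketch below only tracks SendBase and the duplicate-ACK counts; the timer and the actual resend are left out, and `FastRetransmitSender` is a hypothetical name.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with fast retransmit (timer and actual sending omitted)."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()   # ACK value -> number of duplicates seen

    def on_ack(self, y):
        """Return the sequence number to resend immediately, or None."""
        if y > self.send_base:
            self.send_base = y      # new data acknowledged
            self.dup_acks.clear()
            return None
        self.dup_acks[y] += 1       # duplicate ACK for already ACKed data
        if self.dup_acks[y] == 3:
            return y                # fast retransmit
        return None


tx = FastRetransmitSender(send_base=100)
print([tx.on_ack(y) for y in (100, 100, 100, 100, 120)])
# [None, None, 100, None, None]
```

The resend fires exactly once, on the third duplicate; a fourth duplicate does not trigger another retransmission.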

TCP: GBN or Selective Repeat?

• Basic TCP looks a lot like GBN
• Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
    • This looks a lot like Selective Repeat
• TCP is a hybrid


                TCP Flow Control

• Sender should not overwhelm the receiver's capacity to receive data
• If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

• flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
    • guarantees receive buffer doesn't overflow
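As a worked example of the two rules, the sketch below computes the advertised window and then checks whether the sender may put more bytes in flight. The function names and the byte counts are made up for illustration; `last_byte_sent` and `last_byte_acked` are the sender-side counterparts of the receiver variables named above.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room advertised by the receiver:
    RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                     # 2096
print(sender_may_send(5000, 4000, window, 1000))  # True  (2000 <= 2096)
print(sender_may_send(5000, 4000, window, 1500))  # False (2500 >  2096)
```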

Technical Issue

• Suppose RcvWindow=0 and the receiver has already ACK'd ALL packets in the buffer
• Sender does not transmit new packets until it hears RcvWindow>0
• Receiver never sends RcvWindow>0, since it has no new ACKs to send to the sender
• DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow>0 will be transmitted back to the sender.

                Note on UDP

                UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
• initialize TCP variables:
    • seq. #s
    • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket(hostname, port number);
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• Replies with client_isn+1 in ACK field to signal synchronization
• Specifies server_isn
• No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
• Allocates buffers
• Can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals that it is synchronized

Message sequence:
1. client -> server: Connection request (SYN=1, seq=client_isn)
2. server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
3. client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
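To make the numbering concrete, the sketch below builds the three handshake segments as plain dictionaries. It models only the SYN flag and the seq/ack fields named in the steps; the function name and the dictionary layout are assumptions for this example, and no real sockets are involved.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments, in order, as dictionaries."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": syn["seq"] + 1}
    ack = {"from": "client", "SYN": 0, "seq": client_isn + 1,
           "ack": synack["seq"] + 1}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=41, server_isn=78):
    print(segment)
# {'from': 'client', 'SYN': 1, 'seq': 41}
# {'from': 'server', 'SYN': 1, 'seq': 78, 'ack': 42}
# {'from': 'client', 'SYN': 0, 'seq': 42, 'ack': 79}
```

Each side acknowledges the other's initial sequence number plus one, so the SYN itself consumes one sequence number even though it carries no data.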

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

(Figure: client sends FIN (close); server replies ACK, then sends FIN (close); client replies ACK and enters timed wait; then closed.)

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
• Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
• Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

(Figure: the same FIN / ACK / FIN / ACK exchange; both sides pass through closing, the client goes through timed wait, and both end in closed.)

                TCP Connection Management (cont)

(Figure: Example TCP server lifecycle)

(Figure: Example TCP client lifecycle)

                A few special cases

• We have not discussed what happens if both client and server decide to close down the connection at the same time
• It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment


                Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• manifestations:
    • lost packets (buffer overflow at routers)
    • long delays (queuing in router buffers)
• a top-10 problem!

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• Send rate: 0 to C/2

• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

(Here λin is the rate of original data and λ'in the rate of original plus retransmitted data.)

• (a), (b) & (c): always λin = λout (goodput)
• (a) "Magic" transmission: only send when there's space in the buffer
• (b) "perfect" retransmission, only when loss: λ'in > λout
• (c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
• (b) and (c): more work (retrans) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

(Figure: panels (a), (b), (c))

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

                Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
    • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded":
    • sender should use available bandwidth
• if sender's path congested:
    • sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
    • NI bit: no increase in rate (mild congestion)
    • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
    • small exception: see next page

                Case study ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
    • congested switch may lower ER value in cell
    • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
    • Signals congestion
    • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin is dynamically modified to reflect perceived congestion
• w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT  Bytes/sec

(Figure: Congwin)

• To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow
• Tools are "similar" to flow control: sender limits transmission using
    LastByteSent - LastByteAcked ≤ CongWin
• How does sender perceive congestion?
    • loss event = timeout or 3 duplicate acks
    • TCP sender reduces rate (CongWin) after loss event
• three mechanisms:
    • AIMD = Additive Increase, Multiplicative Decrease
    • slow start = CongWin set to 1 and then grows exponentially
    • conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

(Figure: congestion window of a long-lived TCP connection vs. time; y-axis ticks at 8, 16 and 24 Kbytes.)

                TCP Slow Start

• When connection begins, CongWin = 1 MSS
    • Example: MSS = 500 bytes & RTT = 200 msec
    • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
    • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

                TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
    • double CongWin every RTT
    • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

(Figure: Host A sends one segment to Host B; one RTT later it sends two segments, then four segments.)
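Two small calculations make slow start concrete: the initial rate from the MSS = 500 bytes, RTT = 200 msec example, and the way a per-ACK increment doubles the window every RTT. The sketch assumes no loss and one ACK per segment; both function names are hypothetical.

```python
def initial_rate_bps(mss_bytes, rtt_ms):
    """Rate when exactly one MSS is sent per RTT, in bits per second."""
    return mss_bytes * 8 * 1000 / rtt_ms


def slow_start_windows(rtts):
    """CongWin, in segments, at the start of each RTT (no loss events)."""
    congwin = 1
    history = []
    for _ in range(rtts):
        history.append(congwin)
        acks = congwin       # one ACK comes back per segment sent this RTT
        congwin += acks      # +1 segment per ACK, so the window doubles
    return history


print(initial_rate_bps(500, 200))   # 20000.0, i.e. 20 kbps
print(slow_start_windows(5))        # [1, 2, 4, 8, 16]
```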

So far:
• Slow-Start ramps up exponentially
• Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
    • Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start

• Reason for treating 3 dup ACKs differently than a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"
• Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start, with Congwin=1, after a loss event
• TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery

                Summary TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)

                The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
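The event/state/action rules for the TCP sender can be sketched as a small state machine. This is only an illustrative sketch, not an actual TCP implementation: the class name `RenoSender`, the initial threshold of 8 MSS, and the choice to measure the window in units of MSS are all assumptions made for the example.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: the window is counted in units of one MSS


@dataclass
class RenoSender:
    """Hypothetical model of the sender-side congestion control rules."""
    cong_win: float = 1 * MSS
    threshold: float = 8 * MSS  # illustrative initial threshold (assumption)
    state: str = "SS"           # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS                       # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)  # about +1 MSS per RTT

    def on_duplicate_ack(self):
        """Duplicate ACK: only count it, until the third one signals a loss."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)   # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        """Timeout: halve the threshold and restart from slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Feeding the object a sequence of `on_new_ack()`, `on_duplicate_ack()` and `on_timeout()` calls traces how CongWin and Threshold evolve under these rules.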

TCP throughput

What is the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after a loss, the window drops to W/2 and throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT.

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

This requires L = 2·10^-10. Wow!
New versions of TCP for high-speed networks are needed.
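The numbers in this example can be checked with a few lines of arithmetic. The sketch below assumes the throughput relation 1.22·MSS/(RTT·sqrt(L)) and the 0.75·W/RTT average; the function names are hypothetical.

```python
def window_for_rate(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def loss_rate_for_rate(rate_bps, rtt_s, mss_bytes):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to obtain L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


def average_throughput(w_segments, rtt_s):
    """Average of a window oscillating between W/2 and W, in segments/s."""
    return 0.75 * w_segments / rtt_s


w = window_for_rate(10e9, 0.100, 1500)      # 10 Gbps, 100 ms RTT, 1500-byte MSS
loss = loss_rate_for_rate(10e9, 0.100, 1500)
print(int(w), loss)
```

With the slide's parameters the window comes out at roughly 83,333 segments and the tolerable loss rate at roughly 2·10^-10.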

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
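The convergence argument can be illustrated with a toy simulation. This is a sketch under strong assumptions that are not part of the slides: both sessions see a loss in the same round whenever their combined rate exceeds the capacity, and both have the same RTT. The function name `aimd` is hypothetical.

```python
def aimd(x1, x2, capacity, rounds, increase=1.0):
    """Two flows sharing one link: additive increase, multiplicative decrease."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve their rate, which halves their difference.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both grow by the same amount,
            # so their difference is unchanged (slope 1 in the figure).
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


print(aimd(1.0, 80.0, capacity=100.0, rounds=2000))
```

Each loss event halves the gap between the two rates while additive increase leaves it unchanged, so the rates drift toward the equal-share line regardless of the starting point.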

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
a new app that asks for 1 TCP connection gets rate R/10
a new app that asks for 11 TCP connections gets about R/2 (11 of the 20 connections)

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent.
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent, so the server waits for the ACK before sending the next window.
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Here K is the number of windows that cover the object.

[Figures: timing diagrams for the first and the second case]
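The two cases can be folded into one small function. This is a sketch only: the function name is hypothetical, and it assumes K = ceil(O / (W·S)), i.e. that the object is sent in full windows of W segments except possibly the last one.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency of fetching an O-bit object with a fixed window of W segments.

    O: object size (bits), S: segment size (bits), R: link rate (bits/s).
    """
    K = math.ceil(O / (W * S))           # windows that cover the object (assumption)
    stall = S / R + RTT - W * S / R      # server wait per window for the first ACK
    if stall <= 0:
        # Case 1: the ACK returns before the window is exhausted, no stalling.
        return 2 * RTT + O / R
    # Case 2: the server stalls after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * stall
```

Calling it with a small and a large W for the same object shows the stall term disappearing once W·S/R exceeds RTT + S/R.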

TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at the server:

\[ P = \min\{Q,\; K-1\} \]

- where Q is the number of times the server would idle if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. The client initiates the TCP connection and requests the object (one RTT each); the server sends a first window of S/R, a second window of 2S/R, a third window of 4S/R and a fourth window of 8S/R; transmission then completes and the object is delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2
Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit the object
• time the server idles due to slow start

Server idles P = min{K-1, Q} times.
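The example values of K, Q and P can be reproduced numerically. The sketch below is an illustration with hypothetical function names. It assumes the idle time after the k-th window is S/R + RTT - 2^(k-1)·S/R, and it counts Q as the number of windows for which that quantity is positive; the units S = 1, R = 1, RTT = 2 are chosen only to match the example.

```python
def windows_to_cover(O, S):
    """K: smallest k with S + 2S + ... + 2^(k-1) S >= O."""
    k, sent = 0, 0
    while sent < O:
        sent += (2 ** k) * S
        k += 1
    return k


def idle_time(k, S, R, RTT):
    """Server idle time after the k-th window (negative means no idling)."""
    return S / R + RTT - (2 ** (k - 1)) * S / R


def slow_start_latency(O, S, R, RTT):
    K = windows_to_cover(O, S)
    Q = 0
    while idle_time(Q + 1, S, R, RTT) > 0:   # idles for an infinitely large object
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


print(slow_start_latency(O=15, S=1, R=1, RTT=2))
```

With O/S = 15 and RTT equal to two segment transmission times, this yields K = 4, Q = 2 and P = 2, in line with the example.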

TCP Latency Modeling (3)

\[ \frac{S}{R} + RTT = \text{time from when the server starts to send a segment until the server receives an acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: timing diagram with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, the number of idles for an infinite-size object, is similar.

HTTP Modeling

Assume a Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive the base HTML file
1 RTT to request and receive the M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X is an integer
1 TCP connection for the base file
M/X sets of parallel connections for the images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
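The three response-time expressions are easy to compare side by side. The sketch below uses a hypothetical function name and treats the sum of idle times as a caller-supplied value (zero by default), since it depends on the slow-start analysis; 5 Kbytes is taken as 40,000 bits, which is an assumption about the unit.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times (seconds) for the three HTTP models.

    O: object size in bits, R: link rate in bits/s, M: number of images,
    X: number of parallel connections (M must be divisible by X).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }


for name, seconds in http_response_times(40_000, 1e6, 0.1, M=10, X=5).items():
    print(f"{name}: {seconds:.2f} s")
```

Varying R and RTT in the call shows which term dominates: the transmission term (M+1)O/R at low bandwidth, the RTT terms at high bandwidth or large RTT.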

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay-bandwidth networks.

Chapter 3 Summary

Principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

Instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"


How demultiplexing works

The host receives IP datagrams:
each datagram has a source IP address and a destination IP address
each datagram carries 1 transport-layer segment
each segment has a source and a destination port number (recall: well-known port numbers for specific applications)

The host uses IP addresses and port numbers to direct the segment to the appropriate socket.

[Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #, other header fields, application data (message)]

Connectionless demultiplexing

When a host receives a UDP segment, it:
checks the destination port number in the segment
directs the UDP segment to the socket with that port number

IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket.

Create sockets with port numbers (a port number must lie in the range 0 to 65535):

DatagramSocket mySocket1 = new DatagramSocket(9111);

DatagramSocket mySocket2 = new DatagramSocket(9222);

UDP socket identified by two-tuple:

(dest IP address, dest port number)
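Connectionless demultiplexing can be sketched as a lookup on the destination port alone. This is a toy model, not the operating system's socket layer: the class `UdpHost` and its methods are hypothetical, and each "socket" is just a list of received segments.

```python
class UdpHost:
    """Toy model of UDP demultiplexing at one host (one IP address)."""

    def __init__(self, ip):
        self.ip = ip
        self.sockets = {}  # dest port -> list of (src_ip, src_port, payload)

    def bind(self, port):
        if not 0 <= port <= 65535:
            raise ValueError("port numbers are 16-bit values")
        self.sockets[port] = []
        return self.sockets[port]

    def receive(self, src_ip, src_port, dst_port, payload):
        """Deliver a segment using only its destination port."""
        inbox = self.sockets.get(dst_port)
        if inbox is None:
            return False  # no socket bound to this port
        # The source address is kept only as the "return address".
        inbox.append((src_ip, src_port, payload))
        return True


server = UdpHost("C")
inbox = server.bind(6428)
server.receive("A", 9157, 6428, b"hello from A")
server.receive("B", 5775, 6428, b"hello from B")
print(len(inbox))
```

Both segments land in the same inbox even though they come from different source addresses and ports, which is the defining property of the two-tuple identification.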

Connectionless demux (cont)

DatagramSocket serverSocket = new DatagramSocket(6428);

[Figure: a client at IP A and a client at IP B exchange UDP segments with a server at IP C (processes P1 and P3 are shown at the hosts). One client sends SP 9157, DP 6428 and receives SP 6428, DP 9157; the other sends SP 5775, DP 6428 and receives SP 6428, DP 5775.]

SP provides the "return address".

Connection-oriented demux

TCP socket identified by 4-tuple:
source IP address
source port number
dest IP address
dest port number

The receiving host uses all four values to direct the segment to the appropriate socket.

A server host may support many simultaneous TCP sockets:
each socket is identified by its own 4-tuple

Web servers have a different socket for each connecting client:
non-persistent HTTP will have a different socket for each request
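Connection-oriented demultiplexing can be sketched as a lookup on the full 4-tuple. As before this is a toy model with hypothetical names (`TcpDemux`, `accept`, `receive`), not a real TCP stack.

```python
class TcpDemux:
    """Toy model of TCP demultiplexing keyed on the 4-tuple."""

    def __init__(self):
        # (src_ip, src_port, dst_ip, dst_port) -> list of payloads
        self.sockets = {}

    def accept(self, src_ip, src_port, dst_ip, dst_port):
        """Create the socket that serves one particular connection."""
        inbox = []
        self.sockets[(src_ip, src_port, dst_ip, dst_port)] = inbox
        return inbox

    def receive(self, src_ip, src_port, dst_ip, dst_port, payload):
        """Deliver a segment using all four values."""
        inbox = self.sockets.get((src_ip, src_port, dst_ip, dst_port))
        if inbox is None:
            return False  # no connection matches this 4-tuple
        inbox.append(payload)
        return True


demux = TcpDemux()
from_a = demux.accept("A", 9157, "C", 80)
from_b = demux.accept("B", 9157, "C", 80)
demux.receive("A", 9157, "C", 80, b"GET /a")
demux.receive("B", 9157, "C", 80, b"GET /b")
print(from_a, from_b)
```

The two clients use the same source port and the same destination port, yet their segments reach different sockets because the source IP addresses differ.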

Connection-oriented demux (cont)

[Figure: clients at IP A and IP B exchange TCP segments with a Web server at IP C on port 80. Segments shown: SP 9157, DP 80 and SP 5775, DP 80 toward the server; SP 80, DP 9157 and SP 80, DP 5775 back to the clients. Processes P1, P3 and P4 are shown at the hosts.]

Connection-oriented demux: Threaded Web Server

[Figure: the same scenario with a threaded Web server at IP C. Segments shown: SP 9157, DP 80 with S-IP A, D-IP C; SP 9157, DP 80 with S-IP B, D-IP C; SP 5775, DP 80 with S-IP B, D-IP C.]


UDP: User Datagram Protocol [RFC 768]

"No frills", "bare bones" Internet transport protocol.
"Best effort" service: UDP segments may be
lost
delivered out of order to the app

Connectionless:
no handshaking between UDP sender and receiver
each UDP segment is handled independently of the others

Why is there a UDP?
no connection establishment (which can add delay)
simple: no connection state at sender or receiver
small segment header (8 bytes)
no congestion control: UDP can blast away as fast as desired

UDP: more

Often used for streaming multimedia apps:
loss tolerant
rate sensitive

Other UDP uses (why?):
DNS: small delay
SNMP: stressful conditions

Reliable transfer over UDP: add reliability at the application layer
application-specific error recovery

[Figure: UDP segment format, 32 bits wide: source port #, dest port #, length, checksum, application data (message). The length field gives the length in bytes of the UDP segment, including the header.]

UDP checksum

Goal: detect "errors" (e.g., flipped bits) in the transmitted segment

Sender:
treats segment contents as a sequence of 16-bit integers
checksum: addition (1's complement sum) of segment contents
sender puts the checksum value into the UDP checksum field

Receiver:
computes the checksum of the received segment
checks whether the computed checksum equals the checksum field value:
NO - error detected
YES - no error detected. But maybe errors nonetheless? More later.

The receiver may choose to discard the segment or send a warning to the app in case of an error.
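The sender and receiver steps can be sketched as follows. This is a simplified illustration with hypothetical function names: it stores the bitwise complement of the 1's complement sum in the checksum field, as the Internet checksum does, and it ignores the pseudo-header that real UDP also covers.

```python
def ones_complement_sum(data: bytes) -> int:
    """Add the data as 16-bit big-endian words, wrapping carries around."""
    if len(data) % 2:
        data += b"\x00"                  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def checksum(data: bytes) -> int:
    """Value the sender places in the checksum field."""
    return ~ones_complement_sum(data) & 0xFFFF


def no_error_detected(data: bytes, field: int) -> bool:
    """Receiver side: recompute and compare with the checksum field."""
    return checksum(data) == field


segment = bytes.fromhex("0001f203f4f5f6f7")
field = checksum(segment)
damaged = bytes([segment[0] ^ 0x01]) + segment[1:]   # flip one bit
print(hex(field), no_error_detected(segment, field), no_error_detected(damaged, field))
```

A single flipped bit always changes the sum, so it is detected; two errors that cancel each other in the sum would go unnoticed, which is why "no error detected" does not guarantee "no error".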


Principles of Reliable data transfer

Important in application, transport and link layers.
On the top-10 list of important networking topics!

Characteristics of the unreliable channel will determine the complexity of the reliable data transfer protocol (rdt).

Reliable data transfer: getting started

Send side:
rdt_send(): called from above (e.g., by the app). Passed data to deliver to the receiver's upper layer.
udt_send(): called by rdt to transfer a packet over the unreliable channel to the receiver.

Receive side:
rdt_rcv(): called when a packet arrives on the rcv-side of the channel.
deliver_data(): called by rdt to deliver data to the upper layer.

Reliable data transfer: getting started

We'll:
incrementally develop the sender and receiver sides of a reliable data transfer protocol (rdt)
consider only unidirectional data transfer, but control info will flow in both directions
use finite state machines (FSMs) to specify sender and receiver

[Figure: FSM notation. An arrow from state 1 to state 2 is labelled with the event causing the state transition and the actions taken on the state transition.]

State: when in this "state", the next state is uniquely determined by the next event.

Incremental Improvements

rdt1.0: assumes every packet sent arrives, and no errors are introduced in transmission.

rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces the concept of ACK and NAK.

rdt2.1: deals with corrupted ACKs/NAKs.

rdt2.2: like rdt2.1 but does not need NAKs.

rdt3.0: allows packets to be lost.

rdt1.0: reliable transfer over a reliable channel

Underlying channel perfectly reliable:
no bit errors
no loss of packets

Separate FSMs for sender and receiver:
sender sends data into the underlying channel
receiver reads data from the underlying channel

Sender FSM, state "Wait for call from above":
event: rdt_send(data)
actions: packet = make_pkt(data); udt_send(packet)

Receiver FSM, state "Wait for call from below":
event: rdt_rcv(packet)
actions: extract(packet, data); deliver_data(data)

rdt2.0: channel with bit errors

The underlying channel may flip bits in a packet:
recall: UDP checksum to detect bit errors

The question: how to recover from errors?
acknowledgements (ACKs): receiver explicitly tells sender that the pkt was received OK
negative acknowledgements (NAKs): receiver explicitly tells sender that the pkt had errors
sender retransmits the pkt on receipt of a NAK
human scenarios using ACKs, NAKs?

New mechanisms in rdt2.0 (beyond rdt1.0):
error detection
receiver feedback: control msgs (ACK, NAK) rcvr -> sender

rdt2.0: FSM specification

Sender

State "Wait for call from above":
event: rdt_send(data)
actions: sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
next state: "Wait for ACK or NAK"

State "Wait for ACK or NAK":
event: rdt_rcv(rcvpkt) && isNAK(rcvpkt)
actions: udt_send(sndpkt)
next state: "Wait for ACK or NAK"

event: rdt_rcv(rcvpkt) && isACK(rcvpkt)
actions: Λ (none)
next state: "Wait for call from above"

Receiver

State "Wait for call from below":
event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
actions: udt_send(NAK)

event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
actions: extract(rcvpkt, data); deliver_data(data); udt_send(ACK)
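The rdt2.0 sender and receiver can be simulated in a few lines. This is a sketch under stated assumptions, not the protocol's reference code: CRC-32 from Python's `zlib` stands in for the checksum, the channel damages exactly the transmissions whose index appears in `corrupt_on`, and ACK/NAK messages are never damaged (which is precisely what rdt2.0 assumes). All function names other than `make_pkt` and `corrupt` are hypothetical.

```python
import zlib


def make_pkt(data: bytes):
    """Packet = payload plus a checksum (CRC-32 used as a stand-in)."""
    return {"data": data, "checksum": zlib.crc32(data)}


def corrupt(pkt) -> bool:
    return zlib.crc32(pkt["data"]) != pkt["checksum"]


def flip_first_bit(pkt):
    """Model a bit error in the channel (payload must be non-empty)."""
    damaged = bytes([pkt["data"][0] ^ 0x01]) + pkt["data"][1:]
    return {"data": damaged, "checksum": pkt["checksum"]}


def rdt20_transfer(messages, corrupt_on=()):
    """Send messages one at a time, stop-and-wait; returns (delivered, sends)."""
    delivered, transmissions = [], 0
    for data in messages:
        sndpkt = make_pkt(data)
        while True:
            # udt_send(sndpkt): the channel may flip a bit on the way.
            rcvpkt = flip_first_bit(sndpkt) if transmissions in corrupt_on else sndpkt
            transmissions += 1
            if corrupt(rcvpkt):
                continue                       # receiver sends NAK, sender resends
            delivered.append(rcvpkt["data"])   # receiver delivers data, sends ACK
            break                              # sender waits for the next call
    return delivered, transmissions


print(rdt20_transfer([b"a", b"b"], corrupt_on={0, 2}))
```

Every corrupted transmission costs one extra send, and the data still arrives intact and in order as long as the feedback itself is never garbled.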

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs again, for the case with no errors: the sender sends sndpkt, the receiver finds it not corrupt, delivers the data and sends an ACK, and the sender returns to "Wait for call from above".]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs again, for the case where the packet is corrupted: the receiver sends a NAK, and the sender retransmits sndpkt and keeps waiting for an ACK or NAK.]

rdt2.0 has a fatal flaw!

What happens if the ACK/NAK is corrupted?
the sender doesn't know what happened at the receiver!
it can't just retransmit: possible duplicate. But the receiver is waiting!

What to do?
sender ACKs/NAKs the receiver's ACK/NAK? What if the sender's ACK/NAK is corrupted?
retransmit, but this might cause retransmission of a correctly received pkt! The receiver won't know about the duplication!

Handling duplicates:
sender adds a sequence number (0, 1) to each pkt
sender retransmits the current pkt if the ACK/NAK is garbled
receiver discards (doesn't deliver up) a duplicate pkt
a duplicate packet is one with the same sequence number as the previous packet

Stop and wait: the sender sends one packet, then waits for the receiver's response.

Sender: whenever the sender receives a control message, it sends a packet to the receiver:
a valid ACK: sends the next packet (if one exists) with a new sequence number
a NAK or corrupt response: resends the old packet

Receiver: sends an ACK/NAK to the sender:
if the received packet is corrupt: send NAK
if the received packet is valid and has a different sequence number from the previous packet: send ACK and deliver the new data up
if the received packet is valid and has the same sequence number as the previous packet, i.e., is a retransmission or duplicate: send ACK

Note: ACKs/NAKs do not contain a sequence number.
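The sender and receiver rules with a 1-bit sequence number can be simulated as follows. This is a sketch with hypothetical names: corruption is modelled simply by listing which data transmissions and which ACK/NAK replies get garbled, rather than by flipping bits and verifying a checksum.

```python
def rdt21_transfer(messages, corrupt_data_on=(), corrupt_reply_on=()):
    """Stop-and-wait with a 0/1 sequence number; returns (delivered, sends)."""
    delivered = []
    seq = 0            # sender: sequence number of the current packet
    expected_seq = 0   # receiver: sequence number it is waiting for
    data_tx = reply_tx = 0
    for data in messages:
        while True:
            data_ok = data_tx not in corrupt_data_on
            data_tx += 1
            if not data_ok:
                reply = "NAK"                      # corrupt packet
            else:
                if seq == expected_seq:            # new data: deliver it up
                    delivered.append(data)
                    expected_seq ^= 1
                reply = "ACK"                      # duplicates are ACKed, not delivered
            reply_ok = reply_tx not in corrupt_reply_on
            reply_tx += 1
            if reply_ok and reply == "ACK":
                seq ^= 1                           # move on with a new sequence number
                break
            # NAK or garbled reply: resend the old packet
    return delivered, data_tx


print(rdt21_transfer(["a", "b"], corrupt_reply_on={0}))
```

When the first ACK is garbled the sender retransmits, and the receiver recognises the repeated sequence number, so the data is delivered only once.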

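rdt2.1: sender, handles garbled ACK/NAKs

State "Wait for call 0 from above":
event: rdt_send(data)
actions: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
next state: "Wait for ACK or NAK 0"

State "Wait for ACK or NAK 0":
event: rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
actions: udt_send(sndpkt)
next state: "Wait for ACK or NAK 0"

event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
actions: Λ (none)
next state: "Wait for call 1 from above"

State "Wait for call 1 from above":
event: rdt_send(data)
actions: sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt)
next state: "Wait for ACK or NAK 1"

State "Wait for ACK or NAK 1":
event: rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
actions: udt_send(sndpkt)
next state: "Wait for ACK or NAK 1"

event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
actions: Λ (none)
next state: "Wait for call 0 from above"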

rdt2.1: receiver, handles garbled ACK/NAKs

State "Wait for 0 from below":
event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
actions: extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
next state: "Wait for 1 from below"

event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
actions: sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
next state: "Wait for 0 from below"

event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
actions: sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
next state: "Wait for 0 from below"

State "Wait for 1 from below":
event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
actions: extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
next state: "Wait for 0 from below"

event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
actions: sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
next state: "Wait for 1 from below"

event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
actions: sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
next state: "Wait for 1 from below"

rdt2.1: discussion

Sender:
seq # added to pkt
two seq #'s (0, 1) will suffice. Why?
must check if the received ACK/NAK is corrupted
twice as many states: the state must "remember" whether the "current" pkt has seq # 0 or 1

Receiver:
must check if the received packet is a duplicate: the state indicates whether 0 or 1 is the expected pkt seq #
note: the receiver cannot know if its last ACK/NAK was received OK at the sender

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of a NAK, the receiver sends an ACK for the last pkt received OK
the receiver must explicitly include the seq # of the pkt being ACKed (in rdt2.1, seq #'s are included in data packets but not in ACKs/NAKs)
a duplicate ACK at the sender results in the same action as a NAK: retransmit the current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment:

  State "Wait for call 0 from above":
    rdt_send(data)
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      -> "Wait for ACK 0"

  State "Wait for ACK 0":
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
      Λ
      -> leaves "Wait for ACK 0" (next state not shown in the fragment)

receiver FSM fragment:

  Transition into "Wait for 0 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      extract(rcvpkt,data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum)
      udt_send(sndpkt)

  State "Wait for 0 from below":
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions are triggered only by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer
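The timeout-and-retransmit approach can be sketched as a small simulation. This is only an illustration under stated assumptions: `LossyChannel` and `stop_and_wait` are hypothetical names, the channel drops packets by scripted index instead of at random, corruption is not modelled, and a missing ACK stands in for the countdown timer expiring.

```python
class LossyChannel:
    """Hypothetical channel: drops the transmissions whose 0-based index is in `drop`."""

    def __init__(self, drop=()):
        self.drop = set(drop)
        self.count = 0

    def send(self, item):
        idx = self.count
        self.count += 1
        return None if idx in self.drop else item


def stop_and_wait(messages, data_ch, ack_ch):
    """Alternating-bit sender and receiver; returns (delivered, transmissions)."""
    delivered = []
    expected = 0        # receiver side: seq # it is waiting for
    seq = 0             # sender side: seq # of the current pkt
    transmissions = 0
    for msg in messages:
        while True:
            transmissions += 1
            pkt = data_ch.send((seq, msg))
            ack = None
            if pkt is not None:
                rseq, data = pkt
                if rseq == expected:        # new data: deliver it once
                    delivered.append(data)
                    expected ^= 1
                # The receiver ACKs the seq # it saw, duplicate or not.
                ack = ack_ch.send(rseq)
            if ack == seq:
                break                       # ACK received: stop timer, next pkt
            # No ACK for the current pkt: treated as a timeout, so retransmit.
        seq ^= 1
    return delivered, transmissions


delivered, sent = stop_and_wait(
    ["a", "b", "c"],
    LossyChannel(drop={1}),   # first copy of "b" is lost
    LossyChannel(drop={1}),   # first ACK for "b" is lost
)
print(delivered, sent)
```

The lost ACK causes a duplicate data packet; the 1-bit seq # lets the receiver recognise it and deliver each message exactly once.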

rdt3.0 sender

State "Wait for call 0 from above":
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> "Wait for ACK0"
  rdt_rcv(rcvpkt)
    Λ

State "Wait for ACK0":
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    -> "Wait for call 1 from above"

State "Wait for call 1 from above":
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> "Wait for ACK1"
  rdt_rcv(rcvpkt)
    Λ

State "Wait for ACK1":
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,0))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    -> "Wait for call 0 from above"

rdt3.0 in action

[Two slides of sender/receiver timing diagrams; figures not recoverable from the transcript]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

\[ T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec} \]

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \]

  U_sender: utilization, the fraction of time the sender is busy sending
  1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
  network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Timing diagram, sender and receiver:]
  t = 0: first packet bit transmitted
  t = L/R: last packet bit transmitted
  first packet bit arrives at receiver
  last packet bit arrives, receiver sends ACK
  t = RTT + L/R: ACK arrives, sender sends next packet

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data
  Two generic approaches to solving this:
    go-Back-N protocols
    selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Timing diagram, sender and receiver:]
  t = 0: first packet bit transmitted
  t = L/R: last bit transmitted
  first packet bit arrives at receiver
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  t = RTT + L/R: ACK arrives, send next packet

\[ U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008 \]

Increase utilization by a factor of 3!
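The stop-and-wait and pipelined utilization figures can be reproduced with a few lines of Python. This is a sketch; the function name `sender_utilization` and its `window` parameter (number of packets sent back to back per round trip) are illustrative assumptions, and RTT is taken as 30 ms (twice the 15 ms propagation delay).

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy: window * (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps          # L / R, in seconds
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


L = 8_000        # 1 KB packet, in bits
R = 1e9          # 1 Gbps link
RTT = 0.030      # 30 ms round trip

print(sender_utilization(L, R, RTT))             # stop-and-wait, about 0.00027
print(sender_utilization(L, R, RTT, window=3))   # 3 pkts in flight, about 0.0008
```

The `min(1.0, ...)` cap reflects that once the window fills the whole round trip, the sender cannot be more than 100% busy.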

Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)
  only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  called a sliding-window protocol

GBN: Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts the timer (or stops it if no packets remain outstanding).

Timeout: resends ALL packets that have been sent but not yet acknowledged.
  This is the only event that triggers a resend.

GBN: sender extended FSM

Initially:
  base = 1
  nextseqnum = 1

rdt_send(data):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  ...
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  Λ
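The GBN sender FSM can be sketched in Python as below. This is a minimal illustration, not a full implementation: the class name `GBNSender`, the `send` callback and the boolean `timer_running` flag (standing in for a real countdown timer) are assumptions, sequence numbers are unbounded instead of k-bit, and ACKs older than `base` are simply ignored.

```python
class GBNSender:
    """Go-Back-N sender following the extended FSM (illustrative sketch)."""

    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send              # hypothetical callback: send(seq, data)
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}              # seq # -> data of sent, unACKed pkts
        self.timer_running = False    # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False              # window full: refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.send(self.nextseqnum, data)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        if acknum < self.base:
            return                    # duplicate ACK: nothing new is acknowledged
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1        # cumulative ACK slides the window
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(seq, self.sndpkt[seq])   # resend everything outstanding


sent = []
sender = GBNSender(3, lambda seq, data: sent.append(seq))
for ch in "abc":
    sender.rdt_send(ch)
sender.on_ack(1)
sender.on_timeout()
print(sent)
```

In the usage lines, packets 2 and 3 are still unACKed when the timeout fires, so both are sent again.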

GBN: receiver extended FSM

Initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

If expected packet received:
  send ACK and deliver packet upstairs
If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #
  may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
  receiver only sends ACKs (no NAKs)
  can generate duplicate ACKs
  need only remember expectedseqnum
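The receiver behaviour described above fits in a few lines. A sketch, assuming a hypothetical `GBNReceiver` class whose `rdt_rcv` returns the seq # carried in the ACK it sends; packets are passed in as `(seq, data)` plus a `corrupt` flag instead of being parsed from a wire format.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, only expectedseqnum is remembered."""

    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0        # matches the initial make_pkt(0, ACK, chksum)
        self.delivered = []      # data passed up to the application

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)      # deliver_data(data)
            self.last_ack = seq
            self.expectedseqnum += 1
        # Otherwise the packet is discarded and the previous ACK is re-sent.
        return self.last_ack


rcv = GBNReceiver()
acks = [rcv.rdt_rcv(1, "a"), rcv.rdt_rcv(3, "c"), rcv.rdt_rcv(2, "b"), rcv.rdt_rcv(3, "c")]
print(acks, rcv.delivered)
```

Packet 3 arrives early, is discarded, and triggers a duplicate ACK for 1; it is only accepted once it arrives again after packet 2.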

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  compare to GBN, which only had a timer for the base packet
sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: sender and receiver views of the sequence number space; not recoverable from the transcript]

Selective repeat

sender:
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n is smallest unACKed pkt, advance window base to next unACKed seq #

receiver:
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)
  otherwise:
    ignore
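The receiver rules above can be sketched as follows. This is an illustration under assumptions: `SRReceiver` is a hypothetical name, sequence numbers are unbounded integers starting at 0 (no wrap-around), and `rdt_rcv` returns the seq # to ACK or `None` when the packet is ignored.

```python
class SRReceiver:
    """Selective Repeat receiver: buffers out-of-order pkts inside its window."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}         # seq # -> data, received but not yet delivered
        self.delivered = []

    def rdt_rcv(self, n, data):
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n             # send ACK(n)
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n             # already delivered: re-ACK
        return None              # otherwise: ignore


rcv = SRReceiver(4)
acks = [rcv.rdt_rcv(0, "a"), rcv.rdt_rcv(2, "c"), rcv.rdt_rcv(1, "b"), rcv.rdt_rcv(0, "a")]
print(acks, rcv.delivered)
```

Unlike GBN, packet 2 is kept when it arrives early and is delivered together with packet 1; the late duplicate of packet 0 is re-ACKed but not delivered again.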

Selective repeat in action

[Figure: sender/receiver timing diagram; not recoverable from the transcript]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

  receiver sees no difference in the two scenarios!
  incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
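One way to explore this question is to compare the sender's old window (all ACKs lost, so it retransmits from the old base) with the receiver's new window (it has advanced by a full window). The sketch below, with the hypothetical helper `ambiguous_seqs`, lists the seq #s that fall in both; the standard answer is that the window size must be at most half the seq # space for the overlap to be empty.

```python
def ambiguous_seqs(seq_space, window):
    """Seq #s that could be either an old retransmission or new data."""
    # Sender still at base 0 because every ACK was lost.
    old = {n % seq_space for n in range(0, window)}
    # Receiver got the whole window and advanced its base by `window`.
    new = {n % seq_space for n in range(window, 2 * window)}
    return sorted(old & new)


print(ambiguous_seqs(4, 3))   # slide example: seq #'s 0..3, window size 3
print(ambiguous_seqs(4, 2))   # window is half the seq # space
```

With 4 seq #s and a window of 3, the overlap is non-empty, which is the dilemma on the slide; with a window of 2 the two windows are disjoint.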


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver
point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application data + TCP header = TCP segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment
    This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP only exists in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

Header layout (each row is 32 bits wide):

  source port #                 | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum                      | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125

Example RTT estimation

[Plot: RTT, gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT and EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
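The three formulas can be combined into a small estimator. This is a sketch under assumptions: the class name `RttEstimator` is illustrative, and two details the slides do not specify are borrowed from RFC 6298: the first sample initialises EstimatedRTT to the sample and DevRTT to half of it, and DevRTT is updated before EstimatedRTT so that it uses the old estimate.

```python
class RttEstimator:
    """EWMA-based RTT estimator (illustrative sketch of the slide formulas)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2      # assumption taken from RFC 6298

    def update(self, sample_rtt):
        # DevRTT uses the EstimatedRTT from before this sample (assumption).
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)       # milliseconds
for sample in (120.0, 90.0, 300.0, 110.0):
    print(round(est.update(sample), 1))
```

A single outlier such as the 300 ms sample moves EstimatedRTT only a little but raises DevRTT, and with it the safety margin, considerably.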


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
ACK rcvd:
  if it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
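The simplified sender loop can be sketched as an event-driven Python class. The names `SimpleTcpSender`, `send` and `timer_running` are illustrative assumptions; the timer is reduced to a boolean flag, and duplicate ACKs, flow control and congestion control are ignored as in the pseudocode.

```python
class SimpleTcpSender:
    """Simplified TCP sender: byte-numbered segments, cumulative ACKs, one timer."""

    def __init__(self, initial_seq, send):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.send = send              # hypothetical callback: send(seq, data)
        self.unacked = {}             # seq # of first byte -> segment data
        self.timer_running = False

    def on_data(self, data):
        self.unacked[self.next_seq] = data
        if not self.timer_running:
            self.timer_running = True          # start timer
        self.send(self.next_seq, data)         # pass segment to IP
        self.next_seq += len(data)

    def on_timeout(self):
        if not self.unacked:
            return
        seq = min(self.unacked)                # smallest not-yet-ACKed seq #
        self.send(seq, self.unacked[seq])
        self.timer_running = True              # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte is below y (cumulative ACK).
            for seq in [s for s in self.unacked if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sent = []
tcp = SimpleTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
tcp.on_data(b"x" * 8)       # Seq=92, 8 bytes
tcp.on_data(b"y" * 20)      # Seq=100, 20 bytes
tcp.on_ack(120)             # cumulative ACK covers both segments
print(sent, tcp.send_base, tcp.timer_running)
```

A single ACK=120 acknowledges both segments, so nothing remains outstanding and the timer is stopped.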

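TCP: retransmission scenarios

Lost ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100                (lost: X)
  Seq=92 timeout at Host A
  Host A -> Host B: Seq=92, 8 bytes data   (retransmission)
  Host B -> Host A: ACK=100
  SendBase = 100

Premature timeout scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100
  Host B -> Host A: ACK=120
  Seq=92 timeout at Host A, before the ACKs arrive
  Host A -> Host B: Seq=92, 8 bytes data   (retransmission)
  ACK=100 arrives: SendBase = 100
  ACK=120 arrives: SendBase = 120
  Host B -> Host A: ACK=120                (for the retransmitted segment)
  SendBase = 120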

TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100   (lost: X)
  Host B -> Host A: ACK=120   (arrives before the timeout)
  SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Arrival of in-order segment with expected seq #. One other segment has ACK pending.
   -> Immediately send single cumulative ACK, ACKing both in-order segments.

3. Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   -> Immediately send duplicate ACK, indicating seq # of next expected byte.

4. Arrival of segment that partially or completely fills gap.
   -> Immediately send ACK, provided that segment starts at lower end of gap.

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after the retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm:

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {                                         /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3) {
      resend segment with sequence number y      /* fast retransmit */
    }
  }
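The duplicate-ACK counting can be sketched as below. Assumptions: `FastRetransmit` is a hypothetical helper that only tracks `SendBase` and the duplicate count, `on_ack` returns the seq # to resend (or `None`), and ACKs below `SendBase` are ignored, a case the pseudocode does not spell out.

```python
class FastRetransmit:
    """Counts duplicate ACKs and signals a fast retransmit on the third one."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0

    def on_ack(self, y):
        """Return the seq # to retransmit now, or None."""
        if y > self.send_base:
            self.send_base = y        # new data ACKed
            self.dup_acks = 0
            return None
        if y == self.send_base:       # duplicate ACK for already ACKed data
            self.dup_acks += 1
            if self.dup_acks == 3:
                return y              # resend segment starting at byte y
        return None


fr = FastRetransmit(send_base=100)
print([fr.on_ack(y) for y in (100, 100, 100, 100, 120)])
```

Only the third duplicate triggers a resend; the fourth does not trigger another one, and the new ACK for 120 resets the counter.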

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  This looks a lot like Selective Repeat

TCP is a hybrid


TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate the receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
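Both sides of this rule can be written as one-line checks. A sketch: the function names are illustrative, and `last_byte_sent` / `last_byte_acked` are assumed sender-side counters that the slide does not name.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room advertised in RcvWindow."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(n_bytes, last_byte_sent, last_byte_acked, advertised_window):
    """Sender side: keep unACKed data within the advertised window."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + n_bytes <= advertised_window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)
print(can_send(1000, last_byte_sent=5000, last_byte_acked=4000, advertised_window=window))
print(can_send(1200, last_byte_sent=5000, last_byte_acked=4000, advertised_window=window))
```

With 2000 bytes still unread in a 4096-byte buffer, 2096 bytes are advertised; the sender already has 1000 bytes in flight, so 1000 more fit but 1200 do not.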

Technical Issue

Suppose RcvWindow=0 and the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow>0
Receiver never sends RcvWindow>0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow>0 will be transmitted back to the sender.

Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket("hostname", "port number");
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

Message sequence (time runs downward):
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
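The seq and ack values exchanged in the three steps can be traced with a short sketch. The function `three_way_handshake` and the dictionary representation of a segment are illustrative assumptions; only the SYN flag and the seq/ack fields are modelled, and arithmetic wraps at 2^32 because TCP sequence numbers are 32-bit.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32-bit


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as simple dictionaries."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn,
              "ack": (syn["seq"] + 1) % SEQ_SPACE}
    ack = {"SYN": 0, "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (synack["seq"] + 1) % SEQ_SPACE}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm they are synchronized.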

TCP Connection Management (cont.)

Closing a connection:
  client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait, then closed]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
  enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: FIN / ACK / FIN / ACK exchange between client and server, with the states closing, timed wait and closed]

TCP Connection Management (cont.)

[Figures: example TCP server lifecycle and example TCP client lifecycle (state diagrams)]

A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.


Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput), where $\lambda'_{in}$ below is the offered load (original data plus retransmissions)
(a) "Magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels (a), (b), (c).]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when packet dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
  single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
  NI bit: no increase in rate (mild congestion)
  CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
  small exception: see next page

ABR: available bit rate:
"elastic service"
if sender's path "underloaded": sender should use available bandwidth
if sender's path congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
congested switch may lower ER value in cell
sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
signals congestion
if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: a window of Congwin segments.]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS)/RTT Bytes/sec

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control. Sender limits transmission using:

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase Multiplicative Decrease
slow start = CongWin set to 1 and then grows exponentially
conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection.]
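
The AIMD rule can be traced with a few lines of Java. This is a minimal sketch rather than TCP's actual implementation: the class name `AimdSketch`, its methods, the choice to measure the window in whole MSS units, and the idea of feeding in loss events as a boolean array are all assumptions made for illustration.

```java
// Hypothetical sketch of AIMD; the window is measured in whole MSS units.
final class AimdSketch {
    // One RTT step: halve on a loss event (never below 1 MSS), otherwise add 1 MSS.
    static int next(int congWin, boolean loss) {
        return loss ? Math.max(1, congWin / 2) : congWin + 1;
    }

    // Trace the window over a sequence of RTTs, given which RTTs see a loss event.
    static int[] trace(int start, boolean[] lossPerRtt) {
        int[] out = new int[lossPerRtt.length];
        int w = start;
        for (int i = 0; i < lossPerRtt.length; i++) {
            w = next(w, lossPerRtt[i]);
            out[i] = w;
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] loss = {false, false, false, true, false, false};
        // prints [9, 10, 11, 5, 6, 7]
        System.out.println(java.util.Arrays.toString(trace(8, loss)));
    }
}
```

The window climbs by one per RTT and drops to half at the loss, which is the pattern the slide's plot shows.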

TCP Slow Start

When connection begins, CongWin = 1 MSS
Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. In successive RTTs Host A sends one segment, then two segments, then four segments.]
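
A short sketch of why "one increment per ACK" doubles the window each RTT, together with the initial-rate arithmetic from the slow-start example (500-byte MSS, 200 msec RTT). The class and method names are hypothetical, and the model assumes every segment in the window is ACKed within the RTT and nothing is lost.

```java
// Hypothetical sketch of slow start; the window is measured in MSS units.
final class SlowStartSketch {
    // One RTT: each segment in the window is ACKed, and each ACK adds 1 MSS.
    static int afterOneRtt(int congWin) {
        int acks = congWin;               // one ACK per segment sent in this RTT
        for (int i = 0; i < acks; i++) {
            congWin += 1;                 // increment CongWin for every ACK received
        }
        return congWin;
    }

    // Sending rate in bits/sec when CongWin is 1 MSS.
    static double initialRateBps(int mssBytes, double rttSec) {
        return mssBytes * 8 / rttSec;
    }

    public static void main(String[] args) {
        int w = 1;
        for (int rtt = 0; rtt < 4; rtt++) {
            System.out.println("RTT " + rtt + ": CongWin = " + w + " MSS");
            w = afterOneRtt(w);
        }
        System.out.println("initial rate (bps): " + initialRateBps(500, 0.2));
    }
}
```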

So Far
Slow-Start ramps up exponentially
Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
Introduce new variable: threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:
• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
• Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

Reason for treating 3 dup ACKs differently than timeout is that 3 dup ACKs indicate network capable of delivering some segments, while timeout before 3 dup ACKs is "more alarming".

Note that older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase, window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.

When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold (only in TCP Reno).

When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                  The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
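
The event/action table maps directly onto a small state machine. The following is a sketch under stated assumptions, not a real TCP stack: the class `RenoSenderSketch` and its method names are hypothetical, the window is kept in MSS units, and resetting the duplicate-ACK counter whenever a new ACK arrives is an assumption the table does not spell out.

```java
// Hypothetical sketch of the sender table above; CongWin and Threshold are in MSS units.
final class RenoSenderSketch {
    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    static final double MSS = 1.0;
    State state = State.SLOW_START;
    double congWin = MSS;
    double threshold;
    int dupAcks = 0;

    RenoSenderSketch(double initialThreshold) { threshold = initialThreshold; }

    // ACK receipt for previously unacked data.
    void onNewAck() {
        dupAcks = 0;                                  // assumption: counter restarts
        if (state == State.SLOW_START) {
            congWin += MSS;
            if (congWin > threshold) state = State.CONGESTION_AVOIDANCE;
        } else {
            congWin += MSS * (MSS / congWin);         // about 1 MSS per RTT
        }
    }

    // Duplicate ACK: only the count changes until the third one arrives.
    void onDuplicateAck() {
        dupAcks++;
        if (dupAcks == 3) onTripleDuplicateAck();
    }

    // Loss event detected by triple duplicate ACK.
    void onTripleDuplicateAck() {
        threshold = congWin / 2;
        congWin = Math.max(threshold, MSS);           // never below 1 MSS
        state = State.CONGESTION_AVOIDANCE;
        dupAcks = 0;
    }

    // Timeout: back to slow start with a window of 1 MSS.
    void onTimeout() {
        threshold = congWin / 2;
        congWin = MSS;
        state = State.SLOW_START;
        dupAcks = 0;
    }

    public static void main(String[] args) {
        RenoSenderSketch s = new RenoSenderSketch(4.0);
        for (int i = 0; i < 4; i++) s.onNewAck();
        System.out.println(s.state + " CongWin=" + s.congWin);
    }
}
```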

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT

TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

→ $L = 2 \cdot 10^{-10}$ Wow
New versions of TCP for high-speed needed
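
The two numbers in this example can be checked numerically. The sketch below uses hypothetical names (`TcpFuturesSketch`, `windowSegments`, `lossRate`) and simply evaluates W = rate · RTT / MSS and the throughput formula solved for L; it is arithmetic on the slide's figures, not a model of any real TCP variant.

```java
// Hypothetical helper that reproduces the arithmetic of the example above.
final class TcpFuturesSketch {
    // Segments that must be in flight to sustain the target rate: W = rate * RTT / MSS.
    static double windowSegments(double rateBps, double rttSec, int mssBytes) {
        return rateBps * rttSec / (mssBytes * 8.0);
    }

    // Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L.
    static double lossRate(double rateBps, double rttSec, int mssBytes) {
        double x = 1.22 * mssBytes * 8.0 / (rttSec * rateBps);
        return x * x;
    }

    public static void main(String[] args) {
        System.out.println("W = " + windowSegments(1e10, 0.1, 1500) + " segments");
        System.out.println("L = " + lossRate(1e10, 0.1, 1500));
    }
}
```

With 1500-byte segments, a 100 ms RTT and a 10 Gbps target, this gives W of about 83,333 segments and L of about 2.1e-10, in line with the slide.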

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
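
The convergence argument can be illustrated with a toy simulation. It is a sketch under strong assumptions that the slide does not state explicitly: both sessions have the same RTT, both see a loss in the same RTT whenever their combined rate exceeds the capacity, and rates are in arbitrary units. The class and method names are hypothetical.

```java
// Hypothetical two-flow AIMD model: additive increase keeps the gap between the
// flows constant, each multiplicative decrease halves it.
final class FairnessSketch {
    // Returns |x1 - x2| after the given number of RTTs.
    static double gapAfter(double x1, double x2, double capacity, int rtts) {
        for (int i = 0; i < rtts; i++) {
            if (x1 + x2 > capacity) {     // loss: both decrease by a factor of 2
                x1 /= 2;
                x2 /= 2;
            } else {                      // congestion avoidance: both add 1
                x1 += 1;
                x2 += 1;
            }
        }
        return Math.abs(x1 - x2);
    }

    public static void main(String[] args) {
        for (int rtts : new int[]{0, 50, 100, 500}) {
            System.out.println(rtts + " RTTs: gap = " + gapAfter(1, 61, 100, rtts));
        }
    }
}
```

Starting from very unequal rates, the gap shrinks toward zero, which corresponds to the trajectory approaching the equal bandwidth share line.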

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
  do not want rate throttled by congestion control
Instead use UDP:
  pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets R/2

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
where K is the number of windows that cover the object
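
The two cases can be folded into one function. This is a sketch with hypothetical names; it assumes K = ceil(O / (W·S)), i.e. the number of fixed-size windows needed to cover the object, and it treats the boundary WS/R = RTT + S/R (no stall) like case 1.

```java
// Hypothetical sketch of the fixed-window latency model.
// O and S are in bits, R in bits/sec, rtt in seconds, W in segments.
final class FixedWindowLatency {
    static double latency(double O, double S, double R, double rtt, int W) {
        double base = 2 * rtt + O / R;                 // handshake + request, then transmission
        double stall = S / R + rtt - W * S / R;        // server idle time after each window
        if (stall <= 0) {
            return base;                               // case 1: ACKs return in time
        }
        long K = (long) Math.ceil(O / (W * S));        // assumed: windows covering the object
        return base + (K - 1) * stall;                 // case 2: stall after all but the last window
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 8 segments of 1000 bits, R = 1000 bps, RTT = 1 s.
        System.out.println(latency(8000, 1000, 1000, 1.0, 4));
        System.out.println(latency(8000, 1000, 1000, 1.0, 1));
    }
}
```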

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timing diagram (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timing diagram (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.
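
The slow-start latency model can be evaluated directly by summing the positive idle times, and the result compared with the closed form. The sketch below uses hypothetical names, and its sample numbers are chosen (not taken from the slides) so that O/S = 15 and RTT = 2·S/R, which yields K = 4 and two idle periods as in the example.

```java
// Hypothetical sketch of the slow-start latency model. O and S in bits, R in bits/sec.
final class SlowStartLatency {
    // K: smallest k such that S * (2^0 + ... + 2^(k-1)) >= O.
    static int windowsCovering(double O, double S) {
        int k = 0;
        double covered = 0;
        while (covered < O) {
            covered += Math.pow(2, k) * S;
            k++;
        }
        return k;
    }

    // 2 RTT + O/R plus every positive idle time after windows 1 .. K-1.
    static double latency(double O, double S, double R, double rtt) {
        int K = windowsCovering(O, S);
        double total = 2 * rtt + O / R;
        for (int k = 1; k <= K - 1; k++) {
            double idle = S / R + rtt - Math.pow(2, k - 1) * S / R;
            if (idle > 0) total += idle;
        }
        return total;
    }

    // Closed form for a given number P of idle periods.
    static double closedForm(double O, double S, double R, double rtt, int P) {
        return 2 * rtt + O / R + P * (rtt + S / R) - (Math.pow(2, P) - 1) * S / R;
    }

    public static void main(String[] args) {
        System.out.println("K = " + windowsCovering(15000, 1000));
        System.out.println("latency = " + latency(15000, 1000, 1000, 2.0));
        System.out.println("closed form (P = 2) = " + closedForm(15000, 1000, 1000, 2.0, 2));
    }
}
```

Counting idle periods in a loop sidesteps the need for an explicit formula for Q, which the slides leave as an exercise.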

HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
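
A sketch comparing the three response-time formulas. It deliberately leaves out the "sum of idle times" term, so it gives only the transmission and round-trip components; the class and method names are hypothetical, M is assumed to be divisible by X, and 5 Kbytes is taken as 40,000 bits for the sample run.

```java
// Hypothetical sketch of the HTTP response-time formulas, without the idle-time term.
// O in bits, R in bits/sec, rtt in seconds.
final class HttpResponseTime {
    static double nonPersistent(int M, double O, double R, double rtt) {
        return (M + 1) * O / R + (M + 1) * 2 * rtt;
    }

    static double persistent(int M, double O, double R, double rtt) {
        return (M + 1) * O / R + 3 * rtt;
    }

    // Assumes M is a multiple of X, as on the slide.
    static double parallelNonPersistent(int M, int X, double O, double R, double rtt) {
        return (M + 1) * O / R + (M / X + 1) * 2 * rtt;
    }

    public static void main(String[] args) {
        double O = 40_000, R = 1_000_000, rtt = 0.1;   // illustrative values
        System.out.println("non-persistent: " + nonPersistent(10, O, R, rtt));
        System.out.println("persistent: " + persistent(10, O, R, rtt));
        System.out.println("parallel non-persistent: " + parallelNonPersistent(10, 5, O, R, rtt));
    }
}
```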

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.

Chapter 3: Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"


Connectionless demultiplexing

When host receives UDP segment:
checks destination port number in segment
directs UDP segment to socket with that port number

IP datagrams with different source IP addresses and/or source port numbers directed to same socket

Create sockets with port numbers:

DatagramSocket mySocket1 = new DatagramSocket(99111);

DatagramSocket mySocket2 = new DatagramSocket(99222);

UDP socket identified by two-tuple:

(dest IP address, dest port number)
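
The demultiplexing rule can be mimicked in plain Java without touching the network. This is an illustrative model only: `UdpDemuxSketch`, `bind`, `deliver` and `received` are hypothetical names, the host's own IP address is implied, and the addresses and ports in `main` are sample values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of connectionless demultiplexing inside one host.
final class UdpDemuxSketch {
    // One receive queue per bound destination port.
    private final Map<Integer, List<String>> sockets = new HashMap<>();

    void bind(int port) {
        sockets.put(port, new ArrayList<>());
    }

    // Only the destination port selects the socket; the source IP and port are
    // not used for demultiplexing and are kept as the "return address".
    boolean deliver(String srcIp, int srcPort, int destPort, String data) {
        List<String> queue = sockets.get(destPort);
        if (queue == null) {
            return false;                 // no socket bound to that port
        }
        queue.add(srcIp + ":" + srcPort + " -> " + data);
        return true;
    }

    List<String> received(int port) {
        return sockets.get(port);
    }

    public static void main(String[] args) {
        UdpDemuxSketch host = new UdpDemuxSketch();
        host.bind(6428);
        host.deliver("A", 9157, 6428, "hello");
        host.deliver("B", 5775, 6428, "world");   // different source, same socket
        System.out.println(host.received(6428));
    }
}
```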

Connectionless demux (cont)

DatagramSocket serverSocket = new DatagramSocket(6428);

[Figure: client IP A and client IP B exchange UDP segments with server IP C (processes P1, P3). Segments to the server are labelled SP 9157, DP 6428 and SP 5775, DP 6428; the replies are labelled SP 6428, DP 9157 and SP 6428, DP 5775.]

SP provides "return address"

Connection-oriented demux

TCP socket identified by 4-tuple:
source IP address
source port number
dest IP address
dest port number

recv host uses all four values to direct segment to appropriate socket

Server host may support many simultaneous TCP sockets:
each socket identified by its own 4-tuple

Web servers have different sockets for each connecting client
non-persistent HTTP will have different socket for each request

Connection-oriented demux (cont)

[Figure: client IP A and client IP B exchange TCP segments with server IP C (processes P1, P3, P4). Segments to the server are labelled SP 9157, DP 80 and SP 5775, DP 80; the replies are labelled SP 80, DP 9157 and SP 80, DP 5775.]

Connection-oriented demux: Threaded Web Server

[Figure: client IP A and client IP B connect to a threaded Web server at IP C (processes P1, P2, P3, P4). Segments shown: SP 9157, DP 80, S-IP A, D-IP C; SP 9157, DP 80, S-IP B, D-IP C; SP 5775, DP 80, S-IP B, D-IP C.]


UDP: User Datagram Protocol [RFC 768]

"no frills," "bare bones" Internet transport protocol
"best effort" service, UDP segments may be:
  lost
  delivered out of order to app
connectionless:
  no handshaking between UDP sender, receiver
  each UDP segment handled independently of others

Why is there a UDP?
no connection establishment (which can add delay)
simple: no connection state at sender, receiver
small segment header (8 Bytes)
no congestion control: UDP can blast away as fast as desired

UDP: more

often used for streaming multimedia apps
  loss tolerant
  rate sensitive
other UDP uses (why?):
  DNS: small delay
  SNMP: stressful cond.
reliable transfer over UDP: add reliability at application layer
  application-specific error recovery

UDP segment format (each row is 32 bits):
source port | dest port
length | checksum
Application data (message)

Length: in bytes of UDP segment, including header

UDP checksum

Goal: detect "errors" (e.g., flipped bits) in transmitted segment

Receiver:
compute checksum of received segment
check if computed checksum equals checksum field value:
  NO - error detected
  YES - no error detected. But maybe errors nonetheless? More later.

Receiver may choose to discard segment or send a warning to app in case of error

Sender:
treat segment contents as sequence of 16-bit integers
checksum: addition (1's complement sum) of segment contents
sender puts checksum value into UDP checksum field
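
The sender and receiver steps can be sketched as follows. This is a simplified illustration with hypothetical names: it sums only the bytes it is given (real UDP also covers a pseudo-header), it treats the data as big-endian 16-bit words padded with a zero byte if the length is odd, and it stores the complement of the 1's complement sum in the checksum field, as RFC 768 specifies.

```java
// Hypothetical sketch of the 16-bit 1's complement checksum.
final class UdpChecksumSketch {
    // 1's complement sum of the data taken as 16-bit big-endian words.
    static int onesComplementSum(byte[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i += 2) {
            int hi = data[i] & 0xFF;
            int lo = (i + 1 < data.length) ? data[i + 1] & 0xFF : 0;  // pad odd length
            sum += (hi << 8) | lo;
            sum = (sum & 0xFFFF) + (sum >>> 16);                      // wrap the carry around
        }
        return sum;
    }

    // Sender: value placed in the checksum field (complement of the sum).
    static int checksum(byte[] data) {
        return ~onesComplementSum(data) & 0xFFFF;
    }

    // Receiver: recompute and compare with the checksum field value.
    static boolean verify(byte[] received, int checksumField) {
        return checksum(received) == checksumField;
    }

    public static void main(String[] args) {
        byte[] segment = {(byte) 0xE6, 0x66, (byte) 0xD5, 0x55};
        int field = checksum(segment);
        System.out.println("checksum = 0x" + Integer.toHexString(field));
        segment[0] ^= 0x01;                                           // flip one bit in transit
        System.out.println("verifies after bit flip: " + verify(segment, field));
    }
}
```

A single flipped bit changes the sum and is detected; as the slide notes, some combinations of errors can still cancel out and go unnoticed.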


Principles of reliable data transfer

important in app., transport, link layers
top-10 list of important networking topics!
characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started

send side:
  rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
  udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
  rdt_rcv(): called when packet arrives on rcv-side of channel
  deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started

We'll:
  incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
  consider only unidirectional data transfer
    but control info will flow in both directions!
  use finite state machines (FSM) to specify sender, receiver

FSM notation:
  state 1 -> state 2, with each arrow labelled "event / actions"
    event: event causing state transition
    actions: actions taken on state transition
  state: when in this "state," next state is uniquely determined by next event

Incremental improvements

rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission
rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK
rdt2.1: deals with corrupted ACKs/NAKs
rdt2.2: like rdt2.1 but does not need NAKs
rdt3.0: allows packets to be lost

rdt1.0: reliable transfer over a reliable channel

underlying channel perfectly reliable
  no bit errors
  no loss of packets
separate FSMs for sender, receiver:
  sender sends data into underlying channel
  receiver reads data from underlying channel

sender FSM (event / actions):
  Wait for call from above:
    rdt_send(data) / packet = make_pkt(data); udt_send(packet)

receiver FSM (event / actions):
  Wait for call from below:
    rdt_rcv(packet) / extract(packet, data); deliver_data(data)

rdt2.0: channel with bit errors

underlying channel may flip bits in packet
  recall: UDP checksum to detect bit errors
the question: how to recover from errors?
  acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  sender retransmits pkt on receipt of NAK
  human scenarios using ACKs, NAKs?
new mechanisms in rdt2.0 (beyond rdt1.0):
  error detection
  receiver feedback: control msgs (ACK, NAK) rcvr -> sender

rdt2.0: FSM specification

sender FSM (event / actions -> next state):
  Wait for call from above:
    rdt_send(data) / sndpkt = make_pkt(data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK
  Wait for ACK or NAK:
    rdt_rcv(rcvpkt) && isNAK(rcvpkt) / udt_send(sndpkt) -> Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call from above

receiver FSM (event / actions; single state):
  Wait for call from below:
    rdt_rcv(rcvpkt) && corrupt(rcvpkt) / udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) / extract(rcvpkt, data); deliver_data(data); udt_send(ACK)

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs, repeated.]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs, repeated.]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
  sender doesn't know what happened at receiver!
  can't just retransmit: possible duplicate. But receiver waiting!

What to do?
  sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
  retransmit, but this might cause retransmission of correctly received pkt!
  Receiver won't know about duplication!

Handling duplicates:
  sender adds sequence number (0, 1) to each pkt
  sender retransmits current pkt if ACK/NAK garbled
  receiver discards (doesn't deliver up) duplicate pkt
  Duplicate packet is one with same sequence # as previous packet

stop and wait:
  Sender sends one packet, then waits for receiver response

Sender: whenever sender receives control message, it sends a packet to receiver
  A valid ACK: sends next packet (if one exists) with new sequence #
  A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
  If received packet is corrupt: send NAK
  If received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
  If received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #

rdt2.1: sender, handles garbled ACK/NAKs

sender FSM (event / actions -> next state):
  Wait for call 0 from above:
    rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK 0
  Wait for ACK or NAK 0:
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) / udt_send(sndpkt) -> Wait for ACK or NAK 0
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call 1 from above
  Wait for call 1 from above:
    rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK 1
  Wait for ACK or NAK 1:
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) / udt_send(sndpkt) -> Wait for ACK or NAK 1
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call 0 from above

rdt2.1: receiver, handles garbled ACK/NAKs

receiver FSM (event / actions -> next state):
  Wait for 0 from below:
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 1 from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt) -> Wait for 0 from below
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 0 from below
  Wait for 1 from below:
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 0 from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt) -> Wait for 1 from below
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 1 from below

rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq. #'s (0, 1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s were included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment (event / actions -> next state):
  Wait for call 0 from above:
    rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> Wait for ACK 0
  Wait for ACK 0:
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) / udt_send(sndpkt) -> Wait for ACK 0
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) / Λ

receiver FSM fragment (event / actions -> next state):
  transition into Wait for 0 from below:
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt) -> Wait for 0 from below
  Wait for 0 from below:
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) / udt_send(sndpkt) -> Wait for 0 from below

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq. #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer

rdt3.0 sender

sender FSM (event / actions -> next state):
  Wait for call 0 from above:
    rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer -> Wait for ACK0
    rdt_rcv(rcvpkt) / Λ -> Wait for call 0 from above
  Wait for ACK0:
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) / Λ -> Wait for ACK0
    timeout / udt_send(sndpkt); start_timer -> Wait for ACK0
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) / stop_timer -> Wait for call 1 from above
  Wait for call 1 from above:
    rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer -> Wait for ACK1
    rdt_rcv(rcvpkt) / Λ -> Wait for call 1 from above
  Wait for ACK1:
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)) / Λ -> Wait for ACK1
    timeout / udt_send(sndpkt); start_timer -> Wait for ACK1
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1) / stop_timer -> Wait for call 0 from above
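To see how timeouts and alternating sequence numbers work together, here is a small Python simulation in the spirit of rdt3.0. It is a sketch under simplifying assumptions: there is no real timer (a lost packet or lost ACK is modelled as "the timeout fires and the sender retransmits"), and the loss pattern is scripted through the hypothetical `lost_data` and `lost_acks` parameters, which hold the indices of the transmissions to drop.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int              # alternating-bit sequence number, 0 or 1
    data: str
    corrupt: bool = False


class Receiver:
    """Delivers new data once and ACKs the last packet received OK."""

    def __init__(self):
        self.expected = 0
        self.delivered = []

    def rdt_rcv(self, pkt):
        if not pkt.corrupt and pkt.seq == self.expected:
            self.delivered.append(pkt.data)      # deliver_data(data)
            self.expected ^= 1
        return self.expected ^ 1                 # seq # carried by the ACK


def stop_and_wait(messages, receiver, lost_data=frozenset(), lost_acks=frozenset()):
    """Send each message, retransmitting until its ACK arrives.

    Returns the total number of transmissions, retransmissions included.
    """
    seq, transmissions = 0, 0
    for data in messages:
        while True:
            transmissions += 1
            if transmissions in lost_data:
                continue                         # data lost: timeout, resend
            ack = receiver.rdt_rcv(Packet(seq, data))
            if transmissions in lost_acks:
                continue                         # ACK lost: timeout, resend
            if ack == seq:
                break                            # expected ACK: move on
        seq ^= 1
    return transmissions
```

With three messages, the second transmission lost on the way out and the ACK for the third transmission lost on the way back, the receiver still delivers each message exactly once; the duplicate produced by the lost ACK is recognised by its sequence number and only re-ACKed.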

rdt3.0 in action

[Two figure slides: sender/receiver timelines, not reproduced in this transcript.]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1 KB packet:

$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \text{ kb/pkt}}{10^{9} \text{ b/sec}} = 8 \text{ microsec}$

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$

U_sender: utilization, the fraction of time sender is busy sending
1 KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline]
  first packet bit transmitted, t = 0
  last packet bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$

Pipelined protocols

Pipelining: sender allows multiple, "in-flight," yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Advantage: much better bandwidth utilization than stop-and-wait
Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  • Go-Back-N protocols
  • selective repeat protocols
Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline]
  first packet bit transmitted, t = 0
  last bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R

$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$

Increase utilization by a factor of 3!
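The utilization figures for stop-and-wait and for three pipelined packets can be reproduced with a short calculation. The sketch below takes the slide's example values as inputs; the function name and the `window` parameter are illustrative, and the result is capped at 1.0 because a sender cannot be busy more than all of the time.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy with `window` packets in flight."""
    t_transmit = packet_bits / rate_bps            # L / R, in seconds
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


# 1 KB packet (8000 bits), 1 Gbps link, RTT = 2 * 15 ms
u_stop_and_wait = sender_utilization(8000, 1e9, 0.030)
u_pipelined = sender_utilization(8000, 1e9, 0.030, window=3)
```

Here `u_stop_and_wait` comes out near 0.00027 and `u_pipelined` near 0.0008, matching the values on the slides.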

Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)
  Only one timer: for oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol

GBN: sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level
Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer
Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend

GBN: sender extended FSM

initially:
  base = 1
  nextseqnum = 1

state: Wait

rdt_send(data):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  ...
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  Λ
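The extended FSM translates fairly directly into Python. The following sketch keeps the FSM's variable names (`base`, `nextseqnum`, `sndpkt`) but makes some assumptions of its own: the timer is reduced to a boolean flag, packets are plain `(seq, data)` tuples, `udt_send` is any callable supplied by the caller, and ACKs outside the current window are ignored (a guard the FSM does not spell out).

```python
class GbnSender:
    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send      # callable that puts a packet on the channel
        self.base = 1                 # oldest unacknowledged seq #
        self.nextseqnum = 1           # next seq # to use
        self.sndpkt = {}              # seq # -> packet, kept for retransmission
        self.timer_running = False

    def rdt_send(self, data):
        """Return True if the data was sent, False if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False              # refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True         # start_timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is received."""
        if not (self.base <= acknum < self.nextseqnum):
            return                    # duplicate or out-of-window ACK: ignore
        for seq in range(self.base, acknum + 1):
            del self.sndpkt[seq]
        self.base = acknum + 1
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Go back N: resend every sent-but-unacknowledged packet."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])
```

For example, with `N = 3` a fourth `rdt_send` is refused until an ACK slides the window forward, and a timeout after ACK(1) resends packets 2 onward.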

GBN: receiver extended FSM

initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

state: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

If expected packet received:
  send ACK and deliver packet upstairs
If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #
  may generate duplicate ACKs

More on receiver

The receiver always sends ACK for last correctly received packet with highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  Need only remember expectedseqnum
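Because the GBN receiver keeps only `expectedseqnum`, it fits in a few lines. This is a sketch with illustrative names: `rdt_rcv` returns the sequence number that the outgoing ACK would carry, and corruption is passed in as a flag instead of being detected by a checksum.

```python
class GbnReceiver:
    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number sent in response."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)       # deliver_data(data)
            self.expectedseqnum += 1
        # Otherwise discard the packet (no buffering). Either way, ACK the
        # highest in-order seq # received so far; 0 means "nothing yet".
        return self.expectedseqnum - 1
```

If packet 2 is lost, packets 3 and 4 are discarded and each triggers a duplicate ACK(1), so the sender must resend 2, 3 and 4 after its timeout.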

GBN in action

[Figure not reproduced in this transcript.]

GBN is easy to code but might have performance problems.
In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data!
The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet
sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure not reproduced in this transcript.]

Selective repeat

sender:
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n is smallest unACKed pkt, advance window base to next unACKed seq #

receiver:
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)
  otherwise:
    ignore
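The receiver rules above can be sketched as follows. To keep the example short it assumes sequence numbers that grow without wrapping around, which sidesteps the finite seq # range; the class and method names are illustrative.

```python
class SrReceiver:
    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0              # lowest seq # not yet received
        self.buffer = {}              # out-of-order packets awaiting delivery
        self.delivered = []

    def on_packet(self, n, data):
        """Return n if ACK(n) is sent, or None if the packet is ignored."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver any in-order run that starts at rcvbase.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n                  # already delivered: re-ACK only
        return None                   # otherwise: ignore
```

Packets that arrive ahead of a gap are held in `buffer` and released as soon as the missing packet fills the gap, which is the behaviour that distinguishes selective repeat from GBN's discard-and-re-ACK.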

Selective repeat in action

[Figure not reproduced in this transcript.]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3
receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
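One way to explore the question is to check when the receiver's next window reuses sequence numbers from the window it has just accepted. The sketch below models the worst case, in which the receiver has accepted a full window but every ACK was lost, so the sender retransmits the old window. The function name is illustrative. The commonly stated answer, which this check agrees with, is that the window size must be at most half the size of the sequence number space.

```python
def sr_windows_overlap(seq_space: int, window: int) -> bool:
    """True if old and new receiver windows share a sequence number."""
    old_window = {n % seq_space for n in range(window)}
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)
```

With 4 sequence numbers and window size 3, as in the example above, the windows overlap (the old window is 0, 1, 2 and the new one is 3, 0, 1), so a retransmitted packet 0 is indistinguishable from a new packet 0. With window size 2 they do not overlap.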


TCP: Overview    RFCs: 793, 1122, 1323, 2018, 2581

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application data + TCP header = TCP segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment
    This can contain payload

Even more TCP details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables and
  (iii) a socket connection to process

TCP only exists in the two end machines
  No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Segment layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
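As a rough illustration of the layout, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. This is a sketch, not a complete parser: it reads only the six flag bits named on the slide, skips over any options without interpreting them, and does not validate the checksum. The function name and the returned dictionary keys are made up for the example.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # bit 0 .. bit 5


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields and application data."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    header_len = (offset_and_flags >> 12) * 4     # head len is in 32-bit words
    flags = {name for bit, name in enumerate(FLAG_NAMES)
             if offset_and_flags & (1 << bit)}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
        "data": segment[header_len:],             # after header and options
    }
```

The `!` in the format string selects network (big-endian) byte order, which is how the fields appear on the wire.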

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does receiver handle out-of-order segments?
  A: TCP spec doesn't say; it is up to the implementer

simple telnet scenario (Host A and Host B, time running downward):
  User types 'C':                               A -> B   Seq=42, ACK=79, data = 'C'
  host ACKs receipt of 'C', echoes back 'C':    B -> A   Seq=79, ACK=43, data = 'C'
  host ACKs receipt of echoed 'C':              A -> B   Seq=43, ACK=80
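The numbers in the telnet scenario follow from one rule: the ACK field names the next byte expected, so it equals the received segment's sequence number plus the number of data bytes it carried. A minimal sketch, with a hypothetical helper name:

```python
def ack_for(seq: int, data: bytes) -> int:
    """ACK number sent in response to a segment carrying `data` at `seq`."""
    return seq + len(data)


# Host B receives Seq=42 carrying 'C' and replies with ACK=43;
# Host A receives the echo at Seq=79 and replies with ACK=80.
ack_from_b = ack_for(42, b"C")
ack_from_a = ack_for(79, b"C")
```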

TCP round trip time and timeout

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

TCP round trip time and timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
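To see why the influence of old samples decays exponentially, the recurrence can be unrolled. The notation is introduced here for illustration only: $E_n$ is EstimatedRTT after the $n$-th sample, $S_n$ is the $n$-th SampleRTT, and $E_0$ is the initial estimate.

```latex
\begin{aligned}
E_n &= (1-\alpha)\,E_{n-1} + \alpha\,S_n \\
    &= \alpha\,S_n + \alpha(1-\alpha)\,S_{n-1} + \alpha(1-\alpha)^2\,S_{n-2} + \dots + (1-\alpha)^n\,E_0 \\
    &= \alpha \sum_{k=0}^{n-1} (1-\alpha)^k\,S_{n-k} \;+\; (1-\alpha)^n\,E_0
\end{aligned}
```

A sample taken $k$ measurements ago therefore carries weight $\alpha(1-\alpha)^k$. With $\alpha = 0.125$ the weights are $0.125 \cdot 0.875^k$, so each step back in time reduces a sample's weight by a factor of 0.875.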

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds); series: SampleRTT and EstimatedRTT.]

TCP round trip time and timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

  (typically, β = 0.25)

Then set timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
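The three formulas combine into a small estimator. The sketch below rests on two assumptions that the slides leave open: the estimate is seeded with the first sample, with the deviation set to half of it (the convention in RFC 6298), and DevRTT is updated before EstimatedRTT so that it uses the previous estimate. The class and attribute names are illustrative.

```python
class RttEstimator:
    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2      # assumed initial value

    def update(self, sample_rtt):
        """Fold one new SampleRTT into DevRTT and EstimatedRTT."""
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from a first sample of 100 ms, a single 180 ms sample moves EstimatedRTT only to 110 ms but raises DevRTT to 57.5 ms, so the timeout grows to 340 ms: the safety margin reacts to variation more strongly than the average does.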


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer
Retransmissions are triggered by:
  timeout events
  duplicate acks
Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
Ack rcvd:
  If it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  • SendBase-1: last cumulatively ack'ed byte
Example:
  • SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is acked
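The event loop above can be turned into a small event-driven class. This is a sketch of the simplified sender only: it ignores duplicate ACKs, flow control and congestion control, just as the pseudocode does. The timer is modelled as a boolean, segments are `(seq, data)` tuples, and `send_to_ip` is any callable supplied by the caller; all of these names are assumptions made for the example.

```python
class SimpleTcpSender:
    def __init__(self, initial_seq_num, send_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.send_to_ip = send_to_ip
        self.unacked = {}                 # seq # -> data not yet acknowledged
        self.timer_running = False

    def on_app_data(self, data: bytes):
        self.unacked[self.next_seq_num] = data
        if not self.timer_running:
            self.timer_running = True     # start timer
        self.send_to_ip((self.next_seq_num, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)       # smallest unacknowledged seq #
            self.send_to_ip((seq, self.unacked[seq]))
            self.timer_running = True     # restart timer

    def on_ack(self, y: int):
        if y > self.send_base:            # ACK covers new data
            self.send_base = y
            fully_acked = [s for s, d in self.unacked.items()
                           if s + len(d) <= y]
            for seq in fully_acked:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
```

Feeding it 8 bytes at Seq=92 and then 20 bytes at Seq=100, a single ACK=120 acknowledges both segments at once, and a later ACK=100 is ignored because it does not advance `send_base`.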

TCP: retransmission scenarios

lost ACK scenario (Host A and Host B, time running downward):
  A sends Seq=92, 8 bytes data
  B replies ACK=100, which is lost (X)
  Seq=92 timeout at A
  A resends Seq=92, 8 bytes data
  B replies ACK=100
  SendBase = 100

premature timeout (Host A and Host B, time running downward):
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100, then ACK=120
  Seq=92 timeout at A before the ACKs arrive
  A resends Seq=92, 8 bytes data
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  B replies ACK=120 to the retransmission; SendBase = 120

                    3 Transport Layer 72Comp 361 Spring 2005

TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline.
Cumulative ACK scenario: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B's ACK=100 is lost (X), but ACK=120 arrives before the timeout; SendBase = 120 and nothing is retransmitted.]

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

• Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed -> Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK
• Arrival of in-order segment with expected seq #; one other segment has ACK pending -> Immediately send single cumulative ACK, ACKing both in-order segments
• Arrival of out-of-order segment with higher-than-expected seq #; gap detected -> Immediately send duplicate ACK, indicating seq # of next expected byte
• Arrival of segment that partially or completely fills gap -> Immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of congestion control
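A minimal sketch of this policy follows. The function name and event labels are hypothetical, and the reset value assumes the usual TimeoutInterval = EstimatedRTT + 4 * DevRTT rule.

```python
def next_timeout_interval(current, event, estimated_rtt, dev_rtt):
    """Return the timeout interval to use after `event`.

    event is "timeout" (double the interval) or "new_data" / "ack"
    (reset from the RTT estimates). The labels are illustrative only.
    """
    if event == "timeout":
        return 2 * current
    if event in ("new_data", "ack"):
        # Assumed reset rule: EstimatedRTT + 4 * DevRTT.
        return estimated_rtt + 4 * dev_rtt
    raise ValueError(f"unknown event: {event}")


interval = 1.0
for _ in range(3):                      # three consecutive timeouts
    interval = next_timeout_interval(interval, "timeout", 0.1, 0.025)
# interval has now grown 1 -> 2 -> 4 -> 8 seconds
interval = next_timeout_interval(interval, "ack", 0.1, 0.025)
```

Backing off exponentially spaces out retransmissions when the network is likely congested, which is why the slide calls it a limited form of congestion control.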

Fast Retransmit

Time-out period often relatively long:
• long delay before resending lost packet

Detect lost segments via duplicate ACKs:
• Sender often sends many segments back-to-back
• If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
• fast retransmit: resend segment before timer expires

Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {   /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y   /* fast retransmit */
            }
        }
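The duplicate-ACK branch can be sketched in Python as below. The class and return labels are hypothetical; resetting the counter when a new ACK arrives is an assumption made here, since the slide only says the count is kept "for y".

```python
class FastRetransmitSender:
    """Toy ACK handler that triggers fast retransmit on the 3rd duplicate ACK."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0          # duplicate ACKs seen for the current send_base
        self.retransmitted = []    # sequence numbers resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance the window, restart the count.
            self.send_base = y
            self.dup_acks = 0
            return "new_ack"
        # Duplicate ACK for an already ACKed byte range.
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.retransmitted.append(y)   # resend segment starting at y
            return "fast_retransmit"
        return "dup_ack"


sender = FastRetransmitSender(send_base=100)
events = [sender.on_ack(100) for _ in range(3)]
```

Here the third duplicate ACK for byte 100 triggers the resend without waiting for the timer.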

TCP: GBN or Selective Repeat?

                    Basic TCP looks a lot like GBN

                    Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                    This looks a lot like Selective Repeat

                    TCP is a hybrid


TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down transmission rate to accommodate receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control (cont.)

• Flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• Receive side of TCP connection has a receive buffer
• Speed-matching service: matching the send rate to the receiving app's drain rate
• App process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, 32 bits wide]
• source port #, dest port #
• sequence number
• acknowledgement number
• head len, not used, flag bits U A P R S F, receive window
• checksum, urgent data pointer
• options (variable length)
• application data (variable length)

Notes on the fields:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection establishment (setup, teardown commands)
• Receive window: # bytes receiver is willing to accept
• Checksum: Internet checksum (as in UDP)
• Sequence and acknowledgement numbers: counting by bytes of data (not segments)
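To make the layout concrete, here is a small sketch that unpacks the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name and the returned dictionary keys are illustrative choices; options and checksum verification are not handled.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (network byte order)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    flags = offset_flags & 0x3F               # low 6 bits: U A P R S F
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_flags >> 12) * 4,  # head len is in 32-bit words
        "URG": bool(flags & 0x20),
        "ACK": bool(flags & 0x10),
        "PSH": bool(flags & 0x08),
        "RST": bool(flags & 0x04),
        "SYN": bool(flags & 0x02),
        "FIN": bool(flags & 0x01),
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


# Hand-built SYN segment header: ports 1234 -> 80, seq 92, window 4096.
raw = struct.pack("!HHIIHHHH", 1234, 80, 92, 0, (5 << 12) | 0x02, 4096, 0, 0)
header = parse_tcp_header(raw)
```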

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• Spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
• Receiver advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow, which guarantees the receive buffer doesn't overflow
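The window arithmetic can be sketched as follows. The function names are hypothetical, and the sender-side check assumes the sender tracks LastByteSent and LastByteAcked.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises to the sender, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    return (last_byte_sent + nbytes) - last_byte_acked <= advertised_window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok = sender_may_send(last_byte_sent=5000, last_byte_acked=4000,
                     advertised_window=window, nbytes=1000)
```

With a 4096-byte buffer holding 2000 unread bytes, the advertised window is 2096 bytes, so a sender with 1000 bytes already in flight may send 1000 more but not 1200.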

Technical Issue

• Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.

Note on UDP

UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
• initialize TCP variables:
  • seq #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
  Socket clientSocket = new Socket(hostname, port number)
• server: contacted by client
  Socket connectionSocket = welcomeSocket.accept()

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals that the client is synchronized

[Figure: three-way handshake timeline.
client -> server: connection request (SYN=1, seq=client_isn)
server -> client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)]
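The sequence and acknowledgement numbers exchanged in the three steps can be traced with a short sketch. The function and the dictionary representation of a segment are hypothetical; real sequence numbers are 32-bit, hence the modulo.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def three_way_handshake(client_isn, server_isn):
    """Return the three control segments of connection setup, in order."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {
        "SYN": 1,
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % SEQ_MODULUS,     # ACKs the client's SYN
    }
    ack = {
        "SYN": 0,
        "seq": (client_isn + 1) % SEQ_MODULUS,
        "ack": (synack["seq"] + 1) % SEQ_MODULUS,  # ACKs the server's SYN
    }
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
```

Each SYN consumes one sequence number, which is why both sides acknowledge isn+1 even though no application data has been sent.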

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close()

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN

[Figure: closing timeline. Client closes and sends FIN; server replies with ACK, closes and sends FIN; client replies with ACK and enters timed wait; connection closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
• Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
• Closes down after timed wait

Step 4: server receives ACK. Connection closed

Note: with small modification, can handle simultaneous FINs

[Figure: same closing timeline as before, with both sides marked "closing" after sending their FINs, the client in timed wait after its final ACK, and both sides finally "closed".]

TCP Connection Management (cont.)

[Figures: example TCP server lifecycle; example TCP client lifecycle]

                    A few special cases

                    Have not discussed what happens if both client and server decide to close down connection at same time

                    It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control
• manifestations:
  • lost packets (buffer overflow at routers)
  • long delays (queuing in router buffers)
• a top-10 problem

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

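Causes/costs of congestion: scenario 2 (cont.)

Here $\lambda_{in}$ is the original data rate and $\lambda'_{in}$ the offered load (original data plus retransmissions).

• (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
• (a) "magic" transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
• (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for given "goodput"
• (c): unneeded retransmissions; link carries multiple copies of packet

[Figure: three plots (a), (b), (c) of $\lambda_{out}$ against offered load]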

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3 (cont.)

Another "cost" of congestion:
• when a packet is dropped, any upstream transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next slide

Case study: ATM ABR congestion control (cont.)

Two-byte ER (explicit rate) field in RM cell:
• congested switch may lower ER value in cell
• sender's send rate is thus the minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
• signals congestion
• if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion
• w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control. Sender limits transmission using:

  LastByteSent - LastByteAcked <= CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events

TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window vs. time for a long-lived TCP connection; sawtooth pattern, y-axis marked at 8, 16 and 24 Kbytes]

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline; A sends one segment in the first RTT, two segments in the second, four segments in the third]
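The doubling can be reproduced with a few lines of Python. This is a sketch under simplifying assumptions: no loss, no threshold, and exactly one ACK per segment sent in each RTT. The function names are hypothetical.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Sending rate with CongWin = 1 MSS, in bits per second."""
    return mss_bytes * 8 / rtt_seconds


def slow_start_windows(mss_bytes, rounds):
    """CongWin (bytes) at the start of each RTT during loss-free slow start."""
    cong_win = mss_bytes
    history = [cong_win]
    for _ in range(rounds):
        acks = cong_win // mss_bytes   # one ACK per segment sent this RTT
        cong_win += acks * mss_bytes   # +1 MSS per ACK, i.e. doubling per RTT
        history.append(cong_win)
    return history


rate = initial_rate_bps(500, 0.2)      # the slide's example: about 20 kbps
windows = slow_start_windows(500, 3)   # 1, 2, 4, 8 segments of 500 bytes
```

Incrementing by one MSS per ACK is what turns a per-ACK rule into per-RTT doubling.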

So far:
• Slow start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce new variable: threshold
• threshold initially very large
• Slow-start exponential growth stops when CongWin reaches threshold, and TCP then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to slow start

Reason for treating 3 dup ACKs differently from a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)

                    The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being ACKed
Commentary: CongWin and Threshold not changed
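The table can be read as a small state machine. The sketch below follows it row by row; the class and method names are hypothetical, window sizes are in bytes, and details such as retransmission and window inflation during recovery are left out.

```python
class RenoCongestionControl:
    """Toy model of the sender actions in the table (TCP Reno style)."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss        # connection starts with 1 MSS
        self.threshold = threshold
        self.state = "SS"          # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            # Multiplicative decrease, but never below 1 MSS.
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


cc = RenoCongestionControl(mss=1000, threshold=4000)
for _ in range(4):
    cc.on_new_ack()     # 2000, 3000, 4000, then 5000 > Threshold: enter CA
```

A TCP Tahoe variant would call the timeout action for the triple duplicate ACK as well.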

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT

TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

  $\text{throughput} = \dfrac{1.22 \cdot MSS}{RTT \sqrt{L}}$

• This requires $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed
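The two numbers on this slide can be checked directly. The sketch below inverts the loss-rate formula for L; the function names are hypothetical and throughput is taken in bits per second.

```python
def required_window_segments(throughput_bps, rtt_seconds, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_seconds / (8 * mss_bytes)


def required_loss_rate(throughput_bps, rtt_seconds, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_seconds * throughput_bps)) ** 2


w = required_window_segments(10e9, 0.1, 1500)   # roughly 83,333 segments
loss = required_loss_rate(10e9, 0.1, 1500)      # roughly 2e-10
```

A loss rate of about 2 in 10 billion segments is far below what real paths deliver, which is the motivation the slide gives for high-speed TCP variants.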

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput vs. Connection 1 throughput, both axes from 0 to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", converging towards the equal-share line]

Fairness (more)

Fairness and UDP:
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections:
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2
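The parallel-connection example is simple arithmetic, assuming the bottleneck is shared equally per connection. The function name is hypothetical.

```python
def app_share(existing_connections, new_app_connections):
    """Fraction of link rate R obtained by an app, assuming equal
    per-connection shares on the bottleneck link."""
    total = existing_connections + new_app_connections
    return new_app_connections / total


one_connection = app_share(9, 1)       # 1/10 of R
eleven_connections = app_share(9, 11)  # 11/20 of R, slightly more than R/2
```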

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server, of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent, so the sender waits for the ACK after each window

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

[Figures: client/server timing diagrams for the first case and for the second case]
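Both cases fit in one small function. This is a sketch: the function name is hypothetical, and K is assumed here to be the number of windows needed to cover the object, ceil(O / (W*S)), which the slide uses without restating.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an object of O bits with a fixed window.

    S: segment size in bits, R: link rate in bits/sec, W: window in segments.
    """
    if W * S / R >= RTT + S / R:
        # Case 1: ACKs return before the window is exhausted; no stalls.
        return 2 * RTT + O / R
    # Case 2: the server stalls after each window except the last one.
    K = math.ceil(O / (W * S))             # assumed definition of K
    stall = S / R + RTT - W * S / R
    return 2 * RTT + O / R + (K - 1) * stall


no_stall = fixed_window_latency(O=8000, S=1000, R=1000, RTT=2.0, W=4)
with_stall = fixed_window_latency(O=8000, S=1000, R=1000, RTT=2.0, W=2)
```

With W = 2 the object needs four windows and the server idles for one second after each of the first three, adding 3 seconds to the 12-second no-stall latency.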

TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: client/server timing diagram: initiate TCP connection, request object (one RTT each), then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered; time at client on one axis, time at server on the other]

TCP Latency Modeling (3)

$\dfrac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\dfrac{S}{R}$ = time to transmit the kth window

$\left[\dfrac{S}{R} + RTT - 2^{k-1}\dfrac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: same client/server timing diagram as on the previous slide, with windows of S/R, 2S/R, 4S/R and 8S/R]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar
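Putting K, Q, P and the latency formula together gives the sketch below. The function name is hypothetical. Since the slide leaves the formula for Q unstated, Q is found here by counting the windows after which the idle time S/R + RTT - 2^(k-1) S/R is still positive.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for an object of O bits under slow start.

    S: segment size in bits, R: link rate in bits/sec, RTT in seconds.
    """
    # K: number of windows (1, 2, 4, ... segments) that cover the object.
    K = math.ceil(math.log2(O / S + 1))
    # Q: number of idle periods for an infinitely large object, i.e. the
    # number of windows k with S/R + RTT - 2**(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# Parameters chosen to match the example slide: O/S = 15, K = 4, Q = 2, P = 2.
K, Q, P, latency = slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0)
```

With S/R = 1 s and RTT = 2 s the server idles for 2 s after the first window and 1 s after the second, so the total latency is 4 + 15 + 3 = 22 seconds.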

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
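The three response-time formulas can be compared numerically. This sketch deliberately leaves out the "sum of idle times" term, so it gives only the transmission and round-trip parts; the function name and dictionary keys are hypothetical, and 5 Kbytes is taken as 40,000 bits.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, excluding slow-start idle times.

    O: object size in bits, R: link rate in bits/sec, M: number of images,
    X: number of parallel connections (M must be a multiple of X).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, on a 1 Mbps link.
times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
```

At 1 Mbps the transmission term is 0.44 s in all three cases, so the differences come entirely from the number of round trips.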

                    3 Transport Layer 127Comp 361 Spring 2005

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time. Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



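Connectionless demux (cont)

DatagramSocket serverSocket = new DatagramSocket(6428);

[Figure: clients IP: A and IP: B exchange UDP segments with server IP: C (processes P1, P3). Segments to the server carry SP: 9157, DP: 6428 and SP: 5775, DP: 6428; the replies carry SP: 6428, DP: 9157 and SP: 6428, DP: 5775.]

SP provides "return address"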


                      Connection-oriented demux

TCP socket identified by 4-tuple:
• source IP address
• source port number
• dest IP address
• dest port number

                      recv host uses all four values to direct segment to appropriate socket

                      Server host may support many simultaneous TCP sockets

                      each socket identified by its own 4-tuple

                      Web servers have different sockets for each connecting client

                      non-persistent HTTP will have different socket for each request
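The difference between the two demultiplexing rules can be sketched with two lookup tables. This is a toy model in Python, not a real socket API: the `Host` class, the socket labels, and `build_example_server` are hypothetical names, and listening (welcoming) sockets are left out. UDP sockets are keyed by (dest IP, dest port) only, while TCP sockets are keyed by the full 4-tuple.

```python
class Host:
    """Toy receiving host holding two demultiplexing tables."""

    def __init__(self):
        self.udp_sockets = {}  # (dest IP, dest port) -> socket label
        self.tcp_sockets = {}  # (src IP, src port, dest IP, dest port) -> label

    def demux_udp(self, src_ip, src_port, dst_ip, dst_port):
        # Source fields are ignored for delivery; they only serve
        # as the "return address" for a reply.
        return self.udp_sockets.get((dst_ip, dst_port))

    def demux_tcp(self, src_ip, src_port, dst_ip, dst_port):
        # All four values are needed to pick the socket.
        return self.tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))


def build_example_server():
    """Server C with one UDP socket on 6428 and three TCP connections on 80."""
    server = Host()
    server.udp_sockets[("C", 6428)] = "udp-6428"
    server.tcp_sockets[("A", 9157, "C", 80)] = "tcp-A-9157"
    server.tcp_sockets[("B", 9157, "C", 80)] = "tcp-B-9157"
    server.tcp_sockets[("B", 5775, "C", 80)] = "tcp-B-5775"
    return server


if __name__ == "__main__":
    s = build_example_server()
    print(s.demux_udp("A", 9157, "C", 6428), s.demux_udp("B", 5775, "C", 6428))
    print(s.demux_tcp("A", 9157, "C", 80), s.demux_tcp("B", 9157, "C", 80))
```

Two UDP datagrams from different clients land in the same socket, whereas two TCP segments with the same source port but different source IPs reach different sockets.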


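Connection-oriented demux (cont)

[Figure: clients IP: A and IP: B connected to server IP: C (processes P1, P3, P4). Segments to the server carry SP: 9157, DP: 80 and SP: 5775, DP: 80; the replies carry SP: 80, DP: 9157 and SP: 80, DP: 5775.]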


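Connection-oriented demux: Threaded Web Server

[Figure: clients IP: A and IP: B connected to a threaded Web server at IP: C (processes P1 to P4). Segments to the server: SP: 9157, DP: 80, S-IP: A, D-IP: C; SP: 9157, DP: 80, S-IP: B, D-IP: C; SP: 5775, DP: 80, S-IP: B, D-IP: C.]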



UDP: User Datagram Protocol [RFC 768]

• "no frills," "bare bones" Internet transport protocol
• "best effort" service; UDP segments may be:
  - lost
  - delivered out of order to app
• connectionless:
  - no handshaking between UDP sender, receiver
  - each UDP segment handled independently of others

Why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small segment header (8 bytes)
• no congestion control: UDP can blast away as fast as desired


UDP: more

• often used for streaming multimedia apps
  - loss tolerant
  - rate sensitive
• other UDP uses (why?):
  - DNS: small delay
  - SNMP: stressful conditions
• reliable transfer over UDP: add reliability at application layer
  - application-specific error recovery

UDP segment format (each row is 32 bits wide):

    source port #   | dest port #
    length          | checksum
    Application data (message)

length: length, in bytes, of UDP segment, including header


UDP checksum

Goal: detect "errors" (e.g., flipped bits) in transmitted segment

Sender:
• treat segment contents as sequence of 16-bit integers
• checksum: addition (1's complement sum) of segment contents
• sender puts checksum value into UDP checksum field

Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  - NO: error detected
  - YES: no error detected. But maybe errors nonetheless? More later ...
• Receiver may choose to discard segment or send a warning to app in case of error
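The sender and receiver steps can be sketched in a few lines of Python. This is a simplified illustration: it sums only the bytes it is given, whereas the real UDP checksum of RFC 768 also covers a pseudo-header, and the value stored in the field is the one's complement of the sum. The function names are assumptions of this sketch.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # wrap the carry around
    return total


def make_checksum(data: bytes) -> int:
    """Sender side: complement of the one's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def looks_ok(data: bytes, checksum: int) -> bool:
    """Receiver side: sum plus checksum must be all ones (0xFFFF)."""
    total = ones_complement_sum16(data) + checksum
    total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


if __name__ == "__main__":
    segment = b"\x12\x34\x56\x78"
    c = make_checksum(segment)
    print(hex(c), looks_ok(segment, c), looks_ok(b"\x12\x35\x56\x78", c))
```

The "maybe errors nonetheless" caveat is easy to reproduce: swapping two 16-bit words leaves the sum unchanged, so `looks_ok(b"\x56\x78\x12\x34", c)` still passes even though the data differs.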



Principles of Reliable data transfer

• important in app., transport, link layers
• top-10 list of important networking topics!
• characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)


Reliable data transfer: getting started

send side:
• rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

receive side:
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer


Reliable data transfer: getting started

We'll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
  - but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

FSM notation: an arrow from state 1 to state 2 is labeled with the event causing the state transition (above the line) and the actions taken on the state transition (below the line).

state: when in this "state", next state uniquely determined by next event


                      Incremental Improvements

rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission

rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK

rdt2.1: deals with corrupted ACKs/NAKs

rdt2.2: like rdt2.1 but does not need NAKs

rdt3.0: allows packets to be lost

Rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
  - no bit errors
  - no loss of packets
• separate FSMs for sender, receiver:
  - sender sends data into underlying channel
  - receiver reads data from underlying channel

sender (single state: Wait for call from above):
    event:   rdt_send(data)
    actions: packet = make_pkt(data); udt_send(packet)

receiver (single state: Wait for call from below):
    event:   rdt_rcv(packet)
    actions: extract(packet, data); deliver_data(data)


Rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
  - recall: UDP checksum to detect bit errors
• the question: how to recover from errors:
  - acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  - sender retransmits pkt on receipt of NAK
  - human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
  - error detection
  - receiver feedback: control msgs (ACK, NAK) rcvr->sender


rdt2.0: FSM specification

sender

  Wait for call from above:
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
    next:    Wait for ACK or NAK

  Wait for ACK or NAK:
    event:   rdt_rcv(rcvpkt) && isNAK(rcvpkt)
    actions: udt_send(sndpkt)
    next:    Wait for ACK or NAK

    event:   rdt_rcv(rcvpkt) && isACK(rcvpkt)
    actions: Λ
    next:    Wait for call from above

receiver

  Wait for call from below:
    event:   rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    actions: udt_send(NAK)

    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    actions: extract(rcvpkt, data); deliver_data(data); udt_send(ACK)


rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs with the error-free path highlighted: sender sends sndpkt; receiver finds rcvpkt not corrupt, delivers the data and sends ACK; sender receives ACK and returns to "Wait for call from above"]


rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs with the error path highlighted: receiver finds rcvpkt corrupt and sends NAK; sender receives NAK, retransmits sndpkt and stays in "Wait for ACK or NAK"]


rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
• retransmit, but this might cause retransmission of correctly received pkt!
• Receiver won't know about duplication!

Handling duplicates:
• sender adds sequence number (0/1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt
• Duplicate packet is one with same sequence # as previous packet

stop and wait: Sender sends one packet, then waits for receiver response


Sender: whenever sender receives control message it sends a packet to receiver
• A valid ACK: sends next packet (if exists) with new sequence #
• A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
• If received packet is corrupt: send NAK
• If received packet is valid and has different sequence # than prev packet: send ACK and deliver new data up
• If received packet is valid and has same sequence # as prev packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #


rdt2.1: sender, handles garbled ACK/NAKs

  Wait for call 0 from above:
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
    next:    Wait for ACK or NAK 0

  Wait for ACK or NAK 0:
    event:   rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    actions: udt_send(sndpkt)
    next:    Wait for ACK or NAK 0

    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    actions: Λ
    next:    Wait for call 1 from above

  Wait for call 1 from above:
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt)
    next:    Wait for ACK or NAK 1

  Wait for ACK or NAK 1:
    event:   rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    actions: udt_send(sndpkt)
    next:    Wait for ACK or NAK 1

    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    actions: Λ
    next:    Wait for call 0 from above


rdt2.1: receiver, handles garbled ACK/NAKs

  Wait for 0 from below:
    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    actions: extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    next:    Wait for 1 from below

    event:   rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    actions: sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
    next:    Wait for 0 from below

    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    actions: sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    next:    Wait for 0 from below

  Wait for 1 from below:
    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    actions: extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    next:    Wait for 0 from below

    event:   rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    actions: sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
    next:    Wait for 1 from below

    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    actions: sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    next:    Wait for 1 from below


rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  - state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is duplicate
  - state indicates whether 0 or 1 is expected pkt seq #
• note: receiver can not know if its last ACK/NAK received OK at sender


rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
  - receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt


rdt2.2: sender, receiver fragments

sender FSM fragment

  Wait for call 0 from above:
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
    next:    Wait for ACK 0

  Wait for ACK 0:
    event:   rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    actions: udt_send(sndpkt)
    next:    Wait for ACK 0

    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    actions: Λ

receiver FSM fragment

  Wait for 0 from below:
    event:   rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
    actions: udt_send(sndpkt)
    next:    Wait for 0 from below

  transition into Wait for 0 from below:
    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    actions: extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)


rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until certain data or ACK lost, then retransmits
• yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
• if pkt (or ACK) just delayed (not lost):
  - retransmission will be duplicate, but use of seq. #'s already handles this
  - receiver must specify seq # of pkt being ACKed
• requires countdown timer
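A minimal Python sketch of the timeout-driven approach follows. It makes several simplifying assumptions: losses are scripted by transmission index, a missing ACK is treated as an expired timer (no real clock is used), and bit corruption is left out. The names `Rdt30Receiver` and `rdt30_send` are hypothetical.

```python
class Rdt30Receiver:
    def __init__(self):
        self.expected = 0
        self.delivered = []

    def rdt_rcv(self, seq, data):
        if seq == self.expected:
            self.delivered.append(data)
            self.expected ^= 1
        # With no corruption modeled, the last packet received OK is the
        # one that just arrived, so the ACK carries its seq number.
        return seq


def rdt30_send(messages, receiver, lost_data=(), lost_acks=()):
    """Alternating-bit sender. Returns (transmissions, timeouts)."""
    tx = 0
    timeouts = 0
    seq = 0
    for data in messages:
        while True:
            attempt = tx
            tx += 1
            ack = None
            if attempt not in lost_data:       # data packet got through
                ack = receiver.rdt_rcv(seq, data)
                if attempt in lost_acks:       # ACK lost on the way back
                    ack = None
            if ack == seq:
                break                          # stop_timer, next packet
            timeouts += 1                      # timer expired: retransmit
        seq ^= 1
    return tx, timeouts


if __name__ == "__main__":
    r = Rdt30Receiver()
    print(rdt30_send(["a", "b"], r, lost_data={0}, lost_acks={2}), r.delivered)
```

A lost data packet and a lost ACK look identical to the sender: both end in a timeout and a retransmission, and only the receiver's sequence check tells them apart.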


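rdt3.0 sender

  Wait for call 0 from above:
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer
    next:    Wait for ACK0

    event:   rdt_rcv(rcvpkt)
    actions: Λ
    next:    Wait for call 0 from above

  Wait for ACK0:
    event:   rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    actions: Λ
    next:    Wait for ACK0

    event:   timeout
    actions: udt_send(sndpkt); start_timer
    next:    Wait for ACK0

    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    actions: stop_timer
    next:    Wait for call 1 from above

  Wait for call 1 from above:
    event:   rdt_send(data)
    actions: sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer
    next:    Wait for ACK1

    event:   rdt_rcv(rcvpkt)
    actions: Λ
    next:    Wait for call 1 from above

  Wait for ACK1:
    event:   rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    actions: Λ
    next:    Wait for ACK1

    event:   timeout
    actions: udt_send(sndpkt); start_timer
    next:    Wait for ACK1

    event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    actions: stop_timer
    next:    Wait for call 0 from above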


rdt3.0 in action


Performance of rdt3.0

• rdt3.0 works, but performance stinks
• example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

• U_sender: utilization, fraction of time sender busy sending
• 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
• network protocol limits use of physical resources!
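The arithmetic in this example can be reproduced with a few lines of Python. The sketch assumes 1 KB = 1000 bytes = 8000 bits and RTT = 2 × 15 ms = 30 ms, which matches the 8 microsec and 30.008 figures in the example; the function names are illustrative.

```python
def stop_and_wait_utilization(L_bits, R_bps, rtt_s):
    """U_sender = (L/R) / (RTT + L/R)."""
    t_transmit = L_bits / R_bps
    return t_transmit / (rtt_s + t_transmit)


def stop_and_wait_throughput_bytes(L_bits, R_bps, rtt_s):
    """One packet of L bits delivered every RTT + L/R seconds, in bytes/sec."""
    return (L_bits / 8) / (rtt_s + L_bits / R_bps)


if __name__ == "__main__":
    L, R, RTT = 8000, 1e9, 0.030
    print(L / R)                                   # transmission time in seconds
    print(stop_and_wait_utilization(L, R, RTT))    # roughly 0.00027
    print(stop_and_wait_throughput_bytes(L, R, RTT))  # roughly 33 kB/sec
```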

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram]
• first packet bit transmitted, t = 0
• last packet bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$


Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver


                      Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out of order data.

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight]
• first packet bit transmitted, t = 0
• last bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
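The factor-of-3 gain generalizes to N packets in flight, up to the point where the link is kept busy all the time. A small Python sketch, using the same example numbers (8000-bit packets, 1 Gbps, RTT = 30 ms); the cap at 1.0 and the function name are additions of this sketch rather than something stated on the slide:

```python
def pipelined_utilization(L_bits, R_bps, rtt_s, n_in_flight):
    """U_sender = N * (L/R) / (RTT + L/R), capped at 1.0 (link fully busy)."""
    t_transmit = L_bits / R_bps
    return min(1.0, n_in_flight * t_transmit / (rtt_s + t_transmit))


if __name__ == "__main__":
    for n in (1, 3, 100, 10_000):
        print(n, pipelined_utilization(8000, 1e9, 0.030, n))
```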


Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
  - may receive duplicate ACKs (see receiver)
• Only one timer, for oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• Called a sliding-window protocol


GBN: Sender

• rdt_send() called: checks to see if window is full
  - No: send out packet
  - Yes: return data to application level
• Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer
• Timeout: resends ALL packets that have been sent but not yet acknowledged
  - This is the only event that triggers resend
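The three sender events can be sketched as a small Python class. It is a simplified model: the timer is a boolean flag rather than a real clock, sequence numbers are unbounded integers starting at 1 (no k-bit wraparound), and `channel` is a hypothetical log standing in for udt_send. ACKs outside the current window are ignored, which is a choice made for this sketch.

```python
class GBNSender:
    def __init__(self, window_size):
        self.N = window_size
        self.base = 1            # oldest unacknowledged seq number
        self.nextseqnum = 1      # next seq number to use
        self.sndpkt = {}         # seq -> data, kept until ACKed
        self.timer_running = False
        self.channel = []        # log of (seq, data) handed to the channel

    def rdt_send(self, data):
        """Returns False (data refused) when the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False
        self.sndpkt[self.nextseqnum] = data
        self.channel.append((self.nextseqnum, data))
        if self.base == self.nextseqnum:
            self.timer_running = True       # start_timer for oldest pkt
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum arrived."""
        if not (self.base <= acknum < self.nextseqnum):
            return                           # duplicate or stray ACK
        for seq in range(self.base, acknum + 1):
            del self.sndpkt[seq]
        self.base = acknum + 1
        # stop the timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.channel.append((seq, self.sndpkt[seq]))


if __name__ == "__main__":
    s = GBNSender(window_size=4)
    print([s.rdt_send(d) for d in "abcde"])  # fifth send is refused
    s.on_ack(2)
    s.on_timeout()
    print(s.channel)
```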


GBN: sender extended FSM

single state: Wait

  initially:
    base = 1
    nextseqnum = 1

  event: rdt_send(data)
  actions:
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

  event: timeout
  actions:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  actions:
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

  event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  actions: Λ


GBN: receiver extended FSM

single state: Wait

  initially:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
  actions:
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

  event: default
  actions: udt_send(sndpkt)

If expected packet received:
• Send ACK and deliver packet upstairs

If out-of-order packet received:
• discard (don't buffer) -> no receiver buffering!
• Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  Need only remember expectedseqnum
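The sender and receiver FSMs above can be sketched in Python. This is only a minimal illustration under stated assumptions: the channel is modelled by direct method calls, loss is injected by skipping one packet, the timer is reduced to a boolean, and the class and function names (GBNSender, GBNReceiver, demo) are hypothetical rather than taken from the slides.

```python
class GBNReceiver:
    """Go-Back-N receiver: delivers in-order packets, discards the rest."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data):
        if seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # The ACK always carries the highest in-order seq # received so far,
        # so an out-of-order arrival produces a duplicate ACK.
        return self.expectedseqnum - 1


class GBNSender:
    """Go-Back-N sender with window size N and a single (boolean) timer."""

    def __init__(self, N):
        self.N = N
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}
        self.timer_running = False

    def rdt_send(self, data):
        """Return the packet to transmit, or None if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return None  # refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        if self.base == self.nextseqnum:
            self.timer_running = True  # start_timer
        self.nextseqnum += 1
        return pkt

    def on_ack(self, acknum):
        # Cumulative ACK. The max() guard against stale ACKs is an addition
        # of this sketch; the FSM simply assigns getacknum(rcvpkt)+1.
        self.base = max(self.base, acknum + 1)
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Return every sent-but-unacknowledged packet for retransmission."""
        self.timer_running = True
        return [self.sndpkt[s] for s in range(self.base, self.nextseqnum)]


def demo():
    sender, receiver = GBNSender(N=4), GBNReceiver()
    pkts = [sender.rdt_send(d) for d in "abcd"]
    # Packet 2 is "lost"; packets 3 and 4 arrive out of order and are discarded.
    for seq, data in pkts:
        if seq != 2:
            sender.on_ack(receiver.rdt_rcv(seq, data))
    # Timeout: the sender goes back and resends packets 2, 3 and 4.
    for seq, data in sender.on_timeout():
        sender.on_ack(receiver.rdt_rcv(seq, data))
    return sender, receiver
```

Calling demo() shows the cost discussed on the slides: a single lost packet causes three retransmissions even though two of the three had already reached the receiver.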

GBN in action

GBN is easy to code but might have performance problems
In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data
The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of required packets

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  compare to GBN, which only had a timer for the base packet
sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender
data from above:
  if next available seq # is in window, send pkt
timeout(n):
  resend pkt n, restart timer
ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)
otherwise:
  ignore
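The receiver rules above can be sketched as a small Python class. This is an illustrative sketch, not a full protocol: it assumes sequence numbers that never wrap around, and the names SRReceiver and on_pkt are hypothetical.

```python
class SRReceiver:
    """Selective Repeat receiver with window size N (no seq # wraparound)."""

    def __init__(self, N):
        self.N = N
        self.rcvbase = 0
        self.buffer = {}      # out-of-order packets, keyed by seq #
        self.delivered = []   # data passed to the upper layer, in order

    def on_pkt(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, if there is one,
            # and advance the window to the next not-yet-received packet.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n  # already delivered: re-ACK so the sender can move on
        return None


def sr_demo():
    r = SRReceiver(N=4)
    acks = [r.on_pkt(0, "a"), r.on_pkt(2, "c"), r.on_pkt(1, "b"),
            r.on_pkt(0, "a"), r.on_pkt(9, "x")]
    return r, acks
```

In sr_demo() packet 2 is buffered until packet 1 arrives, the repeated packet 0 is re-ACKed without being delivered twice, and packet 9 lies outside both ranges and is ignored.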


                      Selective repeat in action

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3
receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)
Q: what is the relationship between seq # size and window size?
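One way to explore the question is to compare the sender's window when every ACK is lost with the receiver's window after it has delivered a full window of packets. The sketch below assumes both windows start at seq # 0; the function name windows_overlap is hypothetical. It illustrates the standard result that Selective Repeat is safe only when the window size is at most half the sequence number space.

```python
def windows_overlap(seq_space, N):
    """True if a retransmitted old packet could be mistaken for a new one."""
    # Sender's view if all N ACKs are lost: it may retransmit seq #s 0..N-1.
    old = {i % seq_space for i in range(0, N)}
    # Receiver's view after delivering N packets: it expects the next N seq #s.
    new = {i % seq_space for i in range(N, 2 * N)}
    return bool(old & new)
```

With the slide's numbers, windows_overlap(4, 3) is True (the receiver expects 3, 0, 1 while the sender may resend 0, 1, 2), whereas windows_overlap(4, 2) is False.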


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point: one sender, one receiver
reliable, in-order byte stream: no "message boundaries"
pipelined: TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled: sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The maximum amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment
Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again, no payload)
  Client responds with third special segment. This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server,
  (i) buffers,
  (ii) variables, and
  (iii) a socket connection to the process
TCP exists only in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the client and server

TCP segment structure

Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
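As a rough illustration of the layout above, the fixed 20-byte part of a TCP header can be unpacked with Python's standard struct module. The function name parse_tcp_header and the dictionary keys are hypothetical; options and the checksum computation are deliberately left out of this sketch.

```python
import struct

# Bit positions of the six flags within the 16-bit "head len / flags" word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (network byte order)."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        # The head len field counts 32-bit words, so multiply by 4 for bytes.
        "header_len": (off_flags >> 12) * 4,
        "flags": {name for name, bit in FLAG_BITS.items() if off_flags & bit},
        "rcv_window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }
```

For example, a header packed with head len 5 and flag bits 0x12 parses as a 20-byte header with the SYN and ACK flags set, which is the shape of the second handshake segment.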

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (Host A is the client, Host B the server; time runs downward):
  User types 'C'
  Host A -> Host B: Seq=42, ACK=79, data = 'C'
  Host B ACKs receipt of 'C', echoes back 'C'
  Host B -> Host A: Seq=79, ACK=43, data = 'C'
  Host A ACKs receipt of echoed 'C'
  Host A -> Host B: Seq=43, ACK=80

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT, but RTT varies
  too short: premature timeout, unnecessary retransmissions
  too long: slow reaction to segment loss
Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
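To see why the influence of past samples decays exponentially, the recursion can be unrolled. The indexed notation below is introduced only for this derivation: E_k stands for EstimatedRTT after the k-th sample, S_k for the k-th SampleRTT, and E_0 for the initial estimate.

```latex
\begin{align*}
E_k &= (1-\alpha)\,E_{k-1} + \alpha\,S_k \\
    &= (1-\alpha)^2\,E_{k-2} + \alpha(1-\alpha)\,S_{k-1} + \alpha\,S_k \\
    &= (1-\alpha)^k\,E_0 + \alpha \sum_{i=0}^{k-1} (1-\alpha)^i\,S_{k-i}
\end{align*}
```

A sample that is i measurements old carries weight α(1-α)^i. With α = 0.125 the newest sample has weight 0.125, while a sample eight measurements old has weight 0.125 * 0.875^8, roughly 0.043.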

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT and EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|
  (typically, β = 0.25)

Then set timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
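The three formulas can be combined into a small estimator. This is a sketch under assumptions the slides do not state: the first sample initializes EstimatedRTT, DevRTT starts at 0, and DevRTT is updated before EstimatedRTT (the order used in RFC 6298). The class name RTTEstimator is hypothetical.

```python
class RTTEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample  # assumption: seed with first sample
        self.dev_rtt = 0.0                 # assumption: no deviation seen yet

    def update(self, sample_rtt):
        """Fold in one SampleRTT (never one taken from a retransmission)."""
        # Assumption: deviation is measured against the previous estimate.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from a 100 ms estimate, a single 120 ms sample moves EstimatedRTT to 102.5 ms and DevRTT to 5 ms, giving a TimeoutInterval of 122.5 ms.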


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer
Retransmissions are triggered by:
  timeout events
  duplicate acks
Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
Ack rcvd:
  If it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
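The simplified sender can be sketched in Python as three event handlers. This is an illustration under assumptions: the timer is a boolean, "pass segment to IP" is modelled by appending to a list, and the names SimpleTCPSender, sent and unacked are hypothetical.

```python
class SimpleTCPSender:
    """Simplified TCP sender: one timer, cumulative ACKs, no dup-ACK logic."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq # -> data, not yet acknowledged
        self.timer_running = False
        self.sent = []              # log of (seq #, data) handed to IP

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True          # start timer
        self.sent.append((seq, data))          # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            return
        seq = min(self.unacked)                # smallest unacked seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True              # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose bytes all lie below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Keep the timer only if segments are still outstanding.
            self.timer_running = bool(self.unacked)
```

Feeding it 8 bytes and then 20 bytes starting at seq # 92, followed by a single ACK of 120, moves SendBase straight to 120 without any retransmission, because the ACK is cumulative.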

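TCP: retransmission scenarios

Lost ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Seq=92 timeout at Host A
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host B -> Host A: ACK=100
  SendBase = 100

Premature timeout scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100
  Host B -> Host A: ACK=120
  Seq=92 timeout at Host A, before the ACKs arrive
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  SendBase = 100, then SendBase = 120 as the two ACKs arrive
  Host B -> Host A: ACK=120 (for the retransmitted segment)
  SendBase = 120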

TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Host B -> Host A: ACK=120 (arrives before the timeout)
  SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
  -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK
Arrival of in-order segment with expected seq #. One other segment has ACK pending
  -> Immediately send single cumulative ACK, ACKing both in-order segments
Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
  -> Immediately send duplicate ACK, indicating seq # of next expected byte
Arrival of segment that partially or completely fills gap
  -> Immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
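The ACK handler above can be sketched as a small Python class that reports when a fast retransmit should fire. Timer handling is omitted, and the names FastRetransmitSender and on_ack are hypothetical.

```python
from collections import Counter


class FastRetransmitSender:
    """Tracks SendBase and duplicate-ACK counts; timer logic is left out."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()   # ACK value y -> number of duplicates seen

    def on_ack(self, y):
        """Return the seq # to fast-retransmit, or None."""
        if y > self.send_base:
            self.send_base = y      # new data acknowledged
            return None
        # Duplicate ACK for an already ACKed segment.
        self.dup_acks[y] += 1
        if self.dup_acks[y] == 3:
            return y                # resend the segment starting at byte y
        return None
```

After one new ACK of 100, three further ACKs of 100 trigger a retransmission of the segment with sequence number 100 on the third duplicate; a fourth duplicate does not trigger another.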

TCP: GBN or Selective Repeat?

                      Basic TCP looks a lot like GBN

                      Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                      This looks a lot like Selective Repeat

                      TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network
  (But both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)
spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
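The bookkeeping above amounts to two small calculations, sketched here in Python. The function names and the byte counts used in the example are illustrative assumptions, not values from the slides.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    return (last_byte_sent + nbytes) - last_byte_acked <= advertised_window
```

For instance, with a 4096-byte buffer, 3000 bytes received and 1000 bytes read by the application, rcv_window returns 2096. A sender with 1000 unACKed bytes may then send at most 1096 more.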

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0 since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender

Note on UDP

UDP has no flow control
UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket("hostname", "port number");
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:
Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  Allocates buffers
  Can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that it is synchronized

client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
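The sequence and acknowledgement numbers of the three segments can be written out as a short Python sketch. Segments are modelled as plain dictionaries, and the function name three_way_handshake and the ISN values in the example are hypothetical; in a real system the operating system performs this exchange when the socket connects.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries of header fields."""
    # Step 1: client requests a connection.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server grants it, acknowledging the client's ISN.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client acknowledges the server's ISN; SYN is now 0.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]
```

With client_isn = 1000 and server_isn = 5000, the third segment carries seq=1001 and ack=5001.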

TCP Connection Management (cont)

Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN

[Figure: client/server timeline: client close, FIN; server ACK, close, FIN; client ACK, timed wait; closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait
Step 4: server receives ACK. Connection closed
Note: with small modification, can handle simultaneous FINs

[Figure: client/server timeline: FIN, ACK, FIN, ACK; both sides closing; client timed wait; both sides closed]

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]
[Figure: example TCP client lifecycle]


                      A few special cases

                      Have not discussed what happens if both client and server decide to close down connection at same time

                      It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2
large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(λin: original data; λ'in: original data plus retransmitted data; λout: goodput)
(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit
Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP
Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate
RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path
EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion
w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT Bytes/sec
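As a quick check of the formula, here is a short Python sketch. The function names are hypothetical and the numbers in the example (one 500-byte segment per 200 ms RTT) are illustrative.

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_sec):
    """Throughput when w segments of MSS bytes are sent per RTT."""
    return w * mss_bytes / rtt_sec


def to_kbps(bytes_per_sec):
    """Convert bytes per second to kilobits per second (1 kbit = 1000 bits)."""
    return bytes_per_sec * 8 / 1000
```

With w = 1, MSS = 500 bytes and RTT = 0.2 s the result is 2500 bytes/sec, or 20 kbps. Doubling w doubles the throughput as long as the RTT stays the same.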

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow
Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate acks
  TCP sender reduces rate (CongWin) after loss event
three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing; also known as congestion avoidance

[Figure: congestion window of a long-lived TCP connection vs. time; y-axis ticks at 8, 16 and 24 Kbytes]


                      TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes & RTT = 200 msec
• initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
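
The 20 kbps figure in the example follows from sending one MSS per RTT. A minimal sketch of that arithmetic (the function name is illustrative, not from the slides):

```python
def initial_rate_bps(mss_bytes: int, rtt_seconds: float) -> float:
    """Rate with a window of one segment: one MSS (in bits) per RTT."""
    return mss_bytes * 8 / rtt_seconds


# Slide example: MSS = 500 bytes, RTT = 200 msec
rate = initial_rate_bps(mss_bytes=500, rtt_seconds=0.2)
print(f"{rate / 1000:.0f} kbps")
```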


                      TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments one RTT later, then four segments]
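
To see why incrementing CongWin once per ACK doubles it every RTT, here is a small sketch. It assumes every segment sent in an RTT is ACKed within that RTT and that there is no loss and no threshold; the function name is hypothetical.

```python
def slow_start_windows(rtts: int, mss: int = 1) -> list[int]:
    """Return CongWin (in units of `mss`) at the start of each RTT."""
    congwin = mss
    history = [congwin]
    for _ in range(rtts):
        acks = congwin // mss      # one ACK per segment sent in this RTT
        for _ in range(acks):
            congwin += mss         # per-ACK increment
        history.append(congwin)
    return history


print(slow_start_windows(3))  # window sizes over three RTTs
```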


So far:
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce new variable, threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


Reason for treating 3 dup ACKs differently from a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly
• When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold (only in TCP Reno)
• When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                      The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
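
The event table can be read as a small state machine. The sketch below is one possible rendering, not a real TCP implementation: the class name, the MSS value, and the initial Threshold of 64 MSS (standing in for "initially very large") are all assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1000  # bytes; illustrative value, not from the slides


@dataclass
class RenoSender:
    congwin: float = MSS
    threshold: float = 64 * MSS   # assumed stand-in for "initially very large"
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total
            self.congwin += MSS * (MSS / self.congwin)

    def on_triple_dup_ack(self) -> None:
        """Loss detected by triple duplicate ACK (Reno fast recovery)."""
        self.threshold = self.congwin / 2
        self.congwin = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"

    def on_timeout(self) -> None:
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
```

A Tahoe-style sender, as described on the earlier slide, would call `on_timeout()` for both kinds of loss event.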


                      TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT


                      TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This gives L = 2·10^-10. Wow!
• New versions of TCP for high-speed needed
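
Both numbers on this slide can be checked with a few lines of arithmetic. The sketch assumes the loss-rate approximation throughput = 1.22 · MSS / (RTT · sqrt(L)) and solves it for L; the function names are illustrative.

```python
def required_window_segments(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def tolerable_loss_rate(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


W = required_window_segments(10e9, 0.100, 1500)
L = tolerable_loss_rate(10e9, 0.100, 1500)
print(int(W), f"{L:.1e}")
```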


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput vs. Connection 1 throughput, both axes up to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
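
The convergence argument can be illustrated with a toy simulation. It assumes two synchronized flows that both see a loss whenever their combined rate exceeds the link capacity, which is a simplification of real TCP behaviour; the function name and starting rates are made up.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, steps: int) -> tuple[float, float]:
    """Toy AIMD model: both flows add 1 per step, both halve on overload."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2      # multiplicative decrease halves the gap
        else:
            x1, x2 = x1 + 1, x2 + 1      # additive increase keeps the gap unchanged
    return x1, x2


x1, x2 = aimd_two_flows(10.0, 80.0, capacity=100.0, steps=1000)
print(abs(x1 - x2))  # gap between the two flows after many rounds
```

Each loss halves the difference between the two rates while additive increase leaves it unchanged, so the pair drifts toward the equal-share line.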


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets R/2
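
The parallel-connection example is a per-connection equal-share calculation. A quick sketch, assuming every connection on the link gets the same rate (the function name is hypothetical):

```python
def app_share(existing_connections: int, new_connections: int) -> float:
    """Fraction of link rate R obtained by an app opening `new_connections`."""
    return new_connections / (existing_connections + new_connections)


print(app_share(9, 1))    # fraction of R with one connection
print(app_share(9, 11))   # fraction of R with eleven connections
```

With 11 connections the exact share is 11/20 of R, which the slide rounds to R/2.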


TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. W·S/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2 RTT + O/R

2. W·S/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2 RTT + O/R + (K-1)[S/R + RTT - W·S/R]
   where K is the number of windows that cover the object


                      Fixed congestion window (1)

First case: W·S/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2 RTT + O/R


                      Fixed congestion window (2)

Second case: W·S/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2 RTT + O/R + (K-1)[S/R + RTT - W·S/R]
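
The two fixed-window cases can be folded into one function. This is a sketch under the slide's assumptions (no loss, one link of rate R); it additionally assumes K = ceil(O / (W·S)), i.e. the number of full-or-partial windows needed to cover the object, and the numeric inputs below are made up.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for an object of O bits with a fixed window of W segments of S bits."""
    base = 2 * RTT + O / R
    stall = S / R + RTT - W * S / R      # server idle time after each window
    if stall <= 0:
        return base                      # case 1: ACKs return before the window is sent
    K = math.ceil(O / (W * S))           # assumed: windows needed to cover the object
    return base + (K - 1) * stall        # case 2: server stalls between windows


# Hypothetical numbers: S/R = 1 s, RTT = 2 s, object of 16 segments
print(fixed_window_latency(O=16000, S=1000, R=1000, RTT=2, W=2))
print(fixed_window_latency(O=16000, S=1000, R=1000, RTT=2, W=4))
```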


                      TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timing diagram with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]


                      TCP Latency Modeling (3)

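$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timing diagram with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]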


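TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar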
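
As a cross-check on the slow-start latency model, the idle times can simply be summed window by window instead of using the closed form. The sketch below does that; the function names and the numeric inputs (chosen so that O/S = 15 and Q = 2, matching the earlier example) are illustrative assumptions.

```python
def windows_to_cover(O: float, S: float) -> int:
    """K: smallest k with (2^k - 1) * S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def slow_start_latency(O: float, S: float, R: float, RTT: float) -> tuple[float, int, int]:
    """Return (latency, K, P) by summing positive idle times after windows 1..K-1."""
    K = windows_to_cover(O, S)
    idle_total = 0.0
    P = 0
    for k in range(1, K):                 # no idle period after the last window
        idle = S / R + RTT - 2 ** (k - 1) * S / R
        if idle > 0:
            idle_total += idle
            P += 1
    return 2 * RTT + O / R + idle_total, K, P


# Hypothetical values: S/R = 1 s, RTT = 2 s, O/S = 15 segments
latency, K, P = slow_start_latency(O=15000, S=1000, R=1000, RTT=2)
closed_form = 2 * 2 + 15 + P * (2 + 1) - (2 ** P - 1) * 1
print(latency, K, P, closed_form)
```

For these inputs the summed result and the closed-form expression agree, with K = 4 and P = 2.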


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
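
The three response-time expressions can be compared numerically. This sketch deliberately drops the "sum of idle times" term, so it gives only the transmission and RTT parts. The parameter values are taken from the response-time chart slide (RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5) with a 1 Mbps link, and 1 Kbyte is assumed to be 1000 bytes.

```python
def http_response_times(O_bits: float, R_bps: float, RTT_s: float, M: int, X: int) -> dict[str, float]:
    """Response times for the three HTTP variants, ignoring slow-start idle times."""
    transmit = (M + 1) * O_bits / R_bps
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT_s,
        "persistent": transmit + 3 * RTT_s,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT_s,
    }


times = http_response_times(O_bits=5 * 1000 * 8, R_bps=1e6, RTT_s=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```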


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time

Persistent connections only give minor improvement over parallel connections


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.


Chapter 3 Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



                        Connection-oriented demux

TCP socket identified by 4-tuple:
• source IP address
• source port number
• dest IP address
• dest port number

Receiving host uses all four values to direct segment to appropriate socket

Server host may support many simultaneous TCP sockets:
• each socket identified by its own 4-tuple

Web servers have different sockets for each connecting client
• non-persistent HTTP will have different socket for each request
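
Demultiplexing on the full 4-tuple amounts to a table lookup. The sketch below models that table as a dictionary; the host names "A", "B", "C" and the port numbers echo the slide figures, while the socket labels and helper names are hypothetical. A real stack performs this lookup inside the kernel.

```python
from typing import NamedTuple, Optional


class FourTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Server-side table: one entry (socket) per connected client 4-tuple
sockets: dict[FourTuple, str] = {
    FourTuple("A", 9157, "C", 80): "socket-1",
    FourTuple("B", 9157, "C", 80): "socket-2",   # same ports, different source IP
    FourTuple("B", 5775, "C", 80): "socket-3",
}


def demux(segment: FourTuple) -> Optional[str]:
    """Return the socket that owns this segment, or None if no socket matches."""
    return sockets.get(segment)


print(demux(FourTuple("B", 9157, "C", 80)))
```

Two segments with the same source and destination ports still reach different sockets when their source IP addresses differ.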


Connection-oriented demux (cont)

[Figure: clients at IP A and IP B and a server at IP C, with processes P1, P3, P4; segments labelled SP 9157 / DP 80, SP 80 / DP 9157, SP 5775 / DP 80, and SP 80 / DP 5775]


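Connection-oriented demux: Threaded Web Server

[Figure: clients at IP A and IP B and a threaded Web server at IP C, with processes P1, P2, P3, P4; segments labelled SP 9157 / DP 80 (S-IP A, D-IP C), SP 9157 / DP 80 (S-IP B, D-IP C), and SP 5775 / DP 80 (S-IP B, D-IP C)]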


UDP: User Datagram Protocol [RFC 768]

• "no frills," "bare bones" Internet transport protocol
• "best effort" service; UDP segments may be:
  • lost
  • delivered out of order to app
• connectionless:
  • no handshaking between UDP sender, receiver
  • each UDP segment handled independently of others

Why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small segment header (8 bytes)
• no congestion control: UDP can blast away as fast as desired


UDP: more

• often used for streaming multimedia apps
  • loss tolerant
  • rate sensitive
• other UDP uses (why?):
  • DNS: small delay
  • SNMP: stressful conditions
• reliable transfer over UDP: add reliability at application layer
  • application-specific error recovery

UDP segment format (32 bits wide):
• source port #, dest port #
• length, checksum (length is in bytes of UDP segment, including header)
• application data (message)
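
The 8-byte header layout can be made concrete with Python's `struct` module. This is only a sketch of the field layout: the function name is illustrative, and the checksum is left as a caller-supplied value rather than computed.

```python
import struct


def build_udp_segment(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Pack source port, dest port, length, checksum (16 bits each, network order)."""
    length = 8 + len(payload)            # length covers header plus data, in bytes
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


segment = build_udp_segment(9157, 80, b"hello")
print(len(segment), struct.unpack("!HHHH", segment[:8]))
```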


UDP checksum

Goal: detect "errors" (e.g., flipped bits) in transmitted segment

Sender:
• treat segment contents as sequence of 16-bit integers
• checksum: addition (1's complement sum) of segment contents
• sender puts checksum value into UDP checksum field

Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  • NO: error detected
  • YES: no error detected. But maybe errors nonetheless? More later
• Receiver may choose to discard segment or send a warning to app in case of error
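
A minimal sketch of the 16-bit one's complement checksum follows. It covers only the bytes passed in; the real UDP checksum also includes a pseudo-header, which is omitted here. As in the Internet checksum, the value placed in the field is the complement of the one's complement sum. Function names are illustrative.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Sum 16-bit big-endian words, wrapping any carry back into the low bits."""
    if len(data) % 2:
        data += b"\x00"                               # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)      # end-around carry
    return total


def checksum16(data: bytes) -> int:
    """Sender side: complement of the one's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def verify(data: bytes, checksum: int) -> bool:
    """Receiver side: sum of data plus checksum must be all ones."""
    total = ones_complement_sum16(data) + checksum
    total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
c = checksum16(payload)
print(hex(c), verify(payload, c))
```

A segment with offsetting errors in two words can still pass this check, which is the "maybe errors nonetheless" caveat.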


Principles of Reliable data transfer

• important in app., transport, link layers
• top-10 list of important networking topics!
• characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)


Reliable data transfer: getting started

Send side:
• rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt to transfer packet over unreliable channel to receiver

Receive side:
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer


Reliable data transfer: getting started

We'll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
  • but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

[Figure: FSM notation. An arrow from state 1 to state 2 is labelled with the event causing the state transition and the actions taken on the state transition]

state: when in this "state," next state uniquely determined by next event


                        Incremental Improvements

• rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission
• rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK
• rdt2.1: deals with corrupted ACKs/NAKs
• rdt2.2: like rdt2.1 but does not need NAKs
• rdt3.0: allows packets to be lost

Rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
  • no bit errors
  • no loss of packets
• separate FSMs for sender, receiver:
  • sender sends data into underlying channel
  • receiver reads data from underlying channel

sender FSM:
  state: Wait for call from above
  event: rdt_send(data)
  actions: packet = make_pkt(data); udt_send(packet)

receiver FSM:
  state: Wait for call from below
  event: rdt_rcv(packet)
  actions: extract(packet, data); deliver_data(data)


Rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
  • recall: UDP checksum to detect bit errors
• the question: how to recover from errors?
  • acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  • sender retransmits pkt on receipt of NAK
  • human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
  • error detection
  • receiver feedback: control msgs (ACK, NAK) rcvr->sender


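rdt2.0: FSM specification

sender:
  state "Wait for call from above"
    event: rdt_send(data)
    actions: sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
    next state: "Wait for ACK or NAK"
  state "Wait for ACK or NAK"
    event: rdt_rcv(rcvpkt) && isNAK(rcvpkt)
    actions: udt_send(sndpkt)
    next state: "Wait for ACK or NAK"
    event: rdt_rcv(rcvpkt) && isACK(rcvpkt)
    actions: Λ (none)
    next state: "Wait for call from above"

receiver:
  state "Wait for call from below"
    event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    actions: udt_send(NAK)
    event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    actions: extract(rcvpkt, data); deliver_data(data); udt_send(ACK)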


rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs from the FSM specification slide, illustrating operation with no errors: rdt_send(data) at the sender, rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) at the receiver, then rdt_rcv(rcvpkt) && isACK(rcvpkt) at the sender]


                        rdt2.0: error scenario

                        [Figure: the rdt2.0 sender and receiver FSMs, with the error path highlighted: the receiver gets a corrupt packet and sends NAK, the sender retransmits on rdt_rcv(rcvpkt) && isNAK(rcvpkt), and the retransmitted packet is then delivered and ACKed.]


                        rdt2.0 has a fatal flaw!

                        What happens if ACK/NAK is corrupted?
                          sender doesn't know what happened at receiver!
                          can't just retransmit: possible duplicate. But receiver is waiting!

                        What to do?
                          sender ACKs/NAKs receiver's ACK/NAK? What if sender's ACK/NAK is corrupted?
                          retransmit, but this might cause retransmission of a correctly received pkt
                          receiver won't know about the duplication

                        Handling duplicates:
                          sender adds sequence number (0, 1) to each pkt
                          sender retransmits current pkt if ACK/NAK garbled
                          receiver discards (doesn't deliver up) duplicate pkt
                          a duplicate packet is one with the same sequence # as the previous packet

                        stop and wait
                          sender sends one packet, then waits for receiver response


                        Sender: whenever the sender receives a control message, it sends a packet to the receiver
                          a valid ACK: sends next packet (if one exists) with new sequence #
                          a NAK or corrupt response: resends old packet

                        Receiver: sends ACK/NAK to sender
                          if received packet is corrupt: send NAK
                          if received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
                          if received packet is valid and has the same sequence # as the previous packet (i.e., it is a retransmission, a duplicate): send ACK

                        Note: ACK/NAK do not contain a sequence #
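The sender and receiver rules above can be sketched in a few lines of Python. This is only an illustration, not the course's code: the `Packet` class, the `corrupt` flag (standing in for a failed checksum test), and the class and method names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    seq: int               # 1-bit sequence number: 0 or 1
    data: str
    corrupt: bool = False  # hypothetical stand-in for a failed checksum


class Rdt21Receiver:
    """Receiver side: NAK corrupt packets, ACK valid ones, drop duplicates."""

    def __init__(self) -> None:
        self.expected = 0
        self.delivered: List[str] = []

    def receive(self, pkt: Packet) -> str:
        if pkt.corrupt:
            return "NAK"
        if pkt.seq == self.expected:
            # New data: deliver it up and flip the expected sequence bit.
            self.delivered.append(pkt.data)
            self.expected ^= 1
        # A valid packet is ACKed whether it is new or a duplicate.
        return "ACK"


class Rdt21Sender:
    """Sender side: resend the current packet until a clean ACK arrives."""

    def __init__(self, messages) -> None:
        self.queue = list(messages)
        self.seq = 0
        self.current: Optional[Packet] = None

    def next_packet(self) -> Optional[Packet]:
        if self.current is None and self.queue:
            self.current = Packet(self.seq, self.queue.pop(0))
        return self.current

    def on_response(self, response: str, corrupt: bool = False) -> None:
        if response == "ACK" and not corrupt:
            self.current = None   # move on to the next packet
            self.seq ^= 1
        # NAK or garbled response: keep self.current, so it is resent.


receiver = Rdt21Receiver()
replies = [
    receiver.receive(Packet(0, "a")),                # new packet
    receiver.receive(Packet(0, "a")),                # duplicate (its ACK was garbled)
    receiver.receive(Packet(1, "b", corrupt=True)),  # bit errors
    receiver.receive(Packet(1, "b")),                # clean retransmission
]
```

In this run the duplicate of packet 0 is ACKed but not delivered a second time, which is the behaviour the 1-bit sequence number is there to provide.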


                        rdt2.1: sender, handles garbled ACK/NAKs

                        State "Wait for call 0 from above":
                          rdt_send(data)
                            -> sndpkt = make_pkt(0, data, checksum)
                               udt_send(sndpkt)
                               go to "Wait for ACK or NAK 0"

                        State "Wait for ACK or NAK 0":
                          rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
                            -> udt_send(sndpkt)
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
                            -> Λ (no action); go to "Wait for call 1 from above"

                        State "Wait for call 1 from above":
                          rdt_send(data)
                            -> sndpkt = make_pkt(1, data, checksum)
                               udt_send(sndpkt)
                               go to "Wait for ACK or NAK 1"

                        State "Wait for ACK or NAK 1":
                          rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
                            -> udt_send(sndpkt)
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
                            -> Λ (no action); go to "Wait for call 0 from above"


                        rdt2.1: receiver, handles garbled ACK/NAKs

                        State "Wait for 0 from below":
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
                            -> extract(rcvpkt, data)
                               deliver_data(data)
                               sndpkt = make_pkt(ACK, chksum)
                               udt_send(sndpkt)
                               go to "Wait for 1 from below"
                          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                            -> sndpkt = make_pkt(NAK, chksum)
                               udt_send(sndpkt)
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                            -> sndpkt = make_pkt(ACK, chksum)
                               udt_send(sndpkt)

                        State "Wait for 1 from below":
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                            -> extract(rcvpkt, data)
                               deliver_data(data)
                               sndpkt = make_pkt(ACK, chksum)
                               udt_send(sndpkt)
                               go to "Wait for 0 from below"
                          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                            -> sndpkt = make_pkt(NAK, chksum)
                               udt_send(sndpkt)
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
                            -> sndpkt = make_pkt(ACK, chksum)
                               udt_send(sndpkt)


                        rdt2.1: discussion

                        Sender:
                          seq # added to pkt
                          two seq #'s (0, 1) will suffice. Why?
                          must check if received ACK/NAK corrupted
                          twice as many states
                            state must "remember" whether "current" pkt has 0 or 1 seq #

                        Receiver:
                          must check if received packet is duplicate
                            state indicates whether 0 or 1 is expected pkt seq #
                          note: receiver cannot know if its last ACK/NAK was received OK at sender


                        rdt2.2: a NAK-free protocol

                        same functionality as rdt2.1, using ACKs only
                        instead of NAK, receiver sends ACK for last pkt received OK
                          receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
                        duplicate ACK at sender results in same action as NAK: retransmit current pkt


                        rdt2.2: sender, receiver fragments

                        sender FSM fragment
                          State "Wait for call 0 from above":
                            rdt_send(data)
                              -> sndpkt = make_pkt(0, data, checksum)
                                 udt_send(sndpkt)
                                 go to "Wait for ACK 0"
                          State "Wait for ACK 0":
                            rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
                              -> udt_send(sndpkt)
                            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
                              -> Λ (no action)

                        receiver FSM fragment
                          Transition into "Wait for 0 from below":
                            rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                              -> extract(rcvpkt, data)
                                 deliver_data(data)
                                 sndpkt = make_pkt(ACK1, chksum)
                                 udt_send(sndpkt)
                          State "Wait for 0 from below":
                            rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
                              -> udt_send(sndpkt)


                        rdt3.0: channels with errors and loss

                        New assumption: underlying channel can also lose packets (data or ACKs)
                          checksum, seq #, ACKs, retransmissions will be of help, but not enough

                        Q: how to deal with loss?
                          sender waits until it is certain that data or ACK was lost, then retransmits
                          yuck: drawbacks?

                        Approach: sender waits a "reasonable" amount of time for ACK
                          retransmits if no ACK received in this time (retransmissions are only triggered by timeouts)
                          if pkt (or ACK) is just delayed (not lost):
                            retransmission will be a duplicate, but use of seq #'s already handles this
                            receiver must specify seq # of pkt being ACKed
                          requires countdown timer
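The timeout-and-retransmit approach can be traced with a small deterministic simulation. The sketch below is an assumption-laden illustration: the function name `stop_and_wait`, the idea of scripting losses by transmission index, and the simplification that a timeout is modelled as "no ACK came back for this transmission" are all invented for the example (there are no real timers, bit errors, or delays).

```python
def stop_and_wait(messages, lose_data=(), lose_ack=()):
    """Simulate a stop-and-wait sender and receiver over a lossy channel.

    lose_data / lose_ack hold the 0-based indices of the data and ACK
    transmissions that the (hypothetical) channel drops.
    Returns (delivered_messages, number_of_data_transmissions).
    """
    delivered = []
    expected = 0   # receiver: sequence bit it is waiting for
    seq = 0        # sender: sequence bit of the current packet
    sent = 0       # data transmissions so far
    acks = 0       # ACK transmissions so far

    for msg in messages:
        while True:
            tx_index = sent
            sent += 1
            if tx_index in lose_data:
                continue              # packet lost: sender times out, resends

            # Packet reached the receiver.
            if seq == expected:
                delivered.append(msg)  # new data
                expected ^= 1
            # else: duplicate, discarded, but still ACKed below.

            ack_index = acks
            acks += 1
            if ack_index in lose_ack:
                continue              # ACK lost: sender times out, resends
            break                     # ACK received: stop the timer

        seq ^= 1                      # next packet uses the other bit

    return delivered, sent


delivered, transmissions = stop_and_wait(
    ["a", "b", "c"], lose_data={1}, lose_ack={0}
)
```

Here the first ACK and the first retransmission are both lost, so packet "a" is sent three times but delivered once; the receiver recognises the final copy as a duplicate from its sequence bit.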


                        rdt3.0 sender

                        State "Wait for call 0 from above":
                          rdt_send(data)
                            -> sndpkt = make_pkt(0, data, checksum)
                               udt_send(sndpkt)
                               start_timer
                               go to "Wait for ACK0"
                          rdt_rcv(rcvpkt)
                            -> Λ (no action)

                        State "Wait for ACK0":
                          rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
                            -> Λ (no action)
                          timeout
                            -> udt_send(sndpkt)
                               start_timer
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
                            -> stop_timer
                               go to "Wait for call 1 from above"

                        State "Wait for call 1 from above":
                          rdt_send(data)
                            -> sndpkt = make_pkt(1, data, checksum)
                               udt_send(sndpkt)
                               start_timer
                               go to "Wait for ACK1"
                          rdt_rcv(rcvpkt)
                            -> Λ (no action)

                        State "Wait for ACK1":
                          rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
                            -> Λ (no action)
                          timeout
                            -> udt_send(sndpkt)
                               start_timer
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
                            -> stop_timer
                               go to "Wait for call 0 from above"

                        rdt3.0 in action

                        [Figures: sender/receiver timelines on two slides; the diagrams did not survive extraction.]

                        Performance of rdt3.0

                        rdt3.0 works, but performance stinks
                        example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

                        $$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

                        $$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \quad \text{(times in msec)}$$

                          U_sender: utilization, the fraction of time the sender is busy sending
                          1KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
                          network protocol limits use of physical resources!

                        rdt3.0: stop-and-wait operation

                        [Figure: sender/receiver timeline]
                          first packet bit transmitted, t = 0
                          last packet bit transmitted, t = L/R
                          first packet bit arrives
                          last packet bit arrives, send ACK
                          ACK arrives, send next packet, t = RTT + L/R


                        Pipelined protocols

                        Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
                          range of sequence numbers must be increased
                          buffering at sender and/or receiver


                        Pipelined protocols

                        Advantage: much better bandwidth utilization than stop-and-wait

                        Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

                        Two generic approaches to solving this:
                          • Go-Back-N protocols
                          • selective repeat protocols

                        Note: TCP is not exactly either

                        Pipelining: increased utilization

                        [Figure: sender/receiver timeline with three packets in flight]
                          first packet bit transmitted, t = 0
                          last bit transmitted, t = L/R
                          first packet bit arrives
                          last packet bit arrives, send ACK
                          last bit of 2nd packet arrives, send ACK
                          last bit of 3rd packet arrives, send ACK
                          ACK arrives, send next packet, t = RTT + L/R

                        $$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008 \quad \text{(times in msec)}$$

                        Increase utilization by a factor of 3!
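The utilization figures for stop-and-wait and for three packets in flight can be checked numerically. The following is a minimal sketch; the function name `sender_utilization` and its parameters are made up for the example, and RTT is taken as 30 ms (twice the 15 ms one-way propagation delay), as in the slides.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, in_flight=1):
    """Fraction of time the sender is busy transmitting.

    in_flight is the number of packets sent back-to-back per RTT
    (1 for stop-and-wait). The result is capped at 1.0, since a link
    cannot be more than fully used.
    """
    t_transmit = packet_bits / rate_bps            # L / R, in seconds
    return min(1.0, in_flight * t_transmit / (rtt_s + t_transmit))


L_BITS = 8_000      # 1 KB packet = 8000 bits
R_BPS = 1e9         # 1 Gbps link
RTT_S = 0.030       # 2 x 15 ms propagation delay

u_stop_and_wait = sender_utilization(L_BITS, R_BPS, RTT_S)
u_pipelined_3 = sender_utilization(L_BITS, R_BPS, RTT_S, in_flight=3)

# Effective stop-and-wait throughput in bytes per second (about 33 kB/sec).
throughput_Bps = (L_BITS / 8) / (RTT_S + L_BITS / R_BPS)
```

Because the transmission time (8 microseconds) is tiny compared with the RTT, utilization grows almost linearly with the number of packets in flight until the pipe is full.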


                        Go-Back-N

                        Sender:
                          k-bit seq # in pkt header
                          "window" of up to N consecutive unack'ed pkts allowed
                          ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
                            may receive duplicate ACKs (see receiver)
                          only one timer, for the oldest unacknowledged pkt
                          timeout(n): retransmit pkt n and all higher seq # pkts in window
                          called a sliding-window protocol


                        GBN: Sender

                        rdt_send() called: checks to see if window is full
                          No: send out packet
                          Yes: return data to application level

                        Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer (or stops it if no packets remain unacknowledged)

                        Timeout: resends ALL packets that have been sent but not yet acknowledged
                          this is the only event that triggers a resend


                        GBN: sender extended FSM

                        Initially: base = 1; nextseqnum = 1

                        State "Wait":
                          rdt_send(data)
                            -> if (nextseqnum < base+N) {
                                 sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
                                 udt_send(sndpkt[nextseqnum])
                                 if (base == nextseqnum)
                                   start_timer
                                 nextseqnum++
                               }
                               else
                                 refuse_data(data)
                          timeout
                            -> start_timer
                               udt_send(sndpkt[base])
                               udt_send(sndpkt[base+1])
                               ...
                               udt_send(sndpkt[nextseqnum-1])
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
                            -> base = getacknum(rcvpkt)+1
                               if (base == nextseqnum)
                                 stop_timer
                               else
                                 start_timer
                          rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                            -> Λ (no action)
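The three sender events of the extended FSM map naturally onto three methods. The class below is a sketch under stated assumptions: `GbnSender`, the `send` callback (standing in for `udt_send`), and the boolean `timer_running` (standing in for a real countdown timer) are names invented here, and sequence numbers are unbounded integers rather than k-bit values.

```python
class GbnSender:
    """Go-Back-N sender with a window of N packets and a single timer."""

    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send            # hypothetical stand-in for udt_send
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> packet, kept until ACKed
        self.timer_running = False  # stand-in for start_timer / stop_timer

    def rdt_send(self, data):
        """Return True if the data was sent, False if it was refused."""
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum arrived."""
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        # max() keeps an old duplicate ACK from moving the window backwards.
        self.base = max(self.base, acknum + 1)
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(self.sndpkt[seq])


wire = []
sender = GbnSender(window_size=3, send=wire.append)
accepted = [sender.rdt_send(ch) for ch in "abcd"]  # 4th call hits a full window
sender.on_ack(1)        # packet 1 acknowledged, window slides to base = 2
sender.on_timeout()     # packets 2 and 3 are both resent
```

Note how a single timeout resends the whole outstanding window, which is the behaviour that gives the protocol its name.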


                        GBN: receiver extended FSM

                        Initially: expectedseqnum = 1; sndpkt = make_pkt(0, ACK, chksum)

                        State "Wait":
                          rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
                            -> extract(rcvpkt, data)
                               deliver_data(data)
                               sndpkt = make_pkt(expectedseqnum, ACK, chksum)
                               udt_send(sndpkt)
                               expectedseqnum++
                          default
                            -> udt_send(sndpkt)

                        If expected packet received:
                          send ACK and deliver packet upstairs

                        If out-of-order packet received:
                          discard (don't buffer) -> no receiver buffering!
                          re-ACK pkt with highest in-order seq #; may generate duplicate ACKs


                        More on receiver

                        The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
                        Receiver only sends ACKs (no NAKs)
                        Can generate duplicate ACKs
                        Need only remember expectedseqnum
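Since the receiver only has to remember `expectedseqnum`, it fits in a few lines. This is an illustrative sketch: the class name `GbnReceiver`, the `corrupt` flag, and returning the ACK number from `rdt_rcv` (rather than sending a packet) are assumptions of the example.

```python
class GbnReceiver:
    """Go-Back-N receiver: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number sent back."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)     # in order: deliver upstairs
            self.expectedseqnum += 1
        # Otherwise the packet is discarded. Either way, ACK the highest
        # in-order sequence number received so far (0 if none yet).
        return self.expectedseqnum - 1


rcv = GbnReceiver()
acks = [
    rcv.rdt_rcv(1, "a"),   # in order
    rcv.rdt_rcv(3, "c"),   # packet 2 was lost: discard, re-ACK 1
    rcv.rdt_rcv(2, "b"),   # retransmission arrives
    rcv.rdt_rcv(3, "c"),   # retransmission arrives
]
```

The second ACK in this run is a duplicate ACK for packet 1, which is how the sender can tell that something after packet 1 went missing.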


                        GBN in action

                        GBN is easy to code but might have performance problems.

                        In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

                        The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.


                        Selective Repeat

                        receiver individually acknowledges all correctly received pkts
                          buffers pkts, as needed, for eventual in-order delivery to upper layer
                        sender only resends pkts for which ACK not received
                          sender timer for each unACKed pkt
                          compare to GBN, which only had a timer for the base packet
                        sender window
                          N consecutive seq #'s
                          again limits seq #s of sent, unACKed pkts
                          important: window size < seq # range


                        Selective repeat: sender, receiver windows

                        [Figure: sender and receiver views of the sequence-number space]

                        Selective repeat

                        sender
                          data from above:
                            if next available seq # in window, send pkt
                          timeout(n):
                            resend pkt n, restart timer
                          ACK(n) in [sendbase, sendbase+N]:
                            mark pkt n as received
                            if n is smallest unACKed pkt, advance window base to next unACKed seq #

                        receiver
                          pkt n in [rcvbase, rcvbase+N-1]:
                            send ACK(n)
                            out-of-order: buffer
                            in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
                          pkt n in [rcvbase-N, rcvbase-1]:
                            ACK(n) (note: this is a re-ACK)
                          otherwise:
                            ignore
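The receiver rules above can be sketched as a small class. This is an illustration only: `SrReceiver` and `on_packet` are invented names, sequence numbers are unbounded integers (no wrap-around), and the method returns the ACK number it would send, or `None` when the packet is ignored.

```python
class SrReceiver:
    """Selective-repeat receiver: buffers out-of-order packets."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}       # seq # -> data, received but not yet delivered
        self.delivered = []

    def on_packet(self, n, data):
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            # Inside the window: buffer it, then deliver any in-order run.
            self.buffer[n] = data
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n           # individual ACK(n)
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n           # already delivered: re-ACK so the sender advances
        return None            # otherwise: ignore


rcv = SrReceiver(window_size=4)
acks = [
    rcv.on_packet(0, "d0"),
    rcv.on_packet(2, "d2"),   # out of order: buffered
    rcv.on_packet(3, "d3"),   # out of order: buffered
    rcv.on_packet(1, "d1"),   # fills the gap: d1, d2, d3 delivered together
    rcv.on_packet(0, "d0"),   # old duplicate: re-ACKed, not delivered again
    rcv.on_packet(9, "d9"),   # outside both ranges: ignored
]
```

The re-ACK branch matters: if the original ACK for an old packet was lost, the sender's window cannot move until it hears that ACK again.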


                        Selective repeat in action

                        [Figure: sender/receiver timeline]

                        Selective repeat: dilemma

                        Example:
                          seq #'s: 0, 1, 2, 3
                          window size = 3

                          receiver sees no difference in the two scenarios!
                          incorrectly passes duplicate data as new in (a)

                        Q: what is the relationship between seq # size and window size?
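One way to explore the question is to check when the receiver's new window can contain sequence numbers that the sender may still retransmit. The sketch below assumes the worst case from the example (the whole first window was received, but every ACK was lost); the function name `receiver_can_confuse` is made up for the illustration. The standard answer, which the check reproduces, is that the window size must be at most half the sequence-number space.

```python
def receiver_can_confuse(seq_space, window):
    """True if a retransmitted old packet could be mistaken for a new one.

    Worst case: the sender's window [0, window) was fully received, so the
    receiver's window advanced to [window, 2*window), while all ACKs were
    lost and the sender retransmits the old window. Numbers wrap modulo
    seq_space.
    """
    old_window = {n % seq_space for n in range(window)}
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


slide_example = receiver_can_confuse(seq_space=4, window=3)  # seq #'s 0..3, N=3
safe_example = receiver_can_confuse(seq_space=4, window=2)   # N = half the space
```

With four sequence numbers and a window of 3, the receiver's new window is {3, 0, 1}, which overlaps the retransmitted {0, 1, 2}; with a window of 2 the two sets are disjoint.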


                        Chapter 3 outline

                        3.1 Transport-layer services
                        3.2 Multiplexing and demultiplexing
                        3.3 Connectionless transport: UDP
                        3.4 Principles of reliable data transfer
                        3.5 Connection-oriented transport: TCP
                          segment structure
                          reliable data transfer
                          flow control
                          connection management
                        3.6 Principles of congestion control
                        3.7 TCP congestion control


                        TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

                        point-to-point:
                          one sender, one receiver
                        reliable, in-order byte stream:
                          no "message boundaries"
                        pipelined:
                          TCP congestion and flow control set window size
                        send & receive buffers
                        full duplex data:
                          bi-directional data flow in same connection
                          MSS: maximum segment size
                        connection-oriented:
                          handshaking (exchange of control msgs) init's sender, receiver state before data exchange
                        flow controlled:
                          sender will not overwhelm receiver

                        [Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]


                        More TCP Details

                        Maximum Segment Size (MSS)
                          depends upon implementation (can often be set)
                          the max. amount of application-layer data in a segment
                            Application Data + TCP Header = TCP Segment

                        Three-way handshake
                          client sends special TCP segment to server requesting connection; no payload (application data) in this segment
                          server responds with second special TCP segment (again, no payload)
                          client responds with third special segment
                            this one can contain payload


                        Even More TCP Details

                        A TCP connection between client and server creates, in both client and server:
                          (i) buffers,
                          (ii) variables, and
                          (iii) a socket connection to the process

                        TCP exists only in the two end machines
                          no buffers or variables are allocated to the connection in any of the network elements between the two hosts


                        TCP segment structure

                        [Figure: TCP segment layout, 32 bits wide, one row per line]
                          source port # | dest port #
                          sequence number
                          acknowledgement number
                          head len | not used | U A P R S F | Receive window
                          checksum | Urg data pnter
                          Options (variable length)
                          application data (variable length)

                        Field notes:
                          sequence number, acknowledgement number: counting by bytes of data (not segments!)
                          URG: urgent data (generally not used)
                          ACK: ACK # valid
                          PSH: push data now (generally not used)
                          RST, SYN, FIN: connection estab (setup, teardown commands)
                          Receive window: # bytes rcvr willing to accept
                          checksum: Internet checksum (as in UDP)
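To make the layout concrete, the fixed 20-byte part of the header (no options) can be packed and unpacked with Python's standard `struct` module. This is a sketch for illustration: the helper names `pack_header` and `unpack_header` are invented here, the checksum is left at zero rather than computed, and the port and window values in the usage lines are arbitrary.

```python
import struct

# Fixed TCP header in network byte order:
# ports (2 x 16 bits), seq and ack (2 x 32 bits), header-length byte,
# flags byte, receive window, checksum, urgent pointer (3 x 16 bits).
TCP_HEADER = struct.Struct("!HHIIBBHHH")

# Flag bits, lowest bit first: F S R P A U.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def pack_header(src_port, dst_port, seq, ack, flags, window,
                checksum=0, urg_ptr=0):
    head_len_words = 5   # 5 x 32-bit words = 20 bytes, i.e. no options
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           head_len_words << 4, flags, window,
                           checksum, urg_ptr)


def unpack_header(raw):
    (src_port, dst_port, seq, ack, offset_byte, flags,
     window, checksum, urg_ptr) = TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "head_len_bytes": (offset_byte >> 4) * 4,
        "flags": flags & 0x3F,
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


raw = pack_header(12345, 23, seq=42, ack=79, flags=ACK | PSH, window=4096)
fields = unpack_header(raw)
```

The header-length field counts 32-bit words, which is why the sketch multiplies by 4 to recover the length in bytes.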


                        TCP seq. #'s and ACKs

                        Seq. #'s:
                          byte stream "number" of first byte in segment's data
                        ACKs:
                          seq # of next byte expected from other side
                          cumulative ACK
                        Q: how does the receiver handle out-of-order segments?
                          A: TCP spec doesn't say - up to implementer

                        Simple telnet scenario (time runs downward):
                          Host A -> Host B: Seq=42, ACK=79, data = 'C'    (user types 'C')
                          Host B -> Host A: Seq=79, ACK=43, data = 'C'    (host ACKs receipt of 'C', echoes back 'C')
                          Host A -> Host B: Seq=43, ACK=80                (host ACKs receipt of echoed 'C')
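The numbers in the telnet scenario follow from two rules: a segment's seq # is the byte-stream number of its first data byte, and its ACK # is the next byte expected from the other side. A minimal sketch, with a hypothetical helper `echo_exchange` and plain dictionaries standing in for segments:

```python
def echo_exchange(a_seq, b_seq, typed, echoed):
    """Return the three segments of a type-and-echo exchange.

    a_seq / b_seq are the next byte numbers each host will send.
    """
    # A -> B: carries the typed data; ACKs the next byte expected from B.
    seg1 = {"seq": a_seq, "ack": b_seq, "data": typed}
    # B -> A: echoes the data and ACKs A's bytes cumulatively.
    seg2 = {"seq": b_seq, "ack": a_seq + len(typed), "data": echoed}
    # A -> B: a pure ACK of the echo, carrying no data.
    seg3 = {"seq": seg2["ack"], "ack": b_seq + len(echoed), "data": b""}
    return [seg1, seg2, seg3]


segments = echo_exchange(a_seq=42, b_seq=79, typed=b"C", echoed=b"C")
```

Each one-byte payload advances the corresponding ACK # by exactly one, which gives the 42/79, 79/43, 43/80 sequence shown in the scenario.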


                        TCP Round Trip Time and Timeout

                        Q: how to set TCP timeout value?
                          longer than RTT
                            but RTT varies
                          too short: premature timeout
                            unnecessary retransmissions
                          too long: slow reaction to segment loss

                        Q: how to estimate RTT?
                          SampleRTT: measured time from segment transmission until ACK receipt
                            ignore retransmissions
                          SampleRTT will vary, want estimated RTT "smoother"
                            average several recent measurements, not just current SampleRTT


                        TCP Round Trip Time and Timeout

                        EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

                          exponential weighted moving average
                          influence of past sample decreases exponentially fast
                          typical value: α = 0.125


                        Example RTT estimation

                        [Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT, Estimated RTT]


                        TCP Round Trip Time and Timeout

                        Setting the timeout
                          EstimatedRTT plus "safety margin"
                            large variation in EstimatedRTT -> larger safety margin
                          first estimate how much SampleRTT deviates from EstimatedRTT:

                            DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

                            (typically, β = 0.25)

                          then set timeout interval:

                            TimeoutInterval = EstimatedRTT + 4 * DevRTT
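The two moving averages and the timeout formula can be combined into a small estimator. The sketch below makes several assumptions that the slides leave open: the class name `RttEstimator`, the initial values passed to the constructor, and the update order (DevRTT is updated first, using the previous EstimatedRTT, which is the order RFC 6298 uses).

```python
class RttEstimator:
    """EWMA-based RTT estimate and retransmission timeout."""

    def __init__(self, initial_rtt, initial_dev=0.0, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = initial_rtt   # assumed starting point
        self.dev_rtt = initial_dev         # assumed starting point

    def update(self, sample_rtt):
        """Fold in one SampleRTT (never one measured on a retransmission)."""
        # Deviation first, against the estimate before it is updated.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(initial_rtt=100.0)   # milliseconds
est.update(120.0)
```

After one 120 ms sample the estimate moves only an eighth of the way toward it, while the safety margin grows with the observed deviation.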




                        TCP reliable data transfer

                        TCP creates rdt service on top of IP's unreliable service
                          pipelined segments
                          cumulative acks
                          TCP uses single retransmission timer
                        Retransmissions are triggered by:
                          timeout events
                          duplicate acks
                        Initially consider simplified TCP sender:
                          ignore duplicate acks
                          ignore flow control, congestion control


                        TCP sender events:

                        data rcvd from app:
                          create segment with seq #
                          seq # is byte-stream number of first data byte in segment
                          start timer if not already running (think of timer as for oldest unacked segment)
                          expiration interval: TimeOutInterval
                        timeout:
                          retransmit segment that caused timeout
                          restart timer
                        ACK rcvd:
                          if it acknowledges previously unacked segments
                            update what is known to be acked
                            start timer if there are outstanding segments

                        TCP sender (simplified)

                          NextSeqNum = InitialSeqNum
                          SendBase = InitialSeqNum

                          loop (forever) {
                            switch(event)

                            event: data received from application above
                              create TCP segment with sequence number NextSeqNum
                              if (timer currently not running)
                                start timer
                              pass segment to IP
                              NextSeqNum = NextSeqNum + length(data)

                            event: timer timeout
                              retransmit not-yet-acknowledged segment with smallest sequence number
                              start timer

                            event: ACK received, with ACK field value of y
                              if (y > SendBase) {
                                SendBase = y
                                if (there are currently not-yet-acknowledged segments)
                                  start timer
                              }
                          } /* end of loop forever */

                        Comment:
                          • SendBase-1: last cumulatively ack'ed byte
                        Example:
                          • SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
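The simplified sender can be written as an event-driven class with one method per event. This is a sketch, not a TCP implementation: `SimplifiedTcpSender`, the `send` callback (standing in for "pass segment to IP"), and the boolean `timer_running` are assumptions of the example, and flow control, congestion control, and duplicate ACKs are ignored, as in the pseudocode.

```python
class SimplifiedTcpSender:
    """Event-driven sketch of the simplified TCP sender."""

    def __init__(self, initial_seq, send):
        self.next_seq_num = initial_seq
        self.send_base = initial_seq
        self.send = send            # hypothetical "pass segment to IP"
        self.unacked = {}           # seq # -> data not yet acknowledged
        self.timer_running = False  # stand-in for the single timer

    def on_app_data(self, data):
        self.unacked[self.next_seq_num] = data
        if not self.timer_running:
            self.timer_running = True
        self.send((self.next_seq_num, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            # Only the oldest unacknowledged segment is retransmitted.
            seq = min(self.unacked)
            self.send((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment that ends at or before y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
        # y <= SendBase: duplicate ACK, ignored in this simplified sender.


wire = []
tcp = SimplifiedTcpSender(initial_seq=92, send=wire.append)
tcp.on_app_data(b"x" * 8)     # Seq=92, 8 bytes
tcp.on_app_data(b"y" * 20)    # Seq=100, 20 bytes
tcp.on_timeout()              # premature timeout: only Seq=92 is resent
tcp.on_ack(100)
tcp.on_ack(120)
tcp.on_ack(120)               # duplicate, ignored
```

Unlike Go-Back-N, a timeout here resends only the segment with the smallest sequence number, and the cumulative ACK of 120 clears both segments at once.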


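                        TCP: retransmission scenarios

                        Lost ACK scenario (time runs downward):
                          Host A -> Host B: Seq=92, 8 bytes data
                          Host B -> Host A: ACK=100                 (lost: X)
                          timeout for Seq=92
                          Host A -> Host B: Seq=92, 8 bytes data    (retransmission)
                          Host B -> Host A: ACK=100
                          SendBase = 100

                        Premature timeout scenario (time runs downward):
                          Host A -> Host B: Seq=92, 8 bytes data
                          Host A -> Host B: Seq=100, 20 bytes data
                          Host B -> Host A: ACK=100
                          Host B -> Host A: ACK=120
                          timeout for Seq=92 (fires before ACK=100 arrives)
                          Host A -> Host B: Seq=92, 8 bytes data    (retransmission)
                          SendBase = 100, then SendBase = 120
                          Host B -> Host A: ACK=120
                          SendBase = 120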


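                        TCP retransmission scenarios (more)

                        Cumulative ACK scenario (time runs downward):
                          Host A -> Host B: Seq=92, 8 bytes data
                          Host A -> Host B: Seq=100, 20 bytes data
                          Host B -> Host A: ACK=100                 (lost: X)
                          Host B -> Host A: ACK=120                 (arrives before the timeout)
                          SendBase = 120; no retransmission needed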


                        TCP ACK generation [RFC 1122, RFC 2581]

                        1. Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
                           TCP receiver action: delayed ACK. Wait up to 500 ms for next segment; if no next segment, send ACK

                        2. Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
                           TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

                        3. Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
                           TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

                        4. Event at receiver: arrival of segment that partially or completely fills gap
                           TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap


                        More on Sender Policies

                        Doubling the Timeout Interval
                          used by most TCP implementations
                          if a timeout occurs, then after retransmission the Timeout Interval is doubled
                          intervals grow exponentially with each consecutive timeout
                          when the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
                          limited form of congestion control
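The doubling rule produces a simple geometric sequence of timer values. A short sketch, with a hypothetical helper `backoff_intervals` and an arbitrary starting interval of one second:

```python
def backoff_intervals(timeout_interval, consecutive_timeouts):
    """Timer values used for the first transmission and each retransmission.

    timeout_interval is the value computed from EstimatedRTT and DevRTT;
    it doubles after every consecutive timeout.
    """
    return [timeout_interval * 2 ** k for k in range(consecutive_timeouts + 1)]


intervals = backoff_intervals(1.0, consecutive_timeouts=3)
```

Once new data is sent or an ACK arrives, the sender would go back to the interval computed from EstimatedRTT and DevRTT instead of continuing this sequence.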


                        Fast Retransmit

                        Time-out period often relatively long:
                          long delay before resending lost packet
                        Detect lost segments via duplicate ACKs
                          sender often sends many segments back-to-back
                          if a segment is lost, there will likely be many duplicate ACKs
                        If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
                          fast retransmit: resend segment before timer expires


                        Fast retransmit algorithm

                        event ACK received with ACK field value of y if (y gt SendBase)

                        SendBase = yif (there are currently not-yet-acknowledged segments)

                        start timer

                        else increment count of dup ACKs received for yif (count of dup ACKs received for y = 3)

                        resend segment with sequence number y

                        a duplicate ACK for already ACKed segment

                        fast retransmit
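A minimal Python sketch of the ACK handler above, for illustration only. The class name `FastRetransmitSender` and the fields `next_seq`, `timer_restarts` and `resent` are hypothetical bookkeeping, not part of the slides; a real sender would also manage the retransmission timer itself.

```python
class FastRetransmitSender:
    def __init__(self, send_base, next_seq):
        self.send_base = send_base  # oldest unacknowledged byte
        self.next_seq = next_seq    # next byte to be sent (hypothetical field)
        self.dup_count = 0          # duplicate ACKs seen for send_base
        self.timer_restarts = 0     # how often "start timer" was executed
        self.resent = []            # sequence numbers that were fast-retransmitted

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance the window.
            self.send_base = y
            self.dup_count = 0
            if self.send_base < self.next_seq:
                self.timer_restarts += 1  # segments still in flight: start timer
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_count += 1
            if self.dup_count == 3:
                self.resent.append(y)     # fast retransmit of the segment starting at y


# Bytes 100..499 are in flight; the segment starting at byte 200 is lost.
sender = FastRetransmitSender(send_base=100, next_seq=500)
for y in [200, 200, 200, 200]:  # one new ACK, then three duplicates
    sender.on_ack(y)
print(sender.resent)
```

The first ACK for 200 is new data; the next three are duplicates, and the third duplicate triggers the resend before any timer expires.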

TCP: GBN or Selective Repeat?

• Basic TCP looks a lot like GBN
• Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  • This looks a lot like Selective Repeat
• TCP is a hybrid


TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down transmission rate to accommodate receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network
• (But both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

Header rows, each 32 bits wide:
• source port #, dest port #
• sequence number
• acknowledgement number
• head len, not used, flag bits U A P R S F, Receive window
• checksum, Urg data pointer
• Options (variable length)
• application data (variable length)

Field notes:
• sequence and acknowledgement numbers: counting by bytes of data (not segments)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
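To make the field layout concrete, here is a small Python sketch that unpacks the fixed 20-byte part of a TCP header with the standard `struct` module. The function name `parse_tcp_header` and the dictionary keys are illustrative choices; options are skipped rather than decoded.

```python
import struct

# Flag bit positions in the low 6 bits of the 16-bit offset/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    # Network byte order: two 16-bit ports, two 32-bit numbers, four 16-bit words.
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4  # head len is counted in 32-bit words
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": {name for name, bit in FLAG_BITS.items() if off_flags & bit},
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "payload": segment[header_len:],  # application data after any options
    }


# Build a SYN segment by hand (no options, no data) and parse it back.
syn = struct.pack("!HHIIHHHH", 1234, 80, 1000, 0,
                  (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
fields = parse_tcp_header(syn)
print(fields["flags"], fields["header_len"])
```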

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn't overflow
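The window arithmetic can be written out directly. The sketch below is illustrative; the function names are hypothetical, and the byte counts in the example are made up.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= advertised_window


# 4096-byte buffer, 3000 bytes received so far, application has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)  # spare room advertised in the next segment
print(sender_may_send(5000, 4000, window, 1000))
print(sender_may_send(5000, 4000, window, 1200))
```

With 2000 bytes still unread, 2096 bytes of room remain; the sender already has 1000 unACKed bytes, so it may send 1000 more but not 1200.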

Technical Issue

• Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
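A toy Python simulation of the deadlock and its fix. It is deliberately simplified: time advances in abstract ticks, the receiver's buffer is reduced to a single `spare_room` number, and the function name and parameters are hypothetical.

```python
def sender_learns_window_reopened(probe_enabled, app_reads_at, ticks=10):
    """Return True if the sender ever hears RcvWindow > 0."""
    advertised = 0  # last RcvWindow value the sender has heard
    spare_room = 0  # receiver's actual spare buffer space
    for tick in range(ticks):
        if tick == app_reads_at:
            spare_room = 1000  # application drains the buffer
        if advertised == 0 and probe_enabled:
            # A one-byte probe forces an ACK that carries the current RcvWindow.
            advertised = spare_room
        # Without probes the receiver has nothing to ACK and stays silent.
    return advertised > 0


print(sender_learns_window_reopened(probe_enabled=False, app_reads_at=3))
print(sender_learns_window_reopened(probe_enabled=True, app_reads_at=3))
```

Without probing, the sender never hears that room has opened up; with one-byte probes, the first probe after the application reads returns a nonzero window.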

Note on UDP

• UDP has no flow control
• UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
• initialize TCP variables:
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals synchronization

Message sequence:
• client → server: connection request (SYN=1, seq=client_isn)
• server → client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
• client → server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
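The sequence and acknowledgement numbers exchanged in the three steps can be traced with a short Python sketch. The `Segment` dataclass and the function `three_way_handshake` are hypothetical names used only for this illustration; no real sockets are involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None  # None when the ACK field is not used


def three_way_handshake(client_isn, server_isn):
    """Return the three control segments in the order they are sent."""
    syn = Segment(syn=1, seq=client_isn)                          # step 1: client
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)      # step 2: server
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)  # step 3: client
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm they agree on where the byte stream starts.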

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. The client sends FIN; the server replies with ACK and then sends its own FIN; the client replies with ACK and enters timed wait before the connection is closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
• enters "timed wait", during which it will respond with an ACK to any received FINs (which might arrive if the ACK gets lost)
• closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

[Figure: client/server timeline of the FIN, ACK, FIN, ACK exchange. Both sides are marked "closing" and then "closed"; the client passes through timed wait first.]

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

A few special cases

• We have not discussed what happens if both client and server decide to close down the connection at the same time
• It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• manifestations:
  • lost packets (buffer overflow at routers)
  • long delays (queuing in router buffers)
• a top-10 problem!

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

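Notation: $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ is the offered load (original data plus retransmissions), and $\lambda_{out}$ is the delivered rate.

• (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
• (a) "magic" transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
• (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"Costs" of congestion:
• (b) and (c): more work (retrans) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three throughput plots, panels (a), (b) and (c)]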

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any upstream transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path is "underloaded": sender should use available bandwidth
• if sender's path is congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next page

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion
• w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT bytes/sec

• To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow
• Tools are "similar" to flow control: sender limits transmission using
    LastByteSent - LastByteAcked ≤ CongWin
• How does sender perceive congestion?
  • loss event = timeout or 3 duplicate ACKs
  • TCP sender reduces rate (CongWin) after loss event
• three mechanisms:
  • AIMD = Additive Increase, Multiplicative Decrease
  • slow start = CongWin set to 1 and then grows exponentially
  • conservative after timeout events

TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: sawtooth plot of congestion window (axis marks at 8, 16 and 24 Kbytes) against time for a long-lived TCP connection]

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. A sends one segment in the first RTT, two segments in the second and four segments in the third.]
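A short Python sketch of why "one MSS per ACK" amounts to doubling per RTT, together with the initial-rate arithmetic from the slow-start example (MSS = 500 bytes, RTT = 200 msec). The function names are hypothetical, and the model assumes every segment sent in a round is ACKed within that round and that no loss occurs.

```python
def slow_start_rounds(mss, rounds):
    """CongWin (bytes) at the start of each RTT during loss-free slow start."""
    cong_win = mss
    history = [cong_win]
    for _ in range(rounds):
        acks = cong_win // mss  # one ACK per segment sent in this RTT
        for _ in range(acks):
            cong_win += mss     # increment CongWin by 1 MSS for every ACK
        history.append(cong_win)
    return history


def rate_bps(cong_win_bytes, rtt_seconds):
    """Sending rate when one window of cong_win_bytes is sent per RTT."""
    return cong_win_bytes * 8 / rtt_seconds


print(slow_start_rounds(mss=500, rounds=3))
print(rate_bps(500, 0.2))  # initial rate with CongWin = 1 MSS
```

The window goes 500, 1000, 2000, 4000 bytes (one, two, four, eight segments), and the initial rate works out to 4000 bits per 0.2 s, i.e. 20 kbps.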

So far:
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

• Reason for treating 3 dup ACKs differently from a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"
• Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start, with CongWin = 1, after a loss event
• TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                        The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
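The event/state/action table can be read as a small state machine. The Python sketch below follows the table row by row; it is an illustration under simplifying assumptions (the class name `RenoSender` and its method names are hypothetical, window sizes are in bytes, and actual retransmission is left out).

```python
class RenoSender:
    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss         # start with CongWin = 1 MSS
        self.threshold = threshold
        self.state = "SS"           # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            # CongWin = Threshold, but never below 1 MSS.
            self.cong_win = max(self.threshold, self.mss)
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()   # slow start: 2000, 3000, 4000, 5000 bytes
print(sender.state, sender.cong_win)
```

After the fourth ACK the window exceeds the threshold and the sender moves to congestion avoidance, where each further ACK adds only MSS * (MSS/CongWin) bytes.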

TCP throughput

• What's the average throughput of TCP as a function of window size and RTT?
  • Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT (the window climbs linearly from W/2 back to W, so its average is 0.75 W)

TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This gives $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed!
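The two numbers in this example can be checked with a few lines of Python. The sketch inverts the loss-rate relation throughput = 1.22 * MSS / (RTT * sqrt(L)); the function names are hypothetical, and MSS is converted from bytes to bits so that it matches a throughput given in bits per second.

```python
def window_for_throughput(throughput_bps, rtt_s, mss_bytes):
    """Number of in-flight segments needed to sustain throughput_bps."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def loss_rate_for_throughput(throughput_bps, rtt_s, mss_bytes):
    """Loss rate L solving throughput = 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


W = window_for_throughput(10e9, 0.1, 1500)
L = loss_rate_for_throughput(10e9, 0.1, 1500)
print(round(W), L)
```

The window comes out at roughly 83,333 segments and the tolerable loss rate at roughly 2 * 10^-10, matching the figures quoted in the example.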

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
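The convergence argument can be checked numerically. The Python sketch below is a toy model, not a TCP implementation: both flows are assumed to have the same RTT, to see every loss at the same time, and to react in lockstep; the function name `aimd_two_flows` and the starting rates are made up.

```python
def aimd_two_flows(x1, x2, capacity, rounds, step=1.0):
    """Evolve two AIMD flows sharing a link; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both add the same amount, so the gap is unchanged.
            x1, x2 = x1 + step, x2 + step
    return x1, x2


x1, x2 = aimd_two_flows(x1=10.0, x2=80.0, capacity=100.0, rounds=500)
print(x1, x2)
```

Additive increase moves the operating point parallel to the equal-share line, while each multiplicative decrease halves the distance to it, so the two rates end up nearly equal regardless of where they started.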

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server, of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

   where K is the number of windows that cover the object

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
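Both fixed-window cases fit in one small Python function. This is a sketch under an explicit assumption: K, the number of windows that cover the object, is taken to be ceil(O / (W * S)), which the slides use but do not spell out here. The function name and the numbers in the example are made up.

```python
import math


def fixed_window_latency(O, S, R, W, rtt):
    """Latency (seconds) to fetch an O-bit object with a fixed window of W segments.

    S is the segment size in bits and R the link rate in bits/sec.
    """
    base = 2 * rtt + O / R
    stall = S / R + rtt - W * S / R  # idle time after each window, if positive
    if stall <= 0:
        return base                  # case 1: ACKs return before the window is used up
    K = math.ceil(O / (W * S))       # assumed: number of windows covering the object
    return base + (K - 1) * stall    # case 2: server stalls after all but the last window


# O = 8 segments of 1000 bits over a 1000 bit/sec link, so S/R = 1 s and O/R = 8 s.
print(fixed_window_latency(O=8000, S=1000, R=1000, W=4, rtt=1))
print(fixed_window_latency(O=8000, S=1000, R=1000, W=2, rtt=3))
```

In the first call WS/R = 4 exceeds RTT + S/R = 2, so the latency is just 2RTT + O/R. In the second, WS/R = 2 is below RTT + S/R = 4, and the server idles for 2 seconds after each of the first three windows.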

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: client/server timeline. Initiate TCP connection, request object, then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server slow-start timeline with first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, through to object delivered.]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.
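The slow-start latency model can be evaluated with a short Python sketch. It computes K from the condition 2^k - 1 >= O/S by direct search rather than with a logarithm, and it obtains Q by counting the windows whose idle time S/R + RTT - 2^(k-1) S/R is positive; that way of computing Q is an assumption made for this illustration, since the slides only say its calculation "is similar". The function names and the link parameters are hypothetical.

```python
def windows_to_cover(O, S):
    """K: smallest k with (2^k - 1) * S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def slow_start_latency(O, S, R, rtt):
    """Return (K, Q, P, latency) for loss-free slow start with no threshold."""
    K = windows_to_cover(O, S)
    # Q: number of windows after which the server would idle for an infinite object.
    Q = 0
    while S / R + rtt - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# O/S = 15 segments; S/R = 1 s and RTT = 2 s are chosen so that Q = 2.
K, Q, P, latency = slow_start_latency(O=15000, S=1000, R=1000, rtt=2)
print(K, Q, P, latency)
```

With these numbers K = 4, Q = 2 and P = 2, as in the O/S = 15 example; the server idles for 2 seconds and then 1 second, which is the 3 seconds the closed form adds on top of 2RTT + O/R.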

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
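The three response-time formulas can be compared side by side in Python. This sketch drops the "sum of idle times" term, so it gives only the transmission and round-trip parts of each formula; the function name is hypothetical, and the example assumes 5 Kbytes = 40,000 bits and a 1 Mbps link.

```python
def http_response_times(O, R, rtt, M, X):
    """Response times (seconds) ignoring slow-start idle times.

    O: object size in bits, R: link rate in bits/sec,
    M: number of images, X: number of parallel connections.
    """
    transmit = (M + 1) * O / R
    return {
        "non_persistent": transmit + (M + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * rtt,
    }


times = http_response_times(O=40_000, R=1_000_000, rtt=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

All three variants share the same 0.44 s of transmission time; they differ only in how many round trips they pay for, which is why the gap between them widens as RTT grows.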

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.

Chapter 3: Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"


Connection-oriented demux (cont)
[Figure: clients at IP A and IP B exchanging segments with the server at IP C (processes labeled P1, P3, P4). Segments shown: SP 9157 / DP 80 with its reply SP 80 / DP 9157, and SP 5775 / DP 80 with its reply SP 80 / DP 5775.]

Connection-oriented demux: Threaded Web Server
[Figure: clients at IP A and IP B sending segments to the threaded web server at IP C (processes labeled P1 to P4). Segments shown:
  SP 9157, DP 80, S-IP A, D-IP C
  SP 9157, DP 80, S-IP B, D-IP C
  SP 5775, DP 80, S-IP B, D-IP C]

Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

UDP: User Datagram Protocol [RFC 768]
"no frills," "bare bones" Internet transport protocol
"best effort" service; UDP segments may be:
  lost
  delivered out of order to app
connectionless:
  no handshaking between UDP sender, receiver
  each UDP segment handled independently of others

Why is there a UDP?
  no connection establishment (which can add delay)
  simple: no connection state at sender, receiver
  small segment header (8 bytes)
  no congestion control: UDP can blast away as fast as desired

UDP: more
often used for streaming multimedia apps
  loss tolerant
  rate sensitive
other UDP uses (why?):
  DNS: small delay
  SNMP: stressful conditions
reliable transfer over UDP: add reliability at application layer
  application-specific error recovery

UDP segment format (each row is 32 bits wide):
  source port # | dest port #
  length        | checksum
  application data (message)
Length is the length, in bytes, of the UDP segment, including the header.
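The segment layout above can be made concrete with a short sketch. This is only an illustration under stated assumptions: the helper names `build_udp_segment` and `parse_udp_segment` are hypothetical, the checksum field is simply left at 0 here, and real UDP stacks build the header inside the operating system.

```python
import struct

# Four 16-bit fields in network byte order: source port, dest port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Prefix the payload with an 8-byte UDP-style header (hypothetical helper)."""
    length = UDP_HEADER.size + len(payload)  # length covers header plus data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment: bytes):
    """Split a segment back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    return src_port, dst_port, length, checksum, segment[UDP_HEADER.size:length]


segment = build_udp_segment(9157, 53, b"hello")
print(len(segment))                # 8 header bytes + 5 payload bytes
print(parse_udp_segment(segment))
```

The length field counts the header as well as the data, which is why a 5-byte message produces a 13-byte segment.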

UDP checksum
Goal: detect "errors" (e.g., flipped bits) in transmitted segment

Sender:
  treat segment contents as sequence of 16-bit integers
  checksum: addition (1's complement sum) of segment contents
  sender puts checksum value into UDP checksum field

Receiver:
  compute checksum of received segment
  check if computed checksum equals checksum field value:
    NO: error detected
    YES: no error detected. But maybe errors nonetheless? More later.
  Receiver may choose to discard the segment or send a warning to the app in case of error
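A minimal sketch of the checksum computation, assuming the input is just the raw bytes to be protected. Two details go beyond the slide: RFC 768 stores the one's complement of the 1's complement sum, and the real UDP checksum also covers a pseudo-header taken from the IP header, which this sketch leaves out. The function names are illustrative.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian words, wrapping any carry back in."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """Checksum field value: the one's complement of the 16-bit sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def looks_intact(data: bytes, checksum_field: int) -> bool:
    """Receiver-side check: recompute and compare with the received field."""
    return internet_checksum(data) == checksum_field


payload = bytes.fromhex("0001f203f4f5f6f7")
field = internet_checksum(payload)
print(hex(field))                                     # 0x220d
print(looks_intact(payload, field))                   # True
print(looks_intact(b"\x01" + payload[1:], field))     # False: one bit flipped
```

A matching checksum only means no error was detected. Two flips that cancel in the sum would pass unnoticed, which is the "maybe errors nonetheless" caveat on the slide.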

Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Principles of Reliable data transfer
important in app., transport, link layers
top-10 list of important networking topics!

characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started
send side:
  rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
  udt_send(): called by rdt to transfer packet over unreliable channel to receiver
receive side:
  rdt_rcv(): called when packet arrives on rcv-side of channel
  deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started
We'll:
  incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
  consider only unidirectional data transfer
    but control info will flow in both directions!
  use finite state machines (FSM) to specify sender, receiver

FSM notation: an arrow from state 1 to state 2 is labeled "event / actions"
  event: the event causing the state transition
  actions: the actions taken on the state transition
  state: when in this "state," the next state is uniquely determined by the next event

Incremental Improvements

rdt1.0: assumes every packet sent arrives, and no errors are introduced in transmission

rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces the concept of ACK and NAK

rdt2.1: deals with corrupted ACKs/NAKs

rdt2.2: like rdt2.1 but does not need NAKs

rdt3.0: allows packets to be lost

rdt1.0: reliable transfer over a reliable channel

underlying channel perfectly reliable
  no bit errors
  no loss of packets

separate FSMs for sender, receiver:
  sender sends data into underlying channel
  receiver reads data from underlying channel

sender FSM (single state: Wait for call from above)
  rdt_send(data)
    packet = make_pkt(data)
    udt_send(packet)

receiver FSM (single state: Wait for call from below)
  rdt_rcv(packet)
    extract(packet, data)
    deliver_data(data)

rdt2.0: channel with bit errors

underlying channel may flip bits in packet
  recall: UDP checksum to detect bit errors

the question: how to recover from errors?
  acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  sender retransmits pkt on receipt of NAK
  human scenarios using ACKs, NAKs?

new mechanisms in rdt2.0 (beyond rdt1.0):
  error detection
  receiver feedback: control msgs (ACK, NAK) rcvr -> sender

rdt2.0: FSM specification

sender
  state: Wait for call from above
    rdt_send(data)
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      (go to Wait for ACK or NAK)
  state: Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      Λ (no action; go to Wait for call from above)

receiver
  state: Wait for call from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      extract(rcvpkt, data)
      deliver_data(data)
      udt_send(ACK)

rdt2.0: operation with no errors
[Figure: the rdt2.0 sender and receiver FSMs, tracing the error-free path: rdt_send(data) sends sndpkt; the receiver sees rdt_rcv(rcvpkt) && notcorrupt(rcvpkt), delivers the data and sends ACK; the sender sees rdt_rcv(rcvpkt) && isACK(rcvpkt) and returns to Wait for call from above.]

rdt2.0: error scenario
[Figure: the rdt2.0 sender and receiver FSMs, tracing the error path: the receiver sees rdt_rcv(rcvpkt) && corrupt(rcvpkt) and sends NAK; the sender sees rdt_rcv(rcvpkt) && isNAK(rcvpkt) and retransmits with udt_send(sndpkt).]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
  sender doesn't know what happened at receiver!
  can't just retransmit: possible duplicate. But receiver waiting!

What to do?
  sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
  retransmit, but this might cause retransmission of correctly received pkt. Receiver won't know about duplication!

Handling duplicates:
  sender adds sequence number (0, 1) to each pkt
  sender retransmits current pkt if ACK/NAK garbled
  receiver discards (doesn't deliver up) duplicate pkt. A duplicate packet is one with the same sequence # as the previous packet

stop and wait: sender sends one packet, then waits for receiver response

Sender: whenever the sender receives a control message it sends a packet to the receiver
  A valid ACK: sends next packet (if one exists) with new sequence #
  A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
  If received packet is corrupt: send NAK
  If received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
  If received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #
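The receiver rules above can be sketched in a few lines. This is an illustrative model only: the class name `Rdt21Receiver` and the `corrupt` flag (standing in for a failed checksum) are assumptions, and the sketch tracks the expected sequence #, which is equivalent to comparing against the previous packet's sequence #.

```python
class Rdt21Receiver:
    """Alternating-bit receiver that replies with an unnumbered ACK or NAK."""

    def __init__(self) -> None:
        self.expected_seq = 0   # sequence # of the next new packet
        self.delivered = []     # data passed up to the application

    def on_packet(self, seq: int, data: str, corrupt: bool = False) -> str:
        if corrupt:
            return "NAK"                    # ask the sender to retransmit
        if seq == self.expected_seq:
            self.delivered.append(data)     # new data: deliver it upward
            self.expected_seq ^= 1          # flip between 0 and 1
        # A valid duplicate is ACKed again but not delivered a second time.
        return "ACK"


receiver = Rdt21Receiver()
print(receiver.on_packet(0, "a"))                 # ACK, "a" delivered
print(receiver.on_packet(0, "a"))                 # ACK, duplicate discarded
print(receiver.on_packet(1, "b", corrupt=True))   # NAK
print(receiver.on_packet(1, "b"))                 # ACK, "b" delivered
print(receiver.delivered)                         # ['a', 'b']
```

The second call models the case where the first ACK was garbled on its way back: the sender retransmits, and the sequence # lets the receiver recognize the copy.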

rdt2.1: sender, handles garbled ACK/NAKs

state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    (go to Wait for ACK or NAK 0)
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ (go to Wait for call 1 from above)
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    (go to Wait for ACK or NAK 1)
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ (go to Wait for call 0 from above)

rdt2.1: receiver, handles garbled ACK/NAKs

state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    (go to Wait for 1 from below)
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
state: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    (go to Wait for 0 from below)
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)

rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq. #'s (0, 1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)

duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment
  state: Wait for call 0 from above
    rdt_send(data)
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      (go to Wait for ACK 0)
  state: Wait for ACK 0
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
      Λ (leave Wait for ACK 0)

receiver FSM fragment
  state: Wait for 0 from below
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      udt_send(sndpkt)
  transition into Wait for 0 from below
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum)
      udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions are triggered only by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq. #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer
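A small simulation can illustrate how timeouts plus sequence numbers cope with loss. It is a sketch under simplifying assumptions, none of which come from the slides: the function name `run_rdt30` is hypothetical, the channel drops exactly the transmissions listed by index, there is no corruption, and a timeout fires only when something was really lost (no premature timeouts).

```python
from typing import Iterable, List, Tuple


def run_rdt30(messages: List[str],
              lost_data: Iterable[int] = (),
              lost_acks: Iterable[int] = ()) -> Tuple[List[str], int]:
    """Simulate alternating-bit stop-and-wait over a lossy channel.

    lost_data / lost_acks hold the 0-based indices of the data and ACK
    transmissions that the channel drops. Returns the data delivered to
    the receiver's upper layer and the number of data transmissions.
    """
    lost_data, lost_acks = set(lost_data), set(lost_acks)
    delivered: List[str] = []
    expected_seq = 0            # receiver state
    data_tx = ack_tx = 0        # transmission counters
    for i, msg in enumerate(messages):
        seq = i % 2             # sender alternates 0, 1, 0, 1, ...
        acked = False
        while not acked:
            data_arrives = data_tx not in lost_data
            data_tx += 1
            if not data_arrives:
                continue        # no ACK comes back: timer expires, resend
            if seq == expected_seq:
                delivered.append(msg)   # new packet: deliver upward
                expected_seq ^= 1
            # The receiver ACKs the seq # it just received, new or duplicate.
            ack_arrives = ack_tx not in lost_acks
            ack_tx += 1
            acked = ack_arrives         # lost ACK: timer expires, resend
    return delivered, data_tx


print(run_rdt30(["a", "b", "c"]))                           # (['a', 'b', 'c'], 3)
print(run_rdt30(["a", "b"], lost_data=[0], lost_acks=[1]))  # (['a', 'b'], 4)
```

In the second run the first data packet and the second ACK are dropped. Both losses cost one retransmission, and the duplicate of "b" is recognized by its sequence # and not delivered twice.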

rdt3.0 sender

state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    (go to Wait for ACK0)
  rdt_rcv(rcvpkt)
    Λ
state: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    stop_timer
    (go to Wait for call 1 from above)
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    (go to Wait for ACK1)
  rdt_rcv(rcvpkt)
    Λ
state: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    stop_timer
    (go to Wait for call 0 from above)

rdt3.0 in action
[Figures: rdt3.0 sender/receiver timing diagrams (two slides)]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1 KB packet

$$T_{transmit} = \frac{L}{R} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

where L is the packet length in bits and R is the transmission rate in bps.

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U_sender: utilization, the fraction of time the sender is busy sending
1 KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
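The arithmetic above can be checked with a short calculation. This is a sketch, with `stop_and_wait_utilization` and `stop_and_wait_throughput` as illustrative names; it takes RTT = 30 ms (twice the 15 ms propagation delay) and 1 KB = 8000 bits, as the slide does.

```python
def stop_and_wait_utilization(packet_bits: float, rate_bps: float, rtt_s: float) -> float:
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return t_transmit / (rtt_s + t_transmit)


def stop_and_wait_throughput(packet_bits: float, rate_bps: float, rtt_s: float) -> float:
    """Bytes per second when one packet is sent per RTT + L/R."""
    t_transmit = packet_bits / rate_bps
    return (packet_bits / 8) / (rtt_s + t_transmit)


L_BITS = 8000        # 1 KB packet
R_BPS = 1e9          # 1 Gbps link
RTT_S = 0.030        # 2 x 15 ms propagation delay

print(round(stop_and_wait_utilization(L_BITS, R_BPS, RTT_S), 5))   # 0.00027
print(round(stop_and_wait_throughput(L_BITS, R_BPS, RTT_S) / 1000)) # about 33 (kB/sec)
```

The link could carry 125 MB/sec, so the protocol, not the hardware, is what limits throughput here.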

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram]
  first packet bit transmitted, t = 0
  last packet bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols
Pipelining: sender allows multiple, "in-flight," yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  • Go-Back-N protocols
  • Selective Repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight]
  first packet bit transmitted, t = 0
  last bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
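The same calculation generalizes to a window of several packets. The sketch below assumes the sender transmits `window` back-to-back packets per RTT + L/R and caps the result at 1.0; `pipelined_utilization` is an illustrative name, not something defined in the slides.

```python
def pipelined_utilization(window: int, packet_bits: float, rate_bps: float, rtt_s: float) -> float:
    """Sender utilization with `window` packets in flight: min(1, W * (L/R) / (RTT + L/R))."""
    t_transmit = packet_bits / rate_bps
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


# Same link as the stop-and-wait example: 1 KB packets, 1 Gbps, RTT = 30 ms.
one = pipelined_utilization(1, 8000, 1e9, 0.030)
three = pipelined_utilization(3, 8000, 1e9, 0.030)
print(round(three, 4))       # 0.0008
print(round(three / one))    # 3
```

Utilization grows linearly with the window until the pipe is full, which is why larger windows need a larger range of sequence numbers and more buffering.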

Go-Back-N
Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)
  only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  called a sliding-window protocol

GBN Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend
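The three sender events can be sketched as a small class. This is a model rather than a working protocol stack: `GbnSender`, the `sent` log that stands in for the channel, and the boolean `timer_running` are assumptions made for illustration, sequence numbers are unbounded (no k-bit wraparound), and stale ACKs are prevented from moving the window backwards.

```python
class GbnSender:
    """Go-Back-N sender events: data from above, ACK(n), timeout."""

    def __init__(self, window_size: int) -> None:
        self.N = window_size
        self.base = 1             # oldest unacknowledged seq #
        self.nextseqnum = 1       # next seq # to use
        self.sndpkt = {}          # seq # -> data for unacknowledged packets
        self.timer_running = False
        self.sent = []            # log of (seq #, data) handed to the channel

    def _udt_send(self, seq: int) -> None:
        self.sent.append((seq, self.sndpkt[seq]))

    def rdt_send(self, data: str) -> bool:
        if self.nextseqnum >= self.base + self.N:
            return False          # window full: refuse the data
        self.sndpkt[self.nextseqnum] = data
        self._udt_send(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True       # start timer for oldest packet
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum: int) -> None:
        # Cumulative ACK: everything up to and including acknum is received.
        self.base = max(self.base, acknum + 1)
        for seq in [s for s in self.sndpkt if s < self.base]:
            del self.sndpkt[seq]
        # Stop the timer if nothing is outstanding, otherwise restart it.
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self) -> None:
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self._udt_send(seq)   # resend every sent-but-unACKed packet


sender = GbnSender(window_size=3)
print([sender.rdt_send(x) for x in "abcd"])   # [True, True, True, False]
sender.on_ack(1)                              # window slides to [2, 4]
print(sender.rdt_send("d"))                   # True
sender.sent.clear()
sender.on_timeout()
print(sender.sent)                            # [(2, 'b'), (3, 'c'), (4, 'd')]
```

A single timeout resends the whole outstanding window, which is the cost that Selective Repeat is designed to avoid.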

GBN: sender extended FSM

initial action
  base = 1
  nextseqnum = 1
state: Wait
  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)
  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ

GBN: receiver extended FSM

initial action
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)
state: Wait
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++
  default
    udt_send(sndpkt)

If expected packet received:
  send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
  receiver only sends ACKs (no NAKs)
  can generate duplicate ACKs
  need only remember expectedseqnum
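A minimal sketch of this receiver, assuming unbounded sequence numbers that start at 1 and a `corrupt` flag in place of a real checksum test; the class name `GbnReceiver` is illustrative.

```python
class GbnReceiver:
    """Go-Back-N receiver: accept only the expected packet, ACK cumulatively."""

    def __init__(self) -> None:
        self.expectedseqnum = 1
        self.last_ack = 0         # ACK number re-sent in the default case
        self.delivered = []

    def rdt_rcv(self, seq: int, data: str, corrupt: bool = False) -> int:
        """Process one arriving packet and return the ACK number to send."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)   # in-order: deliver upward
            self.last_ack = seq
            self.expectedseqnum += 1
        # Otherwise the packet is discarded and the last ACK is repeated.
        return self.last_ack


receiver = GbnReceiver()
acks = [receiver.rdt_rcv(1, "a"), receiver.rdt_rcv(3, "c"),
        receiver.rdt_rcv(4, "d"), receiver.rdt_rcv(2, "b")]
print(acks)                 # [1, 1, 1, 2]
print(receiver.delivered)   # ['a', 'b']
```

Packets 3 and 4 arrive correctly but are thrown away because packet 2 is missing. The repeated ACK 1 is the duplicate ACK the sender may see.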

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission of only the required packets.


                          Selective Repeat

                          receiver individually acknowledges all correctly received pkts

                          buffers pkts as needed for eventual in-order delivery to upper layer

                          sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
    Compare to GBN, which only had a timer for the base packet

sender window
    N consecutive seq #'s
    again limits seq #s of sent, unACKed pkts
    Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender

data from above:
    if next available seq # is in window, send pkt
timeout(n):
    resend pkt n, restart timer
ACK(n) in [sendbase, sendbase+N]:
    mark pkt n as received
    if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)
otherwise:
    ignore
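The receiver rules on this slide can be sketched in Python as follows. The class name SRReceiver is hypothetical, and the sketch assumes unbounded sequence numbers (no wraparound) to keep the window tests simple.

```python
class SRReceiver:
    """Selective Repeat receiver: individual ACKs, buffering, in-order delivery."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}            # out-of-order packets waiting for the gap to fill
        self.delivered = []         # data passed up to the application, in order

    def receive(self, n, data):
        """Return the ACK number to send for packet n, or None to ignore it."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # deliver the in-order run starting at rcvbase and advance the window
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n                # already delivered: re-ACK so the sender can advance
        return None                 # outside both ranges: ignore
```

The re-ACK branch matters: if an earlier ACK was lost, the sender keeps retransmitting that packet until it sees an ACK for it.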


                          Selective repeat in action

Selective repeat: dilemma

Example:
    seq #'s: 0, 1, 2, 3
    window size = 3

    receiver sees no difference in the two scenarios
    incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
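One way to explore the question is to check when a retransmitted old window can be mistaken for the receiver's new window. The sketch below assumes the worst case, in which the receiver has accepted a full window and every ACK was lost; the function name sr_is_ambiguous is hypothetical.

```python
def sr_is_ambiguous(seq_space, window):
    """True if a resent old window overlaps the receiver's new window (mod seq_space)."""
    # Sequence numbers the sender may retransmit if all ACKs were lost.
    old_window = {n % seq_space for n in range(0, window)}
    # Sequence numbers the receiver now accepts as new data.
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)
```

With 4 sequence numbers, a window of 3 is ambiguous while a window of 2 is not, which suggests the window should be at most half the sequence number space.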


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
    one sender, one receiver
reliable, in-order byte stream:
    no "message boundaries"
pipelined:
    TCP congestion and flow control set window size
send & receive buffers
full duplex data:
    bi-directional data flow in same connection
    MSS: maximum segment size
connection-oriented:
    handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
    sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
    Depends upon implementation (can often be set)
    The max amount of application-layer data in a segment
    Application Data + TCP Header = TCP Segment

Three-way handshake
    Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment
    Server responds with a second special TCP segment (again no payload)
    Client responds with a third special segment. This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server,
    (i) buffers,
    (ii) variables, and
    (iii) a socket connection to the process

TCP only exists in the two end machines
    No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Header layout (each row is 32 bits wide):
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)

TCP seq. #'s and ACKs

Seq. #'s:
    byte stream "number" of first byte in segment's data
ACKs:
    seq # of next byte expected from other side
    cumulative ACK
Q: how does the receiver handle out-of-order segments?
    A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (Host A and Host B, time running downward):
    User types 'C':                              A -> B  Seq=42, ACK=79, data = 'C'
    Host ACKs receipt of 'C', echoes back 'C':   B -> A  Seq=79, ACK=43, data = 'C'
    Host ACKs receipt of echoed 'C':             A -> B  Seq=43, ACK=80

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
    longer than RTT
        but RTT varies
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss

Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
    SampleRTT will vary; want estimated RTT "smoother"
        average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

    Exponential weighted moving average
    influence of past sample decreases exponentially fast
    typical value: α = 0.125

Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

[Figure: SampleRTT and EstimatedRTT, plotted as RTT (milliseconds, 100 to 350) against time (seconds, 1 to 106)]

TCP Round Trip Time and Timeout

Setting the timeout
    EstimatedRTT plus "safety margin"
        large variation in EstimatedRTT -> larger safety margin
    first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4*DevRTT
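The three formulas can be combined into a small Python sketch. Several choices here are assumptions rather than anything the slides specify: the class name RttEstimator, seeding EstimatedRTT with the first sample, starting DevRTT at 0, and updating EstimatedRTT before DevRTT (the order in which the slides present them).

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with the first SampleRTT
        self.dev_rtt = 0.0                  # assumption: no deviation observed yet

    def update(self, sample_rtt):
        """Fold one new SampleRTT (not from a retransmission) into both averages."""
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * abs(sample_rtt - self.estimated_rtt)

    @property
    def timeout_interval(self):
        """EstimatedRTT plus a safety margin of four deviations."""
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from EstimatedRTT = 100 ms, a single 200 ms sample moves the estimate only to 112.5 ms but raises the timeout to 200 ms, because the deviation term reacts to the jump.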


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
    Pipelined segments
    Cumulative acks
    TCP uses single retransmission timer
Retransmissions are triggered by:
    timeout events
    duplicate acks
Initially consider simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control

TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval
timeout:
    retransmit segment that caused timeout
    restart timer
Ack rcvd:
    If it acknowledges previously unacked segments
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch(event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
    } /* end of loop forever */

Comment:
    SendBase-1: last cumulatively ack'ed byte
Example:
    SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
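The simplified sender can be written as an event-driven Python sketch. The class name SimpleTcpSender, the wire log (standing in for passing segments to IP) and the timer_running flag are assumptions made for illustration; as in the pseudocode, duplicate ACKs, flow control and congestion control are ignored.

```python
class SimpleTcpSender:
    """Simplified TCP sender: cumulative ACKs and a single retransmission timer."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq # of first byte -> segment data
        self.timer_running = False
        self.wire = []              # (seq, length) of every segment handed to IP

    def on_app_data(self, data):
        """Event: data received from the application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((seq, len(data)))
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout. Resend the oldest unACKed segment and restart the timer."""
        if self.unacked:
            seq = min(self.unacked)
            self.wire.append((seq, len(self.unacked[seq])))
            self.timer_running = True

    def on_ack(self, y):
        """Event: ACK received with ACK field value y."""
        if y > self.send_base:
            self.send_base = y
            # drop every segment whose last byte lies below y
            for seq in [s for s in self.unacked if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            # keep the timer running only while segments are outstanding
            self.timer_running = bool(self.unacked)
```

Sending 8 bytes at sequence number 92 and then 20 bytes gives segments at 92 and 100; an ACK of 100 advances SendBase to 100 and leaves only the second segment eligible for retransmission.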

TCP retransmission scenarios

Lost ACK scenario (Host A sends, Host B receives):
    A sends Seq=92, 8 bytes data
    B replies ACK=100; the ACK is lost (X)
    the timer for Seq=92 times out at A
    A resends Seq=92, 8 bytes data
    B replies ACK=100
    SendBase = 100

Premature timeout scenario:
    A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    B replies ACK=100, then ACK=120
    the timer for Seq=92 times out at A before the ACKs arrive
    A resends Seq=92, 8 bytes data
    SendBase = 100, then SendBase = 120 as the ACKs arrive
    B replies ACK=120 to the retransmission; SendBase stays at 120

TCP retransmission scenarios (more)

Cumulative ACK scenario:
    A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    B replies ACK=100, which is lost (X), then ACK=120
    ACK=120 arrives at A before the timeout
    SendBase = 120; nothing is retransmitted

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
    -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK
Arrival of in-order segment with expected seq #; one other segment has ACK pending
    -> Immediately send single cumulative ACK, ACKing both in-order segments
Arrival of out-of-order segment with higher-than-expected seq #; gap detected
    -> Immediately send duplicate ACK, indicating seq # of next expected byte
Arrival of segment that partially or completely fills gap
    -> Immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
    Used by most TCP implementations
    If a timeout occurs, then after retransmission the Timeout Interval is doubled
    Intervals grow exponentially with each consecutive timeout
    When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    Limited form of congestion control
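A short Python sketch of the doubling rule. The function name backoff_intervals is hypothetical, and base_interval stands for the TimeoutInterval computed from EstimatedRTT and DevRTT.

```python
def backoff_intervals(base_interval, consecutive_timeouts):
    """Timeout interval in force initially and after each consecutive timeout."""
    # Each timeout doubles the previous interval: base, 2*base, 4*base, ...
    return [base_interval * 2 ** k for k in range(consecutive_timeouts + 1)]
```

For a base interval of 1 second and three consecutive timeouts, the intervals are 1, 2, 4 and 8 seconds; new data or an ACK resets the interval to the computed value.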

Fast Retransmit

Time-out period often relatively long:
    long delay before resending lost packet
Detect lost segments via duplicate ACKs
    Sender often sends many segments back-to-back
    If a segment is lost, there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    fast retransmit: resend segment before timer expires

Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {    /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y    /* fast retransmit */
            }
        }
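The duplicate-ACK counting can be sketched in Python as follows. The class name FastRetransmit is hypothetical, and the sketch keeps a single counter that is reset whenever new data is acknowledged, which is an assumption about how "count of dup ACKs received for y" is stored.

```python
class FastRetransmit:
    """Counts duplicate ACKs and signals a fast retransmit on the third one."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0

    def on_ack(self, y):
        """Return the seq # to resend now, or None if nothing should be resent."""
        if y > self.send_base:
            self.send_base = y      # new data acknowledged
            self.dup_acks = 0
            return None
        self.dup_acks += 1          # duplicate ACK for already ACKed data
        if self.dup_acks == 3:
            return y                # fast retransmit the segment starting at y
        return None
```

Only the third duplicate triggers a resend; further duplicates for the same y are ignored in this sketch.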

TCP: GBN or Selective Repeat?

                          Basic TCP looks a lot like GBN

                          Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                          This looks a lot like Selective Repeat

                          TCP is a hybrid


TCP Flow Control

    Sender should not overwhelm receiver's capacity to receive data
    If necessary, sender should slow down its transmission rate to accommodate receiver's rate
    Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

    receive side of TCP connection has a receive buffer
    app process may be slow at reading from buffer
    speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

    Rcvr advertises spare room by including value of RcvWindow in segments
    Sender limits unACKed data to RcvWindow
        guarantees receive buffer doesn't overflow
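The window arithmetic can be illustrated with a small Python sketch. The function names rcv_window and sender_may_send are hypothetical; the variable names follow the slide, plus LastByteSent and LastByteAcked on the sender side.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, rcv_win, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    return (last_byte_sent + nbytes) - last_byte_acked <= rcv_win
```

For example, a 4096-byte buffer holding 2000 unread bytes advertises RcvWindow = 2096, so a sender with 1000 bytes already in flight may send 1000 more but not 1100.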

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
    Sender does not transmit new packets until it hears RcvWindow > 0
    Receiver never sends RcvWindow > 0 since it has no new ACKs to send to the sender
    DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.

Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
    initialize TCP variables:
        seq. #s
        buffers, flow control info (e.g. RcvWindow)
    client: connection initiator
        Socket clientSocket = new Socket("hostname", "port number");
    server: contacted by client
        Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
    Allocates buffers
    Can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

Message exchange between client and server:
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
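The sequence and acknowledgement numbers exchanged in the three steps can be traced with a small Python sketch. The function name three_way_handshake and the dictionary representation of a segment are assumptions made for illustration only.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three control segments of the handshake as dictionaries."""
    # Step 1: client requests a connection and announces its initial seq #.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server grants it, announces its own ISN and ACKs the client's SYN.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client ACKs the server's SYN; SYN=0 marks the connection as established.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]
```

With client_isn = 1000 and server_isn = 5000, the server acknowledges 1001 and the client acknowledges 5001.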

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies ACK, then sends FIN (close); client replies ACK and enters timed wait; closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
    Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
    Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client sends FIN (closing); server replies ACK, then sends FIN (closing); client replies ACK and enters timed wait; server closed on receiving the ACK; client closed after timed wait]

TCP Connection Management (cont)

[Figure: example TCP client lifecycle]

[Figure: example TCP server lifecycle]


                          A few special cases

                          Have not discussed what happens if both client and server decide to close down connection at same time

                          It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for network to handle"
    different from flow control
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem

Causes/costs of congestion: scenario 1

    two senders, two receivers
    one router, infinite buffers
    no retransmission
    Send rate: 0 to C/2

    large delays when congested
    maximum achievable throughput

Causes/costs of congestion: scenario 2

    one router, finite buffers
    sender retransmission of lost packet

(λin: original data sent by the application; λ'in: offered load, original data plus retransmissions; λout: goodput)

(a), (b) & (c) always: λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
    (b) and (c): more work (retrans) for given "goodput"
    (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b) and (c)]

Causes/costs of congestion: scenario 3

    four senders
    multihop paths
    timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
    when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
    "elastic service"
    if sender's path "underloaded":
        sender should use available bandwidth
    if sender's path congested:
        sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path
EFCI bit in data cells: set to 1 by congested switch
    Signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


                          3 Transport Layer 103Comp 361 Spring 2005

TCP Congestion Control

End-end control (no network assistance).
Transmission rate is limited by the congestion window size, CongWin, over segments.
CongWin is dynamically modified to reflect perceived congestion.

With w segments, each of MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec
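The throughput relation above can be checked with a few lines of Python. This is only a sketch: the function name and the sample values (10 segments, 500-byte MSS, 200 msec RTT) are assumptions chosen for illustration, not figures from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent once per RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical values: a 10-segment window, 500-byte MSS, 200 msec RTT.
rate = window_throughput(w_segments=10, mss_bytes=500, rtt_seconds=0.2)
print(f"{rate:.0f} bytes/sec")
```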

To simplify the presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: the sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does the sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after a loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative behavior after timeout events

TCP AIMD

Multiplicative decrease: cut CongWin in half after a loss event.
Additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance.

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection.]
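A minimal sketch of the AIMD rule, assuming the window is measured in units of MSS and that loss events occur in hand-picked RTT rounds (the function name, the starting window and the loss rounds are all illustrative assumptions):

```python
def aimd_trace(start_mss, loss_rounds, total_rounds):
    """Return CongWin (in MSS units) at the start of each RTT round."""
    cong_win = float(start_mss)
    trace = []
    for rnd in range(total_rounds):
        trace.append(cong_win)
        if rnd in loss_rounds:
            # Multiplicative decrease: halve the window, never below 1 MSS.
            cong_win = max(1.0, cong_win / 2)
        else:
            # Additive increase: one MSS per RTT without loss.
            cong_win += 1.0
    return trace


# Hypothetical run: losses in rounds 4 and 9 produce the sawtooth.
trace = aimd_trace(start_mss=8, loss_rounds={4, 9}, total_rounds=12)
print(trace)
```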

TCP Slow Start

When connection begins, CongWin = 1 MSS.
  Example: MSS = 500 bytes and RTT = 200 msec give an initial rate of 20 kbps.
Available bandwidth may be >> MSS/RTT, so it is desirable to quickly ramp up to a respectable rate.
When connection begins, increase rate exponentially fast until first loss event.

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin (by 1 MSS) for every ACK received
Summary: initial rate is slow but ramps up exponentially fast.

[Figure: timeline between Host A and Host B; Host A sends one segment in the first RTT, two segments in the second and four segments in the third.]
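The doubling follows directly from the per-ACK increment: every segment sent in a round is acknowledged, and each ACK adds one MSS. The sketch below assumes no loss, one ACK per segment, and a window counted in MSS units; the function names are illustrative.

```python
def slow_start_windows(rounds, mss=1):
    """Window size (in MSS units) at the start of each RTT round, without loss."""
    cong_win = mss
    sizes = []
    for _ in range(rounds):
        sizes.append(cong_win)
        acks = cong_win // mss        # one ACK per segment sent this round
        for _ in range(acks):
            cong_win += mss           # CongWin grows by 1 MSS per ACK
    return sizes


def initial_rate_bps(mss_bytes, rtt_seconds):
    """Sending rate while CongWin is a single MSS."""
    return mss_bytes * 8 / rtt_seconds


print(slow_start_windows(4))          # one, two, four, eight segments
print(initial_rate_bps(500, 0.2))     # the slide's example: about 20 kbps
```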

So far:
  Slow start ramps up exponentially
  followed by the AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce a new variable, threshold; threshold is initially very large.
  Slow-start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD.
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to slow start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                          The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
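The event table above maps naturally onto a small state machine. The following is a sketch under stated assumptions, not a TCP implementation: the class and method names are invented for illustration, the window is tracked in MSS units, and only the congestion-control variables are modeled (no segments, timers or retransmissions).

```python
from dataclasses import dataclass


@dataclass
class RenoSender:
    mss: float = 1.0
    cong_win: float = 1.0
    threshold: float = 64.0      # "initially very large"; 64 is an arbitrary choice
    state: str = "SS"            # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # not below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)   # has crossed the threshold into CA
```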

TCP throughput

What's the average throughput of TCP as a function of window size and RTT? Ignore slow start.

Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after loss, the window drops to W/2 and throughput to W/(2 RTT).
Because the window grows linearly from W/2 back to W, the average lies midway between the two.
Average throughput: 0.75 W/RTT.

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed networks are needed.
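Both numbers in this example can be reproduced by rearranging the two relations: W = throughput · RTT / MSS, and L = (1.22 · MSS / (RTT · throughput))². The function names below are illustrative assumptions; the inputs are the slide's own values.

```python
def required_window_segments(target_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain target_bps over one RTT."""
    return target_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(target_bps, rtt_s, mss_bytes):
    """Loss rate L solving throughput = 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * target_bps)) ** 2


w = required_window_segments(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```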

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]

Why is TCP fair?

Two competing sessions:
  additive increase gives a slope of 1 as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
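A small simulation makes the argument concrete: additive increase leaves the gap between the two rates unchanged, while each halving cuts the gap in half, so the rates drift toward the equal-share line. This is a simplified sketch that assumes both sessions see every loss at the same time and share the same RTT; the starting rates and capacity are arbitrary.

```python
def share_link(x1, x2, capacity, rounds):
    """Evolve two AIMD flows that lose together whenever the link is overloaded."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2      # loss: both halve their rate
        else:
            x1, x2 = x1 + 1, x2 + 1      # no loss: both add one unit
    return x1, x2


# Hypothetical start: very unequal rates on a link of capacity 100.
x1, x2 = share_link(10.0, 70.0, capacity=100.0, rounds=1000)
print(round(x1, 3), round(x2, 3))
```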

Fairness (more)

Fairness and UDP:
  Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
  Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss.
  Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections:
  Nothing prevents an app from opening parallel connections between 2 hosts; Web browsers do this.
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP connection, gets rate R/10
    new app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections)
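The arithmetic behind this example is just per-connection sharing. The sketch below assumes every connection receives an equal slice of the link, which is the idealized fairness goal rather than a guarantee; the function name is illustrative.

```python
from fractions import Fraction


def app_share(app_connections, other_connections):
    """Fraction of link rate R obtained by an app, assuming equal per-connection shares."""
    return Fraction(app_connections, app_connections + other_connections)


print(app_share(1, 9))    # one new connection next to 9 existing ones
print(app_share(11, 9))   # eleven parallel connections next to the same 9
```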

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Notation, assumptions:
  Assume one link between client and server, of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume a fixed congestion window of W segments
  Then a dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data is sent.
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data is sent.
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object.
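The two cases can be folded into one function. This sketch assumes K is the object size divided by the window size in bits, rounded up (the number of windows that cover the object); the function name and the sample values are illustrative.

```python
def fixed_window_latency(obj_bits, seg_bits, rate_bps, rtt, window):
    """Latency for a fixed congestion window of `window` segments, no loss."""
    k = -(-obj_bits // (window * seg_bits))       # ceil(O / (W*S)) on integers
    seg_time = seg_bits / rate_bps                # S/R
    base = 2 * rtt + obj_bits / rate_bps          # 2RTT + O/R
    stall = seg_time + rtt - window * seg_time    # S/R + RTT - WS/R
    if stall <= 0:
        return base                               # case 1: sender never waits
    return base + (k - 1) * stall                 # case 2: stall after each window but the last


# Hypothetical link: S = 1000 bits, R = 1 Mbps, RTT = 100 msec, O = 100000 bits.
print(fixed_window_latency(100_000, 1000, 1_000_000, 0.1, window=4))
print(fixed_window_latency(100_000, 1000, 1_000_000, 0.1, window=200))
```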

Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data is sent.

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2
Server idles P = 2 times.

[Figure: timeline of time at client versus time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]
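The slow-start latency formula can be evaluated directly. In this sketch K is found as the smallest k with (2^k - 1)·S ≥ O, and Q is found by counting the windows after which the idle time S/R + RTT - 2^(k-1)·S/R is still positive. The function name is illustrative, and the sample link (S/R = 1 second, RTT = 2 seconds) is an assumption picked so that the slide's example values K = 4 and Q = 2 come out.

```python
def slow_start_latency(obj_bits, seg_bits, rate_bps, rtt):
    """Return (K, Q, P, latency) for slow start with no threshold and no loss."""
    seg_time = seg_bits / rate_bps                       # S/R

    # K: number of windows (1, 2, 4, ... segments) that cover the object.
    k = 1
    while (2 ** k - 1) * seg_bits < obj_bits:
        k += 1

    # Q: number of windows after which an infinite object would still stall.
    q = 0
    while seg_time + rtt - (2 ** q) * seg_time > 0:
        q += 1

    p = min(q, k - 1)
    latency = (2 * rtt + obj_bits / rate_bps
               + p * (rtt + seg_time) - (2 ** p - 1) * seg_time)
    return k, q, p, latency


# O/S = 15 segments, S/R = 1 s, RTT = 2 s (hypothetical link parameters).
print(slow_start_latency(obj_bits=15_000, seg_bits=1000, rate_bps=1000, rtt=2.0))
```

With these inputs the two idle periods are 2 seconds and 1 second, which matches the 3 seconds the closed form adds on top of 2RTT + O/R.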

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timeline of time at client versus time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

TCP Latency Modeling (4)

Recall K = number of windows that cover the object. How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.

HTTP Modeling

Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
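The three response-time expressions can be compared numerically. This sketch deliberately leaves out the "sum of idle times" term (an assumption that makes the numbers lower bounds), takes 1 Kbyte as 1000 bytes, and uses an illustrative function name.

```python
def http_response_times(obj_bits, rate_bps, rtt, m_images, x_parallel):
    """Response times in seconds, ignoring slow-start idle times."""
    transmit = (m_images + 1) * obj_bits / rate_bps          # (M+1) O/R
    return {
        "non-persistent": transmit + (m_images + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel non-persistent": transmit + (m_images / x_parallel + 1) * 2 * rtt,
    }


# RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, on a 1 Mbps link.
for name, seconds in http_response_times(40_000, 1_000_000, 0.1, 10, 5).items():
    print(f"{name}: {seconds:.2f} s")
```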

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (axis from 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (axis from 0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.

Chapter 3: Summary

Principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

Instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                          • Chapter 3 Transport Layer last revised 160305
                          • Chapter 3 outline
                          • Transport services and protocols
                          • Transport vs network layer
                          • Transport-layer protocols
                          • Chapter 3 outline
                          • Multiplexing/demultiplexing
                          • Multiplexing/demultiplexing
                          • How demultiplexing works
                          • Connectionless demultiplexing
                          • Connectionless demux (cont)
                          • Connection-oriented demux
                          • Connection-oriented demux (cont)
                          • Connection-oriented demux Threaded Web Server
                          • Chapter 3 outline
                          • UDP User Datagram Protocol [RFC 768]
                          • UDP more
                          • UDP checksum
                          • Chapter 3 outline
                          • Principles of Reliable data transfer
                          • Reliable data transfer getting started
                          • Reliable data transfer getting started
                          • Incremental Improvements
                          • rdt1.0: reliable transfer over a reliable channel
                          • rdt2.0: channel with bit errors
                          • rdt2.0: FSM specification
                          • rdt2.0: operation with no errors
                          • rdt2.0: error scenario
                          • rdt2.0 has a fatal flaw
                          • rdt2.1: sender, handles garbled ACK/NAKs
                          • rdt2.1: receiver, handles garbled ACK/NAKs
                          • rdt2.1: discussion
                          • rdt2.2: a NAK-free protocol
                          • rdt2.2: sender, receiver fragments
                          • rdt3.0: channels with errors and loss
                          • rdt3.0 sender
                          • rdt3.0 in action
                          • rdt3.0 in action
                          • Performance of rdt3.0
                          • rdt3.0: stop-and-wait operation
                          • Pipelined protocols
                          • Pipelined protocols
                          • Pipelining increased utilization
                          • Go-Back-N
                          • GBN Sender
                          • GBN sender extended FSM
                          • GBN receiver extended FSM
                          • More on receiver
                          • GBN in action
                          • Selective Repeat
                          • Selective repeat sender receiver windows
                          • Selective repeat
                          • Selective repeat in action
                          • Selective repeat dilemma
                          • Chapter 3 outline
                          • TCP Overview RFCs 793 1122 1323 2018 2581
                          • More TCP Details
                          • Even More TCP Details
                          • TCP segment structure
                          • TCP seq. #'s and ACKs
                          • TCP Round Trip Time and Timeout
                          • TCP Round Trip Time and Timeout
                          • Example RTT estimation
                          • TCP Round Trip Time and Timeout
                          • Chapter 3 outline
                          • TCP reliable data transfer
                          • TCP sender events
                          • TCP sender(simplified)
                          • TCP retransmission scenarios
                          • TCP retransmission scenarios (more)
                          • TCP ACK generation [RFC 1122 RFC 2581]
                          • More on Sender Policies
                          • Fast Retransmit
                          • Fast retransmit algorithm
                          • TCP GBN or Selective Repeat
                          • Chapter 3 outline
                          • TCP Flow Control
                          • TCP Flow Control
                          • TCP segment structure
                          • TCP Flow control how it works
                          • Technical Issue
                          • Chapter 3 outline
                          • TCP Connection Management
                          • TCP Connection Management (cont)
                          • TCP Connection Management (cont)
                          • TCP Connection Management (cont)
                          • TCP Connection Management (cont)
                          • A few special cases
                          • Chapter 3 outline
                          • Principles of Congestion Control
                          • Causes/costs of congestion: scenario 1
                          • Causes/costs of congestion: scenario 2
                          • Causes/costs of congestion: scenario 3
                          • Causes/costs of congestion: scenario 3
                          • Approaches towards congestion control
                          • Case study ATM ABR congestion control
                          • Case study ATM ABR congestion control
                          • Chapter 3 outline
                          • TCP Congestion Control
                          • TCP AIMD
                          • TCP Slow Start
                          • TCP Slow Start (more)
                          • Summary TCP Congestion Control
                          • The Big Picture
                          • TCP sender congestion control
                          • TCP throughput
                          • TCP Futures
                          • TCP Fairness
                          • Why is TCP fair
                          • Fairness (more)
                          • TCP Latency Modeling
                          • Fixed Congestion Window (W)
                          • Fixed congestion window (1)
                          • Fixed congestion window (2)
                          • TCP Latency Modeling Slow Start (1)
                          • TCP Latency Modeling Slow Start (2)
                          • TCP Latency Modeling (3)
                          • TCP Latency Modeling (4)
                          • HTTP Modeling
                          • Chapter 3 Summary

Connection-oriented demux: Threaded Web Server

[Figure: clients at IP A and IP B (processes P1 to P4) send segments to a threaded Web server at IP C; each segment is labeled with its source port (SP 9157 or SP 5775), destination port (DP 80), source IP (S-IP A or S-IP B) and destination IP (D-IP C).]


UDP: User Datagram Protocol [RFC 768]

"No frills", "bare bones" Internet transport protocol.
"Best effort" service; UDP segments may be:
  lost
  delivered out of order to app
Connectionless:
  no handshaking between UDP sender and receiver
  each UDP segment handled independently of others

Why is there a UDP?
  no connection establishment (which can add delay)
  simple: no connection state at sender, receiver
  small segment header (8 bytes)
  no congestion control: UDP can blast away as fast as desired

UDP: more

Often used for streaming multimedia apps:
  loss tolerant
  rate sensitive
Other UDP uses (why?):
  DNS: small delay
  SNMP: stressful conditions
Reliable transfer over UDP: add reliability at the application layer
  application-specific error recovery

UDP segment format (32 bits wide):
  source port # | dest port #
  length | checksum
  application data (message)
Length is in bytes of the UDP segment, including the header.
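The segment format above is four 16-bit header fields followed by the payload, which Python's `struct` module can pack in network byte order. This is an illustrative sketch only: the function name and port numbers are assumptions, and the checksum is left at 0 rather than computed.

```python
import struct


def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Pack source port, dest port, length and checksum, then append the data."""
    length = 8 + len(payload)            # header (8 bytes) plus application data
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


segment = build_udp_segment(9157, 53, b"hello")
src, dst, length, checksum = struct.unpack("!HHHH", segment[:8])
print(src, dst, length, checksum)
```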

UDP checksum

Goal: detect "errors" (e.g., flipped bits) in transmitted segment.

Sender:
  treat segment contents as a sequence of 16-bit integers
  checksum: addition (1's complement sum) of segment contents
  sender puts checksum value into UDP checksum field

Receiver:
  compute checksum of received segment
  check if computed checksum equals checksum field value:
    NO: error detected
    YES: no error detected. But maybe errors nonetheless? More later.
  receiver may choose to discard the segment or send a warning to the app in case of error
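A sketch of the 16-bit one's complement arithmetic described above. It is simplified: it covers only the bytes it is given (a real UDP checksum also covers a pseudo-header), the function names are illustrative, and the stored field is taken to be the bitwise complement of the sum, so that the receiver's sum over data plus checksum comes out as all ones.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian words, wrapping any carry back in."""
    if len(data) % 2:
        data += b"\x00"                        # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum16(data: bytes) -> int:
    """Value the sender stores: the complement of the one's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def looks_intact(data: bytes, checksum: int) -> bool:
    """Receiver check: sum of data plus checksum must be all ones."""
    total = ones_complement_sum16(data) + checksum
    total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


words = bytes.fromhex("666055558f0c")          # three sample 16-bit words
print(hex(ones_complement_sum16(words)), hex(checksum16(words)))
print(looks_intact(words, checksum16(words)))
```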


Principles of Reliable data transfer

Important in application, transport and link layers.
Top-10 list of important networking topics.

Characteristics of the unreliable channel will determine the complexity of the reliable data transfer protocol (rdt).

Reliable data transfer: getting started

Send side:
  rdt_send(): called from above (e.g., by app); passed data to deliver to receiver's upper layer
  udt_send(): called by rdt to transfer packet over unreliable channel to receiver

Receive side:
  rdt_rcv(): called when packet arrives on rcv-side of channel
  deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started

We'll:
  incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
  consider only unidirectional data transfer
    but control info will flow in both directions
  use finite state machines (FSM) to specify sender, receiver

[Figure: FSM notation. An arrow from state 1 to state 2 is labeled with the event causing the state transition above the line and the actions taken on the state transition below it (event / actions). State: when in this "state", the next state is uniquely determined by the next event.]

Incremental Improvements

rdt1.0: assumes every packet sent arrives, and no errors are introduced in transmission.

rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces the concept of ACK and NAK.

rdt2.1: deals with corrupted ACKs/NAKs.

rdt2.2: like rdt2.1 but does not need NAKs.

rdt3.0: allows packets to be lost.

rdt1.0: reliable transfer over a reliable channel

Underlying channel perfectly reliable:
  no bit errors
  no loss of packets
Separate FSMs for sender, receiver:
  sender sends data into underlying channel
  receiver reads data from underlying channel

Sender FSM (single state "Wait for call from above"):
  rdt_send(data) / packet = make_pkt(data); udt_send(packet)

Receiver FSM (single state "Wait for call from below"):
  rdt_rcv(packet) / extract(packet, data); deliver_data(data)

rdt2.0: channel with bit errors

Underlying channel may flip bits in packet
  recall: UDP checksum to detect bit errors
The question: how to recover from errors?
  acknowledgements (ACKs): receiver explicitly tells sender that pkt was received OK
  negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  sender retransmits pkt on receipt of NAK
  human scenarios using ACKs, NAKs?
New mechanisms in rdt2.0 (beyond rdt1.0):
  error detection
  receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification

sender
  state: Wait for call from above
    rdt_send(data)
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      -> Wait for ACK or NAK
  state: Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      Λ
      -> Wait for call from above

receiver
  state: Wait for call from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      extract(rcvpkt, data)
      deliver_data(data)
      udt_send(ACK)
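The rdt2.0 exchange can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function names echo the FSM labels, while the 16-bit additive checksum, the `flip_bit` helper and the `corrupt_first_n` parameter are assumptions introduced only to make the example self-contained. ACKs and NAKs are assumed to arrive intact, as rdt2.0 itself assumes.

```python
def checksum(data):
    """Toy 16-bit additive checksum (a stand-in, not the Internet checksum)."""
    return sum(data) & 0xFFFF


def make_pkt(data):
    """Bundle the payload with its checksum."""
    return (data, checksum(data))


def corrupt(pkt):
    """True if the payload no longer matches the carried checksum."""
    data, chk = pkt
    return checksum(data) != chk


def flip_bit(pkt):
    """Hypothetical channel error: flip the low bit of the first payload byte."""
    data, chk = pkt
    return (bytes([data[0] ^ 0x01]) + data[1:], chk)


def receiver(pkt, delivered):
    """rdt2.0 receiver: NAK a corrupt packet, otherwise deliver it and ACK."""
    if corrupt(pkt):
        return "NAK"
    delivered.append(pkt[0])
    return "ACK"


def send(data, delivered, corrupt_first_n=0):
    """rdt2.0 sender: retransmit sndpkt until an ACK comes back.

    Returns the number of transmissions that were needed.
    """
    sndpkt = make_pkt(data)
    transmissions = 0
    while True:
        on_wire = flip_bit(sndpkt) if transmissions < corrupt_first_n else sndpkt
        transmissions += 1
        if receiver(on_wire, delivered) == "ACK":
            return transmissions
```

Calling `send(b"hello", delivered, corrupt_first_n=1)` takes two transmissions: the first copy is NAKed, the retransmission is delivered.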

rdt2.0: operation with no errors

[Figure: the rdt2.0 FSM with the error-free path highlighted]
  sender: rdt_send(data) -> sndpkt = make_pkt(data, checksum), udt_send(sndpkt) -> Wait for ACK or NAK
  receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt, data), deliver_data(data), udt_send(ACK)
  sender: rdt_rcv(rcvpkt) && isACK(rcvpkt) -> Λ -> Wait for call from above

rdt2.0: error scenario

[Figure: the rdt2.0 FSM with the error path highlighted]
  sender: rdt_send(data) -> sndpkt = make_pkt(data, checksum), udt_send(sndpkt) -> Wait for ACK or NAK
  receiver: rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> udt_send(NAK)
  sender: rdt_rcv(rcvpkt) && isNAK(rcvpkt) -> udt_send(sndpkt), stay in Wait for ACK or NAK
  receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt, data), deliver_data(data), udt_send(ACK)

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
  sender doesn't know what happened at receiver!
  can't just retransmit: possible duplicate. But receiver waiting!

What to do?
  sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
  retransmit, but this might cause retransmission of correctly received pkt! Receiver won't know about duplication!

Handling duplicates:
  sender adds sequence number (0/1) to each pkt
  sender retransmits current pkt if ACK/NAK garbled
  receiver discards (doesn't deliver up) duplicate pkt
  Duplicate packet is one with same sequence # as previous packet

stop and wait
  Sender sends one packet, then waits for receiver response

Sender: whenever sender receives a control message it sends a packet to receiver
  A valid ACK: sends next packet (if one exists) with new sequence #
  A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
  If received packet is corrupt: send NAK
  If received packet is valid and has a different sequence # than the prev packet: send ACK and deliver new data up
  If received packet is valid and has the same sequence # as the prev packet, i.e., is a retransmission or duplicate: send ACK

Note: ACK/NAK do not contain sequence #

rdt2.1: sender, handles garbled ACK/NAKs

state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 0 from above

rdt2.1: receiver, handles garbled ACK/NAKs

state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
state: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)

rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq #'s (0, 1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must "remember" whether "current" pkt has 0 or 1 seq #

Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver cannot know if its last ACK/NAK was received OK at sender
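A small Python sketch of the receiver side may help show why a one-bit sequence number is enough for stop-and-wait. The class name and the `is_corrupt` flag are assumptions made for illustration; corruption detection itself is abstracted away.

```python
class Rdt21Receiver:
    """Alternating-bit receiver: one state bit says which seq # is expected."""

    def __init__(self):
        self.expected_seq = 0
        self.delivered = []

    def on_packet(self, seq, data, is_corrupt=False):
        """Return the control message sent back for this packet."""
        if is_corrupt:
            return "NAK"
        if seq == self.expected_seq:
            # New data: deliver it and flip the expected bit.
            self.delivered.append(data)
            self.expected_seq ^= 1
        # A valid duplicate is ACKed again but never delivered twice.
        return "ACK"
```

If the ACK for packet 0 is garbled, the sender retransmits packet 0; the receiver now expects 1, recognises the duplicate, and simply re-ACKs it.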

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment
  state: Wait for call 0 from above
    rdt_send(data)
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      -> Wait for ACK 0
  state: Wait for ACK 0
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
      Λ

receiver FSM fragment
  transition into Wait for 0 from below:
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum)
      udt_send(sndpkt)
  state: Wait for 0 from below
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until it is certain that data or ACK was lost, then retransmits
  yuck: drawbacks?

Approach: sender waits a "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions are only triggered by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be a duplicate, but use of seq #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer

rdt3.0 sender

state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK0
  rdt_rcv(rcvpkt)
    Λ
state: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    stop_timer
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK1
  rdt_rcv(rcvpkt)
    Λ
state: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    stop_timer
    -> Wait for call 0 from above
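The timeout-driven behaviour of rdt3.0 can be explored with a small deterministic simulation. This is only a sketch under stated assumptions: the function name and the scripted outcomes ("ok", "lose_data", "lose_ack") are hypothetical, a timeout is modelled simply as "no ACK came back for this attempt", and corruption is left out.

```python
def stop_and_wait(messages, outcome_script):
    """Simulate an alternating-bit sender and receiver over a lossy channel.

    outcome_script lists what happens to each transmission attempt, in order:
    "ok", "lose_data" or "lose_ack". Attempts beyond the script succeed.
    Returns (delivered_messages, total_transmissions).
    """
    outcomes = iter(outcome_script)
    delivered = []
    expected = 0          # receiver state: seq # it is waiting for
    seq = 0               # sender state: seq # of the current packet
    transmissions = 0

    for msg in messages:
        while True:
            transmissions += 1
            outcome = next(outcomes, "ok")
            if outcome == "lose_data":
                continue  # nothing arrives; timer expires; retransmit
            # The data packet reached the receiver.
            if seq == expected:
                delivered.append(msg)
                expected ^= 1
            # The receiver ACKs with the seq # it saw (new or duplicate).
            if outcome == "lose_ack":
                continue  # ACK lost; timer expires; retransmit a duplicate
            break         # ACK for seq arrived: stop_timer, next message
        seq ^= 1
    return delivered, transmissions
```

With the script `["lose_data", "ok", "lose_ack", "ok"]` and two messages, four transmissions are needed, and the duplicate caused by the lost ACK is not delivered twice.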

[Figures: rdt3.0 in action (two slides of sender/receiver timing diagrams)]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \quad \text{(times in msec)}$$

U_sender: utilization, the fraction of time sender is busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
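The arithmetic on this slide is easy to reproduce. The sketch below assumes the slide's numbers (L = 8000 bits, R = 1 Gbps, RTT = 30 ms); the function name is illustrative.

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender spends transmitting: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return t_transmit / (rtt_s + t_transmit)


L_BITS = 8000        # 1 KB packet
R_BPS = 1e9          # 1 Gbps link
RTT_S = 0.030        # 2 x 15 ms propagation delay

u = stop_and_wait_utilization(L_BITS, R_BPS, RTT_S)
throughput_bytes_per_s = (L_BITS / 8) / (RTT_S + L_BITS / R_BPS)

print(f"U_sender   = {u:.5f}")                          # 0.00027
print(f"throughput = {throughput_bytes_per_s / 1000:.0f} kB/sec")  # 33 kB/sec
```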

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram]
  t = 0: first packet bit transmitted
  t = L/R: last packet bit transmitted
  first packet bit arrives at receiver
  last packet bit arrives, send ACK
  t = RTT + L/R: ACK arrives, send next packet

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  Go-Back-N protocols
  selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight]
  t = 0: first packet bit transmitted
  t = L/R: last bit transmitted
  first packet bit arrives at receiver
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  t = RTT + L/R: ACK arrives, send next packet

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increases utilization by a factor of 3!
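To see how utilization scales with the number of packets kept in flight, the formula can be generalised from 3 to N packets. This is a sketch using the same example link as the slide; the cap at 1.0 is an added assumption reflecting that a sender cannot be busy more than all of the time.

```python
def pipelined_utilization(n_packets, packet_bits, rate_bps, rtt_s):
    """Sender utilization with n_packets in flight: N*(L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return min(1.0, n_packets * t_transmit / (rtt_s + t_transmit))


# Same example link: 1 KB packets, 1 Gbps, 30 ms RTT.
for n in (1, 3, 100, 1000):
    u = pipelined_utilization(n, 8000, 1e9, 0.030)
    print(f"N = {n:4d}  U_sender = {u:.5f}")
```

Utilization grows linearly with N until the pipe is full, which is why the window needs to be sized to the bandwidth-delay product of the path.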

Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)
  Only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol

GBN: Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend

GBN: sender extended FSM

initially:
  base = 1
  nextseqnum = 1

state: Wait
  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)
  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ
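The extended FSM translates fairly directly into code. The Python sketch below is an illustration only: the `channel` list standing in for udt_send, the boolean `timer_running` flag standing in for a real timer, and the `max()` guard against stale ACKs are all assumptions that are not on the slide.

```python
class GBNSender:
    """Go-Back-N sender mirroring the FSM: base, nextseqnum, one timer."""

    def __init__(self, window_size, channel):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> data, kept until ACKed
        self.timer_running = False
        self.channel = channel      # hypothetical: list collecting (seq, data)

    def rdt_send(self, data):
        """Send if the window has room; return False to refuse the data."""
        if self.nextseqnum >= self.base + self.N:
            return False            # refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.channel.append((self.nextseqnum, data))
        if self.base == self.nextseqnum:
            self.timer_running = True   # start_timer for the oldest pkt
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is done."""
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        # max() keeps a stale duplicate ACK from moving base backwards
        # (an added guard; the slide's FSM assigns base directly).
        self.base = max(self.base, acknum + 1)
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.channel.append((seq, self.sndpkt[seq]))
```

With N = 3, a fourth `rdt_send` is refused until an ACK slides the window; a timeout then resends all outstanding packets, not just the oldest.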

GBN: receiver extended FSM

initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

state: Wait
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++
  default
    udt_send(sndpkt)

If expected packet received:
  send ACK and deliver packet upstairs
If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  need only remember expectedseqnum
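Because the GBN receiver keeps only `expectedseqnum`, it fits in a few lines. The sketch below is illustrative; the class name and the `is_corrupt` flag are assumptions, and the returned integer stands for the ACK number carried in sndpkt.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, only expectedseqnum is remembered."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def on_packet(self, seq, data, is_corrupt=False):
        """Return the ACK number sent in response to this packet."""
        if not is_corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # Always ACK the highest in-order seq # received so far
        # (0 before anything has arrived, matching the initial sndpkt).
        return self.expectedseqnum - 1
```

If packet 2 is lost and packets 3 and 4 arrive, both are discarded and each triggers a duplicate ACK 1.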

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet
sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

[Figure: Selective repeat: sender, receiver windows]

Selective repeat

sender
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)
  otherwise:
    ignore
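The receiver rules can be sketched as follows. To keep the example short it assumes unbounded sequence numbers (no wraparound), which sidesteps the window-size issue discussed with the dilemma; the class and method names are illustrative.

```python
class SRReceiver:
    """Selective-repeat receiver: per-packet ACKs plus out-of-order buffering."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}        # seq # -> data received out of order
        self.delivered = []

    def on_packet(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver any run of in-order packets and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n            # already delivered: re-ACK so sender can advance
        return None             # outside both ranges: ignore
```

The re-ACK branch matters: if the original ACK was lost, the sender's window cannot move until it hears about that packet again.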

[Figure: Selective repeat in action]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3
receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
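One way to explore the question is to check the worst case directly: the receiver has accepted a full window and advanced by N, but every ACK was lost, so the sender's window has not moved. If the two windows share a sequence number, a retransmission can be mistaken for new data. The function below is a sketch of that check; its name is illustrative.

```python
def windows_can_overlap(seq_space, window):
    """True if an old retransmission could be mistaken for a new packet.

    Worst case: the sender window still covers seq #s 0..N-1 while the
    receiver window has moved on to N..2N-1 (both taken modulo seq_space).
    """
    sender_window = {s % seq_space for s in range(0, window)}
    receiver_window = {s % seq_space for s in range(window, 2 * window)}
    return bool(sender_window & receiver_window)


print(windows_can_overlap(4, 3))   # True: the slide's example is ambiguous
print(windows_can_overlap(4, 2))   # False: window is half the seq # space
```

The check is consistent with the usual rule for selective repeat: the window size must be at most half the size of the sequence number space.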

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way Handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment
    This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server,
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to process

TCP only exists in the two end machines
  No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

[Figure: segment layout, 32 bits wide]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence number, acknowledgement number: counting by bytes of data (not segments!)

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does receiver handle out-of-order segments?
  A: TCP spec doesn't say; up to implementer

simple telnet scenario (time runs downward)
  Host A: user types 'C'
  A -> B: Seq=42, ACK=79, data = 'C'
  Host B: ACKs receipt of 'C', echoes back 'C'
  B -> A: Seq=79, ACK=43, data = 'C'
  Host A: ACKs receipt of echoed 'C'
  A -> B: Seq=43, ACK=80
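The numbers in the telnet scenario follow mechanically from counting bytes. The sketch below reproduces them; the `Endpoint` class and its fields are hypothetical names, and connection setup, loss and out-of-order arrival are ignored.

```python
class Endpoint:
    """Tracks the two counters that produce TCP seq # and ACK # values."""

    def __init__(self, initial_seq, peer_initial_seq):
        self.next_seq = initial_seq        # seq # of the next byte we send
        self.expected = peer_initial_seq   # seq # of next byte expected from peer

    def send(self, data=b""):
        """Build a (seq, ack, data) segment and advance our byte counter."""
        segment = (self.next_seq, self.expected, data)
        self.next_seq += len(data)
        return segment

    def receive(self, segment):
        """Accept an in-order segment and advance the expected byte number."""
        seq, _ack, data = segment
        if seq == self.expected:
            self.expected += len(data)


host_a = Endpoint(initial_seq=42, peer_initial_seq=79)
host_b = Endpoint(initial_seq=79, peer_initial_seq=42)

seg1 = host_a.send(b"C")      # user types 'C'
host_b.receive(seg1)
seg2 = host_b.send(b"C")      # B ACKs 'C' and echoes it back
host_a.receive(seg2)
seg3 = host_a.send()          # A ACKs the echo, no data

print(seg1, seg2, seg3)
```

The three segments carry (Seq=42, ACK=79), (Seq=79, ACK=43) and (Seq=43, ACK=80), matching the scenario.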

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

$$EstimatedRTT = (1 - \alpha) \cdot EstimatedRTT + \alpha \cdot SampleRTT$$

Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125
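A short Python sketch shows how the moving average damps a single outlier. The function name and the sample values are made up for illustration; only the formula and α = 0.125 come from the slide.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """One EWMA step: (1 - alpha) * EstimatedRTT + alpha * SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


estimated = 100.0                              # ms, assumed starting estimate
for sample in [100.0, 200.0, 100.0, 100.0]:    # one spike of 200 ms
    estimated = update_estimated_rtt(estimated, sample)
    print(f"SampleRTT = {sample:5.1f}  EstimatedRTT = {estimated:.2f}")
```

The 200 ms spike lifts the estimate by only 12.5 ms, and each later sample shrinks what is left of that effect by a factor of (1 - α).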

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; SampleRTT and EstimatedRTT plotted as RTT (milliseconds, 100 to 350) against time (seconds)]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

$$DevRTT = (1 - \beta) \cdot DevRTT + \beta \cdot |SampleRTT - EstimatedRTT|$$

  (typically, β = 0.25)

Then set timeout interval:

$$TimeoutInterval = EstimatedRTT + 4 \cdot DevRTT$$
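Putting the three formulas together gives a small estimator. This is a sketch under assumptions the slide does not state: the class name is illustrative, the deviation is updated with the previous EstimatedRTT before the estimate itself is updated, and the initial DevRTT is taken as half the first sample (both conventions follow RFC 6298, which is not cited on the slide).

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2    # assumed initial deviation

    @property
    def timeout_interval(self):
        """EstimatedRTT plus a safety margin of four deviations."""
        return self.estimated_rtt + 4 * self.dev_rtt

    def update(self, sample_rtt):
        """Fold in one SampleRTT (never one measured on a retransmission)."""
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval
```

A stable RTT lets DevRTT decay and the timeout tighten; a jittery RTT widens the margin instead.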

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
  Pipelined segments
  Cumulative acks
  TCP uses single retransmission timer
Retransmissions are triggered by:
  timeout events
  duplicate acks
Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events:

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
Ack rcvd:
  If it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
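The event loop above maps naturally onto a small class with one method per event. The Python below is a sketch, not a TCP implementation: the `send_to_ip` callback, the `unacked` dictionary, and the boolean timer flag are assumptions made so that the example can run on its own.

```python
class SimplifiedTcpSender:
    """Simplified TCP sender: cumulative ACKs and a single retransmission timer."""

    def __init__(self, initial_seq, send_to_ip):
        self.next_seq_num = initial_seq
        self.send_base = initial_seq
        self.unacked = {}               # seq # of first byte -> segment data
        self.timer_running = False
        self.send_to_ip = send_to_ip    # hypothetical callback(seq, data)

    def on_app_data(self, data):
        """Event: data received from the application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True   # start timer
        self.send_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout. Resend only the oldest unACKed segment."""
        if self.unacked:
            seq = min(self.unacked)
            self.send_to_ip(seq, self.unacked[seq])
            self.timer_running = True   # restart timer

    def on_ack(self, y):
        """Event: ACK received with ACK field value y (next byte expected)."""
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte lies below y.
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in done:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
```

Unlike Go-Back-N, a timeout here retransmits a single segment, and one ACK with a large enough y acknowledges several segments at once.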

TCP: retransmission scenarios

lost ACK scenario (Host A sends, Host B receives; time runs downward)
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100 (lost, X)
  Seq=92 timeout at A
  A -> B: Seq=92, 8 bytes data (retransmission)
  B -> A: ACK=100
  SendBase = 100

premature timeout (Host A sends, Host B receives; time runs downward)
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100
  B -> A: ACK=120
  Seq=92 timeout at A (before the ACKs arrive)
  A -> B: Seq=92, 8 bytes data (retransmission)
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  B -> A: ACK=120
  SendBase = 120

TCP retransmission scenarios (more)

[Figure: cumulative ACK scenario. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so SendBase = 120 and nothing is retransmitted.]

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver, followed by TCP receiver action:

• Event: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
  Action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

• Event: arrival of in-order segment with expected seq #; one other segment has ACK pending
  Action: immediately send single cumulative ACK, ACKing both in-order segments.

• Event: arrival of out-of-order segment with higher-than-expected seq #; gap detected
  Action: immediately send duplicate ACK, indicating seq # of next expected byte.

• Event: arrival of segment that partially or completely fills gap
  Action: immediately send ACK, provided that segment starts at lower end of gap.

More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of congestion control
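As a minimal sketch of the doubling rule, the helper below lists the timeout interval in force after each consecutive timeout. The function name and the `base_interval` parameter (standing for the value derived from EstimatedRTT and DevRTT) are assumptions made for illustration.

```python
def backoff_intervals(base_interval, consecutive_timeouts):
    """Timeout interval before any timeout and after each consecutive timeout."""
    # Each timeout doubles the previous interval, giving exponential growth.
    return [base_interval * 2 ** i for i in range(consecutive_timeouts + 1)]
```

For a base interval of 1 second and three consecutive timeouts, the intervals are 1, 2, 4 and 8 seconds; new data from above or a received ACK would reset the interval to the base value.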

Fast Retransmit

• Time-out period often relatively long:
  • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
  • Sender often sends many segments back-to-back
  • If segment is lost, there will likely be many duplicate ACKs
• If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  • fast retransmit: resend segment before timer expires

Fast retransmit algorithm

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
      else {   /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
          resend segment with sequence number y   /* fast retransmit */
        }
      }
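The duplicate-ACK branch can be sketched in Python as follows. The class name FastRetransmitTracker and the convention of returning the sequence number to resend are hypothetical choices for this illustration; timer handling is left out.

```python
class FastRetransmitTracker:
    """Counts duplicate ACKs and signals a fast retransmit on the third one."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_count = {}   # ACK value y -> number of duplicate ACKs seen for y

    def on_ack(self, y):
        """Return the sequence number to resend now, or None."""
        if y > self.send_base:
            # New data is ACKed: advance SendBase and forget old duplicate counts.
            self.send_base = y
            self.dup_count.clear()
            return None
        # A duplicate ACK for an already ACKed segment.
        self.dup_count[y] = self.dup_count.get(y, 0) + 1
        if self.dup_count[y] == 3:
            return y   # fast retransmit: resend the segment starting at y
        return None
```

With SendBase = 100, the first two duplicate ACKs for 100 return None and the third returns 100, the segment to resend before the timer expires.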

TCP: GBN or Selective Repeat?

• Basic TCP looks a lot like GBN
• Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
• This looks a lot like Selective Repeat
• TCP is a hybrid


TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down transmission rate to accommodate receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

• flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout, 32 bits wide]
• source port #, dest port #
• sequence number, acknowledgement number: counting by bytes of data (not segments)
• head len, not used, flag bits U A P R S F, Receive window
• checksum, Urg data pointer
• Options (variable length)
• application data (variable length)

Field notes:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)
• spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn't overflow
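The window computation and the sender-side check can be sketched as two small Python functions. The function names and the `nbytes` parameter are assumptions for illustration; byte counters are treated as plain integers, ignoring sequence-number wraparound.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised by the receiver."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    unacked_after_send = (last_byte_sent + nbytes) - last_byte_acked
    return unacked_after_send <= advertised_window
```

For example, a 4096-byte buffer holding 2000 unread bytes advertises RcvWindow = 2096; a sender with 1000 unACKed bytes may send 1000 more, but not 1200.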

Technical Issue

• Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.

Note on UDP

• UDP has no flow control
• UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
• initialize TCP variables:
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals that the client is synchronized

[Figure: three-way handshake between client and server]
• client to server: Connection request (SYN=1, seq=client_isn)
• server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
• client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
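The sequence and acknowledgement numbers exchanged in the three steps can be sketched as below. The dictionary layout and the function name are hypothetical, and the sketch ignores the modulo-2^32 wraparound of real TCP sequence numbers.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three control segments of the handshake as dictionaries."""
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = {"from": "client", "SYN": 1, "seq": client_isn, "ack": None}
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1.
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": syn["seq"] + 1}
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1.
    ack = {"from": "client", "SYN": 0, "seq": client_isn + 1,
           "ack": synack["seq"] + 1}
    return [syn, synack, ack]
```

Calling three_way_handshake(1000, 5000) yields a SYNACK with ack=1001 and a final ACK with seq=1001 and ack=5001.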

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies ACK, then sends FIN (close); client replies ACK and enters timed wait; then closed]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
• Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
• Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

[Figure: client sends FIN (closing); server replies ACK, then sends FIN (closing); client replies ACK and enters timed wait; server closed on receiving the ACK; client closed after timed wait]

TCP Connection Management (cont.)

[Figures: example TCP server lifecycle; example TCP client lifecycle]


                            A few special cases

                            Have not discussed what happens if both client and server decide to close down connection at same time

                            It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• manifestations:
  • lost packets (buffer overflow at routers)
  • long delays (queuing in router buffers)
• a top-10 problem!

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

Causes/costs of congestion: scenario 2 (cont.)

Notation: λin is the rate of original data, λ'in the offered load (original data plus retransmissions), λout the goodput.

• (a), (b) & (c): always λin = λout (goodput)
• (a) "Magic" transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: λ'in > λout
• (c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
• (b) and (c): more work (retrans) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots (a), (b), (c) of λout versus λ'in]

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3 (cont.)

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded":
  • sender should use available bandwidth
• if sender's path congested:
  • sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next slide

Case study: ATM ABR congestion control (cont.)

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion

[Figure: window of Congwin segments]

• w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events

TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. A sends one segment in the first RTT, two segments in the second, four segments in the third.]
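The slow-start numbers can be checked with a short Python sketch. The function names are hypothetical, and the model assumes one ACK per segment and no loss, so that adding 1 MSS per ACK doubles CongWin every RTT.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Sending rate with CongWin = 1 MSS: one segment per RTT, in bits/sec."""
    return mss_bytes * 8 / rtt_seconds


def slow_start_windows(num_rtts, mss_bytes):
    """CongWin (bytes) at the start of each RTT during loss-free slow start."""
    cong_win = mss_bytes
    history = []
    for _ in range(num_rtts):
        history.append(cong_win)
        acks = cong_win // mss_bytes      # one ACK per segment sent this RTT
        cong_win += acks * mss_bytes      # +1 MSS per ACK doubles the window
    return history
```

With MSS = 500 bytes and RTT = 200 msec the initial rate is 20 kbps, and the window grows 500, 1000, 2000, 4000 bytes over the first four RTTs.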

So far
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable: threshold
• threshold is initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

• The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

• Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start, with CongWin = 1 MSS, after a loss event.

• TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                            The Big Picture

TCP sender congestion control

Each entry lists Event, State, TCP Sender Action and Commentary.

• Event: ACK receipt for previously unACKed data
  State: Slow Start (SS)
  Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
  Commentary: resulting in a doubling of CongWin every RTT

• Event: ACK receipt for previously unACKed data
  State: Congestion Avoidance (CA)
  Action: CongWin = CongWin + MSS * (MSS/CongWin)
  Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

• Event: loss event detected by triple duplicate ACK
  State: SS or CA
  Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
  Commentary: fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

• Event: timeout
  State: SS or CA
  Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
  Commentary: enter slow start

• Event: duplicate ACK
  State: SS or CA
  Action: increment duplicate ACK count for segment being ACKed
  Commentary: CongWin and Threshold not changed
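The event table maps naturally onto a small state machine. The Python sketch below follows the table row by row; the class and method names are hypothetical, window sizes are kept as floats in bytes, and retransmission itself is not modeled.

```python
SLOW_START, CONG_AVOID = "SS", "CA"


class RenoSender:
    """Toy TCP Reno congestion-control state, following the event table."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = float(mss)
        self.threshold = float(threshold)
        self.state = SLOW_START
        self.dup_acks = 0

    def on_new_ack(self):
        # ACK receipt for previously unACKed data
        self.dup_acks = 0
        if self.state == SLOW_START:
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = CONG_AVOID
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        # Duplicate ACK: only count it, unless it is the third in a row.
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            # CongWin will not drop below 1 MSS
            self.cong_win = max(self.threshold, float(self.mss))
            self.state = CONG_AVOID

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = float(self.mss)
        self.state = SLOW_START
        self.dup_acks = 0
```

Starting with MSS = 1000 and Threshold = 4000, four new ACKs bring CongWin to 5000 and switch the sender to congestion avoidance; a triple duplicate ACK then halves the window without returning to slow start, while a timeout resets it to 1 MSS.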

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
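The 0.75 factor is the mean of a window that climbs linearly from W/2 to W between losses. A quick numeric check in Python, assuming an even W measured in segments and a window that grows by exactly one segment per RTT (the function name is hypothetical):

```python
def average_sawtooth_throughput(w_segments, rtt_seconds):
    """Mean throughput (segments/sec) over one sawtooth from W/2 up to W."""
    half = w_segments // 2
    windows = list(range(half, w_segments + 1))   # W/2, W/2 + 1, ..., W
    mean_window = sum(windows) / len(windows)
    return mean_window / rtt_seconds
```

For W = 100 segments and RTT = 1 second the result is 75 segments per second, that is, 0.75 W/RTT.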

TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

  $$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This requires L = 2·10^-10. Wow!
• New versions of TCP for high-speed needed!
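Both numbers in this example can be reproduced with a few lines of Python. The function names are hypothetical; the loss rate is obtained by solving throughput = 1.22 · MSS / (RTT · √L) for L.

```python
def required_window_segments(throughput_bps, rtt_seconds, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_seconds / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_seconds, mss_bytes):
    """Loss rate L that yields the target throughput under the 1.22 formula."""
    sqrt_l = 1.22 * mss_bytes * 8 / (rtt_seconds * throughput_bps)
    return sqrt_l ** 2
```

For 1500-byte segments, a 100 ms RTT and a 10 Gbps target, the window comes out at about 83,333 segments and the loss rate at roughly 2·10^-10.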

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
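A small simulation illustrates why this converges to the equal share. It is a sketch under simplifying assumptions that are not in the slides: both sessions see the same RTT, increase by one unit per step, and both detect loss in the same step whenever their combined rate exceeds the capacity.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Simulate two synchronized AIMD flows sharing one bottleneck."""
    for _ in range(steps):
        # Additive increase: both rates grow by the same amount,
        # so the difference between them is unchanged.
        x1 += 1.0
        x2 += 1.0
        if x1 + x2 > capacity:
            # Multiplicative decrease: both rates are halved,
            # which also halves the difference between them.
            x1 /= 2
            x2 /= 2
    return x1, x2
```

Starting from very unequal rates such as 10 and 80 on a link of capacity 100, the gap is halved at every loss event, so after many rounds the two rates are practically equal.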

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections:
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets R/2 (more precisely, 11 of 20 connections)
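The arithmetic behind this example, assuming every TCP connection on the link receives an equal share of R, can be written as a one-line Python helper (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)
```

With 9 existing connections, one new connection gets 1/10 of R, while 11 new connections get 11/20 of R, slightly more than R/2.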

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server, of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. W·S/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. W·S/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - W·S/R]

Fixed congestion window (1)

First case: W·S/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   latency = 2RTT + O/R

Fixed congestion window (2)

Second case: W·S/R < RTT + S/R: wait for ACK after sending window's worth of data

   latency = 2RTT + O/R + (K-1)[S/R + RTT - W·S/R]
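The two fixed-window cases can be combined into one Python function. This is a sketch under an assumption the slides do not state here: K is taken to be the number of windows of W segments needed to cover the object, K = ceil(O / (W·S)). The function name is hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits, MSS of S bits, link rate R, window W."""
    base = 2 * RTT + O / R
    # Stall after each window: time until the first ACK returns,
    # minus the time spent transmitting the whole window.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        return base                      # case 1: sender never has to wait
    K = math.ceil(O / (W * S))           # assumed: windows covering the object
    return base + (K - 1) * stall        # case 2: one stall between windows
```

For O = 8000 bits, S = 1000 bits, R = 1000 bps and RTT = 2 s, a window of W = 2 stalls 1 s after each of the first three windows (latency 15 s), whereas W = 4 never stalls (latency 12 s).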

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K - 1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (3)

Time from when server starts to send segment until server receives acknowledgement:

$$\frac{S}{R} + RTT$$

Time to transmit the kth window:

$$2^{k-1}\frac{S}{R}$$

Idle time after the kth window:

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]


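TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.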
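K, Q and P can also be found by direct iteration rather than with logarithms. The sketch below is an assumption-laden illustration: the function names are invented here, and Q is taken to be the number of windows whose idle time S/R + RTT - 2^(k-1) S/R is strictly positive, which is one reasonable reading of "number of idles for an infinite-size object".

```python
def windows_to_cover(O, S):
    """K: smallest k with (2^k - 1) * S >= O, i.e. ceil(log2(O/S + 1))."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def idles_if_infinite(S, R, RTT):
    """Q: count windows k = 1, 2, ... while the idle time after window k is positive."""
    q = 0
    while S / R + RTT - 2 ** q * S / R > 0:  # 2**q is 2^(k-1) for k = q + 1
        q += 1
    return q


def server_idles(O, S, R, RTT):
    """P = min{Q, K - 1}."""
    return min(idles_if_infinite(S, R, RTT), windows_to_cover(O, S) - 1)


# Hypothetical values chosen to give O/S = 15, K = 4, Q = 2, P = 2.
print(windows_to_cover(15, 1), idles_if_infinite(1, 1, 2), server_idles(15, 1, 1, 2))
```

The integer loop for K avoids floating-point rounding in the logarithm when O/S + 1 is an exact power of two.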


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
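The three response-time expressions can be compared side by side. This is a rough Python sketch under stated assumptions: the function name is invented, and the "sum of idle times" term is passed in as a single caller-supplied number (default 0) even though in practice it differs between the three schemes.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times (seconds) for a page of one base file plus M images.

    O: object size (bits), R: link rate (bps), RTT: round-trip time (s),
    X: number of parallel connections (M/X assumed to be an integer),
    idle: placeholder for the sum of slow-start idle times.
    """
    transmit = (M + 1) * O / R
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }


# Made-up numbers (O/R = 4 s, RTT = 1 s), chosen only to keep the arithmetic exact.
times = http_response_times(O=8, R=2, RTT=1, M=10, X=5)
print(times)
```

All three share the (M+1)O/R transmission term, so the schemes differ only in how many round trips they pay for.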


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (axis 0 to 20) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (axis 0 to 70) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

• principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
• instantiation and implementation in the Internet
  - UDP
  - TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



UDP: User Datagram Protocol [RFC 768]

• "no frills," "bare bones" Internet transport protocol
• "best effort" service, UDP segments may be:
  - lost
  - delivered out of order to app
• connectionless:
  - no handshaking between UDP sender, receiver
  - each UDP segment handled independently of others

Why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small segment header (8 bytes)
• no congestion control: UDP can blast away as fast as desired
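To make "connectionless" concrete, here is a small Python sketch using the standard `socket` module on the loopback interface. The payload, the 2-second timeout and the use of an OS-chosen port are arbitrary choices for illustration, and the example assumes loopback UDP is permitted on the machine running it.

```python
import socket

# Receiver: bind to a free loopback port. No listen()/accept() step exists for UDP.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))       # port 0 lets the OS pick a free port
receiver.settimeout(2.0)              # UDP may lose datagrams, so never block forever
address = receiver.getsockname()

# Sender: no connect() handshake; each sendto() is an independent datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", address)

data, peer = receiver.recvfrom(2048)  # returns the payload and the sender's address
print(data, peer)

sender.close()
receiver.close()
```

Neither side keeps per-peer connection state: the receiver learns who sent the datagram only from the address returned by `recvfrom`.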


UDP: more

• often used for streaming multimedia apps
  - loss tolerant
  - rate sensitive
• other UDP uses (why?):
  - DNS: small delay
  - SNMP: stressful conditions
• reliable transfer over UDP: add reliability at application layer
  - application-specific error recovery

UDP segment format (each row is 32 bits wide):

  | source port # | dest port # |
  | length        | checksum    |
  | application data (message)  |

length: length, in bytes, of UDP segment, including header
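The four 16-bit header fields can be packed with Python's `struct` module. This is only a sketch: the helper name and port numbers are made up, and the checksum field is left at 0 rather than computed.

```python
import struct

# Network byte order ("!"), four unsigned 16-bit fields:
# source port, dest port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Return header + payload; length counts the 8-byte header plus the data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


segment = build_udp_segment(5000, 53, b"query")
src, dst, length, checksum = UDP_HEADER.unpack(segment[:UDP_HEADER.size])
print(UDP_HEADER.size, src, dst, length, checksum)
```

The header size comes out as 8 bytes, matching the "small segment header" figure quoted for UDP.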


UDP checksum

Goal: detect "errors" (e.g., flipped bits) in transmitted segment

Sender:
• treat segment contents as sequence of 16-bit integers
• checksum: addition (1's complement sum) of segment contents
• sender puts checksum value into UDP checksum field

Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  - NO: error detected
  - YES: no error detected. But maybe errors nonetheless? More later ...
• receiver may choose to discard segment or send a warning to app in case of error
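A 16-bit one's complement checksum can be sketched in a few lines of Python. Two caveats: the value stored in the field is conventionally the complement of the one's complement sum, which is what this sketch computes, and real UDP also covers a pseudo-header of IP addresses that is omitted here. The function names and the sample bytes are illustrative.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Add the data as 16-bit big-endian words, wrapping any carry back in."""
    if len(data) % 2:
        data += b"\x00"                               # pad odd-length data
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)      # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """Sender side: complement of the one's complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def looks_intact(data_with_checksum: bytes) -> bool:
    """Receiver side: data plus checksum should sum to all ones."""
    return ones_complement_sum(data_with_checksum) == 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)
print(hex(checksum), looks_intact(payload + struct.pack("!H", checksum)))
```

A passing check only means no error was detected: two flips that cancel in the sum would go unnoticed, which is the "maybe errors nonetheless" caveat.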


Principles of Reliable data transfer

• important in app., transport, link layers
• top-10 list of important networking topics!
• characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)


Reliable data transfer: getting started

send side:
• rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

receive side:
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer


Reliable data transfer: getting started

We'll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
  - but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

FSM notation: an arrow from state 1 to state 2 is labeled with the event causing the state transition and, below it, the actions taken on the state transition (event / actions).

state: when in this "state", next state is uniquely determined by next event


Incremental Improvements

• rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission
• rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK
• rdt2.1: deals with corrupted ACKs/NAKs
• rdt2.2: like rdt2.1 but does not need NAKs
• rdt3.0: allows packets to be lost

Rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
  - no bit errors
  - no loss of packets
• separate FSMs for sender, receiver:
  - sender sends data into underlying channel
  - receiver reads data from underlying channel

sender (single state: Wait for call from above)
  rdt_send(data)
    packet = make_pkt(data)
    udt_send(packet)

receiver (single state: Wait for call from below)
  rdt_rcv(packet)
    extract(packet, data)
    deliver_data(data)


Rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
  - recall: UDP checksum to detect bit errors
• the question: how to recover from errors:
  - acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  - sender retransmits pkt on receipt of NAK
  - human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
  - error detection
  - receiver feedback: control msgs (ACK, NAK) rcvr->sender


rdt2.0: FSM specification

sender
  state: Wait for call from above
    rdt_send(data)
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      -> Wait for ACK or NAK
  state: Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      udt_send(sndpkt)
      -> Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      Λ
      -> Wait for call from above

receiver
  state: Wait for call from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      extract(rcvpkt, data)
      deliver_data(data)
      udt_send(ACK)


rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs. With no errors the transitions taken are:]
  1. sender, Wait for call from above: rdt_send(data) / sndpkt = make_pkt(data, checksum), udt_send(sndpkt)
  2. receiver, Wait for call from below: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) / extract(rcvpkt, data), deliver_data(data), udt_send(ACK)
  3. sender, Wait for ACK or NAK: rdt_rcv(rcvpkt) && isACK(rcvpkt) / Λ, back to Wait for call from above


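rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs. When the data packet is corrupted the transitions taken are:]
  1. sender, Wait for call from above: rdt_send(data) / sndpkt = make_pkt(data, checksum), udt_send(sndpkt)
  2. receiver, Wait for call from below: rdt_rcv(rcvpkt) && corrupt(rcvpkt) / udt_send(NAK)
  3. sender, Wait for ACK or NAK: rdt_rcv(rcvpkt) && isNAK(rcvpkt) / udt_send(sndpkt)
  4. receiver, Wait for call from below: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) / extract(rcvpkt, data), deliver_data(data), udt_send(ACK)
  5. sender, Wait for ACK or NAK: rdt_rcv(rcvpkt) && isACK(rcvpkt) / Λ, back to Wait for call from above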
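The rdt2.0 behaviour can be imitated with a tiny deterministic simulation. This Python sketch is not the slide's FSM verbatim: the toy checksum (byte sum), the `corrupt_on` parameter that says which transmissions the channel damages, and the assumption that ACKs and NAKs always arrive intact are all choices made here for illustration.

```python
def make_pkt(data: bytes):
    """Packet = (data, checksum); the byte-sum checksum is a toy stand-in."""
    return data, sum(data) & 0xFFFF


def corrupt(pkt) -> bool:
    data, checksum = pkt
    return sum(data) & 0xFFFF != checksum


def rdt20_transfer(messages, corrupt_on=frozenset()):
    """Stop-and-wait with ACK/NAK over a channel that may flip bits in data packets.

    corrupt_on: 0-based indices of data transmissions that get damaged.
    Messages are assumed to be non-empty. Returns (delivered, transmissions).
    """
    delivered, sent = [], 0
    for data in messages:                     # sender: rdt_send(data)
        sndpkt = make_pkt(data)
        while True:                           # sender: Wait for ACK or NAK
            rcvpkt = sndpkt
            if sent in corrupt_on:            # channel flips one bit of the first byte
                rcvpkt = (bytes([data[0] ^ 1]) + data[1:], sndpkt[1])
            sent += 1
            if corrupt(rcvpkt):               # receiver replies NAK
                continue                      # sender: isNAK, so udt_send(sndpkt) again
            delivered.append(rcvpkt[0])       # receiver: deliver_data, reply ACK
            break                             # sender: isACK, wait for next call
    return delivered, sent


print(rdt20_transfer([b"a", b"b"], corrupt_on={1}))  # ([b'a', b'b'], 3)
```

Three transmissions deliver two messages because the second packet's first copy is damaged and retransmitted after the NAK.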


rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
• retransmit, but this might cause retransmission of correctly received pkt! Receiver won't know about duplication!

Handling duplicates:
• sender adds sequence number (0,1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt. Duplicate packet is one with same sequence # as previous packet

stop and wait: sender sends one packet, then waits for receiver response


Sender: whenever sender receives a control message it sends a packet to receiver
• a valid ACK: sends next packet (if one exists) with new sequence #
• a NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
• if received packet is corrupt: send NAK
• if received packet is valid and has a different sequence # from the prev packet: send ACK and deliver new data up
• if received packet is valid and has the same sequence # as the prev packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #
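The sender and receiver rules above can be traced with a short Python simulation. It is a sketch under simplifying assumptions: corruption is modelled by two index sets (`bad_data`, `bad_feedback`) rather than by real checksums, and the function name is invented.

```python
def rdt21_transfer(messages, bad_data=frozenset(), bad_feedback=frozenset()):
    """Alternating-bit stop-and-wait with ACK/NAK that may themselves be garbled.

    bad_data / bad_feedback: 0-based indices of data packets / ACK-NAK replies
    that arrive corrupted. Returns (delivered, data transmissions).
    """
    delivered = []
    seq = 0            # sender state: sequence # of the current packet
    expected = 0       # receiver state: sequence # it is waiting for
    data_sent = feedback_sent = 0
    for data in messages:
        while True:
            data_ok = data_sent not in bad_data
            data_sent += 1
            # Receiver
            if not data_ok:
                reply = "NAK"
            else:
                if seq == expected:          # new packet: deliver it up
                    delivered.append(data)
                    expected ^= 1
                reply = "ACK"                # duplicates are ACKed, not delivered
            reply_ok = feedback_sent not in bad_feedback
            feedback_sent += 1
            # Sender
            if reply_ok and reply == "ACK":
                seq ^= 1                     # next packet gets the other sequence #
                break
            # NAK or garbled reply: loop and resend the current packet
    return delivered, data_sent


print(rdt21_transfer([b"x", b"y"], bad_feedback={0}))  # ([b'x', b'y'], 3)
```

In the printed run the first ACK is garbled, so `b"x"` is sent twice, but the receiver sees the repeated sequence number and delivers it only once.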


rdt2.1: sender, handles garbled ACK/NAKs

state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 0 from above


rdt2.1: receiver, handles garbled ACK/NAKs

state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
state: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)


rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  - state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is duplicate
  - state indicates whether 0 or 1 is expected pkt seq #
• note: receiver cannot know if its last ACK/NAK received OK at sender


rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
  - receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s are included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt
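The NAK-free rule can be written as two small decision functions. This Python sketch uses invented names and one-bit sequence numbers; it also assumes a previous packet with the other sequence number has already been received OK, so there is always a "last pkt received OK" to re-acknowledge.

```python
def rdt22_receiver_step(expected, seq, is_corrupt):
    """One receiver step. Returns (deliver, ack_seq, next_expected)."""
    if not is_corrupt and seq == expected:
        return True, seq, expected ^ 1       # new packet: deliver, ACK its own seq #
    return False, expected ^ 1, expected     # otherwise re-ACK the last pkt received OK


def rdt22_sender_must_resend(current_seq, ack_seq, ack_corrupt):
    """A garbled ACK, or an ACK for the other seq # (duplicate ACK), acts like a NAK."""
    return ack_corrupt or ack_seq != current_seq


print(rdt22_receiver_step(expected=0, seq=0, is_corrupt=True))   # (False, 1, 0)
print(rdt22_sender_must_resend(current_seq=0, ack_seq=1, ack_corrupt=False))  # True
```

In the printed case the receiver is waiting for packet 0, gets a corrupted packet and answers with ACK 1; the sender, still waiting for ACK 0, treats that as a request to retransmit.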


rdt2.2: sender, receiver fragments

sender FSM fragment
  state: Wait for call 0 from above
    rdt_send(data)
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      -> Wait for ACK 0
  state: Wait for ACK 0
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
      Λ

receiver FSM fragment
  transition into Wait for 0 from below:
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum)
      udt_send(sndpkt)
  state: Wait for 0 from below
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      udt_send(sndpkt)


rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until certain data or ACK lost, then retransmits
• yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
• if pkt (or ACK) just delayed (not lost):
  - retransmission will be duplicate, but use of seq. #'s already handles this
  - receiver must specify seq # of pkt being ACKed
• requires countdown timer


rdt3.0 sender

state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK0
  rdt_rcv(rcvpkt)
    Λ
state: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    stop_timer
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK1
  rdt_rcv(rcvpkt)
    Λ
state: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    stop_timer
    -> Wait for call 0 from above
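Timeout-driven retransmission can be illustrated with a deterministic Python simulation. This is a simplified sketch, not the FSM itself: a "timeout" is modelled as any transmission for which no ACK comes back, losses are scripted through the hypothetical `lost_data` and `lost_acks` index sets, and corruption and premature timeouts are not modelled.

```python
def rdt30_transfer(messages, lost_data=frozenset(), lost_acks=frozenset()):
    """Alternating-bit stop-and-wait over a channel that can drop data packets and ACKs.

    lost_data / lost_acks: 0-based indices of data packets / ACKs the channel drops.
    Returns (delivered, data transmissions).
    """
    delivered = []
    seq = expected = 0
    data_sent = acks_sent = 0
    for data in messages:
        acked = False
        while not acked:                         # sender: Wait for ACK<seq>
            arrived = data_sent not in lost_data
            data_sent += 1                       # udt_send(sndpkt); start_timer
            if not arrived:
                continue                         # timeout: retransmit
            if seq == expected:                  # receiver: new packet, deliver it
                delivered.append(data)
                expected ^= 1
            # receiver ACKs the sequence # it just saw (duplicates included)
            acked = acks_sent not in lost_acks   # a lost ACK also ends in a timeout
            acks_sent += 1
        seq ^= 1                                 # stop_timer; wait for the next call
    return delivered, data_sent


print(rdt30_transfer([b"p0", b"p1"], lost_data={0}, lost_acks={1}))
# ([b'p0', b'p1'], 4)
```

Four transmissions deliver two packets: one retransmission covers the lost data packet and one covers the lost ACK, and the sequence number lets the receiver discard the resulting duplicate.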

rdt3.0 in action

[Two figure slides; the diagrams did not survive in this transcript.]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}$

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$    (times in msec)

U_sender: utilization, the fraction of time the sender is busy sending
1KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
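The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 1 KB packet is taken as 8000 bits, as the slide's 8 kb/pkt figure suggests.

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps          # L/R, in seconds
    return t_transmit / (rtt_s + t_transmit)


def stop_and_wait_throughput(packet_bits, rate_bps, rtt_s):
    """Effective throughput in bits/sec: one packet every RTT + L/R."""
    return packet_bits / (rtt_s + packet_bits / rate_bps)


# Slide example: 8000-bit packet, 1 Gbps link, RTT = 2 * 15 ms = 30 ms
u = stop_and_wait_utilization(8000, 1e9, 0.030)
thruput = stop_and_wait_throughput(8000, 1e9, 0.030)
print(f"U_sender = {u:.5f}")
print(f"throughput = {thruput / 8 / 1000:.1f} kB/sec")
```

The sender is idle for almost the whole round trip, which is why the throughput is so far below the link rate.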

rdt3.0: stop-and-wait operation [timing diagram: sender, receiver]

  t = 0: first packet bit transmitted
  t = L/R: last packet bit transmitted
  first packet bit arrives
  last packet bit arrives, send ACK
  t = RTT + L/R: ACK arrives, send next packet

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$

Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  Go-Back-N protocols
  selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization [timing diagram: sender, receiver, 3 packets in flight]

  t = 0: first packet bit transmitted
  t = L/R: last bit transmitted
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  t = RTT + L/R: ACK arrives, send next packet

$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$

Increase utilization by a factor of 3!
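A natural follow-up is how many packets must be in flight before the sender is never idle. The sketch below computes that, assuming the same link parameters as the example (8000-bit packets, 1 Gbps, 30 ms RTT). The function names are hypothetical, and exact fractions are used to avoid floating-point rounding at the boundary.

```python
from fractions import Fraction
import math


def pipelined_utilization(n_in_flight, packet_bits, rate_bps, rtt):
    """U_sender = min(1, n * (L/R) / (RTT + L/R)), computed exactly."""
    t_transmit = Fraction(packet_bits, rate_bps)
    return min(Fraction(1), n_in_flight * t_transmit / (rtt + t_transmit))


def window_to_fill_pipe(packet_bits, rate_bps, rtt):
    """Smallest number of in-flight packets that keeps the sender always busy."""
    t_transmit = Fraction(packet_bits, rate_bps)
    return math.ceil((rtt + t_transmit) / t_transmit)


RTT = Fraction(30, 1000)                      # 30 ms, in seconds
u3 = pipelined_utilization(3, 8000, 10**9, RTT)
n_full = window_to_fill_pipe(8000, 10**9, RTT)
print(float(u3), n_full)
```

With three packets in flight the utilization is still under 0.1%; under these assumptions it takes a window of several thousand packets to fill the pipe.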


Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
    may receive duplicate ACKs (see receiver)
  Only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol

GBN Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend

GBN sender extended FSM

Initially:
    base = 1
    nextseqnum = 1

State: Wait
  rdt_send(data)
      if (nextseqnum < base+N) {
          sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
          udt_send(sndpkt[nextseqnum])
          if (base == nextseqnum)
              start_timer
          nextseqnum++
      }
      else
          refuse_data(data)
  timeout
      start_timer
      udt_send(sndpkt[base])
      udt_send(sndpkt[base+1])
      ...
      udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      base = getacknum(rcvpkt)+1
      if (base == nextseqnum)
          stop_timer
      else
          start_timer
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      Λ (no action)
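The sender FSM can be sketched as a small Python class. This is an illustration only, not a definitive implementation: the timer is reduced to a boolean, udt_send() is replaced by an in-memory log (`sent`), and unlike the FSM the sketch never lets `base` move backwards on an old ACK. All of those choices are assumptions made for this example.

```python
class GBNSender:
    """Go-Back-N sender sketch: one timer, cumulative ACKs, window of N."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1              # oldest unacknowledged seq #
        self.nextseqnum = 1        # next seq # to be used
        self.sndpkt = {}           # seq # -> packet kept for retransmission
        self.timer_running = False
        self.sent = []             # log standing in for udt_send()

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum >= self.base + self.N:
            return False           # window full: refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.sent.append(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True    # first outstanding pkt starts timer
        self.nextseqnum += 1
        return True

    def timeout(self):
        """Restart the timer and resend every sent-but-unACKed packet."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(self.sndpkt[seq])

    def rdt_rcv_ack(self, acknum):
        """Handle an uncorrupted cumulative ACK for seq # acknum."""
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)   # assumption: ignore old ACKs
        # stop the timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum
```

For example, with N = 3 a fourth rdt_send() is refused until ACK(1) slides the window, and a timeout then resends packets 2 through nextseqnum-1.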

GBN receiver extended FSM

Initially:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

State: Wait
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(expectedseqnum, ACK, chksum)
      udt_send(sndpkt)
      expectedseqnum++
  default
      udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #
  may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  need only remember expectedseqnum
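A minimal sketch of this receiver behaviour in Python follows. The class and method names are hypothetical, corruption is modelled as a boolean flag rather than a checksum, and the return value stands in for the ACK that would be sent.

```python
class GBNReceiver:
    """Go-Back-N receiver sketch: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0          # matches the initial make_pkt(0, ACK, chksum)
        self.delivered = []        # data passed up to the application

    def rdt_rcv(self, seqnum, data, corrupt=False):
        """Process one arriving packet and return the seq # that gets ACKed."""
        if not corrupt and seqnum == self.expectedseqnum:
            self.delivered.append(data)
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # default case: discard the packet and re-ACK the highest in-order seq #
        return self.last_ack
```

The only state carried between packets is expectedseqnum (plus the last ACK sent), which is what makes the GBN receiver so simple.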

GBN in action [figure]

GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data

Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets


                              Selective Repeat

                              receiver individually acknowledges all correctly received pkts

                              buffers pkts as needed for eventual in-order delivery to upper layer

                              sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows [figure]

Selective repeat

sender

data from above:
  if next available seq # in window, send pkt

timeout(n):
  resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N]:
  mark pkt n as received
  if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)

otherwise:
  ignore
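The receiver rules can be sketched in Python as follows. To keep the example short it assumes sequence numbers that grow without wrapping around, which a real protocol with a k-bit field cannot do. The class name and the `delivered` list are hypothetical.

```python
class SRReceiver:
    """Selective-repeat receiver sketch (no seq # wraparound)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0           # lowest seq # not yet delivered
        self.buffer = {}           # out-of-order pkts: seq # -> data
        self.delivered = []        # data handed to the upper layer, in order

    def on_packet(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # deliver the in-order run starting at rcvbase, advancing the window
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n               # already delivered: re-ACK
        return None                # otherwise: ignore
```

The re-ACK case matters because the sender's window only advances once it has seen an ACK for its base packet, even if the receiver delivered that packet long ago.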

Selective repeat in action [figure]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
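The usual textbook answer is that the window size must be at most half the size of the sequence-number space. The sketch below illustrates why under a simple worst-case model (an assumption of this example): the receiver has accepted a full window and moved on, while the sender, having seen no ACKs, may still retransmit the whole old window. If the two windows share a sequence number, the receiver cannot tell a retransmission from new data.

```python
def windows_overlap(seq_space, window):
    """True if the old sender window and the new receiver window share a seq #."""
    old_window = {n % seq_space for n in range(0, window)}
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


print(windows_overlap(4, 3))   # the slide's example: seq #'s 0..3, window 3
print(windows_overlap(4, 2))   # window reduced to half the seq # space
```

In this model the overlap disappears exactly when 2 * window <= seq_space.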


TCP: Overview    RFCs: 793, 1122, 1323, 2018, 2581

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers
  [figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way Handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again, no payload)
  Client responds with third special segment. This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables and
  (iii) a socket connection to process

TCP only exists in the two end machines
  No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
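To make the layout concrete, here is a sketch that unpacks the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name and the returned dictionary keys are made up for illustration. The sketch does not interpret options; it only uses the header length to locate the data.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / not used / flags" word
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the fixed part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4        # head len is in 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if off_flags & bit}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],         # application data follows header
    }


# Build a sample segment by hand: head len = 5 words, ACK and PSH set, data 'C'
raw = struct.pack("!HHIIHHHH", 1234, 80, 42, 79, (5 << 12) | 0x18, 65535, 0, 0) + b"C"
hdr = parse_tcp_header(raw)
print(hdr["seq"], hdr["ack"], sorted(hdr["flags"]), hdr["data"])
```

The port numbers and field values in the sample segment are arbitrary; the checksum is left as zero rather than computed.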

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does receiver handle out-of-order segments?
  A: TCP spec doesn't say - up to implementer

Simple telnet scenario (Host A, Host B; time runs downward):
  User types 'C'
  A -> B: Seq=42, ACK=79, data = 'C'
  host ACKs receipt of 'C', echoes back 'C'
  B -> A: Seq=79, ACK=43, data = 'C'
  host ACKs receipt of echoed 'C'
  A -> B: Seq=43, ACK=80

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125

Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr
[figure: SampleRTT and Estimated RTT; RTT (milliseconds) vs. time (seconds)]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
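The three formulas combine into a small estimator. The sketch below is one way to write it. The class name and the starting values are assumptions, as is the order of the two updates: DevRTT is computed against the previous EstimatedRTT before that estimate is refreshed, which the slides do not specify.

```python
class RttEstimator:
    """EWMA-based RTT estimator with the slide's typical alpha and beta."""

    def __init__(self, estimated_rtt, dev_rtt, alpha=0.125, beta=0.25):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.alpha = alpha
        self.beta = beta

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation is measured against the old EstimatedRTT
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


# Hypothetical numbers (msec): estimate 100, deviation 10, then a 180 ms sample
est = RttEstimator(estimated_rtt=100.0, dev_rtt=10.0)
timeout = est.update(180.0)
print(est.estimated_rtt, est.dev_rtt, timeout)
```

One slow sample moves EstimatedRTT only slightly but raises DevRTT sharply, so the timeout grows mostly through the safety margin.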


                              TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
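The pseudocode translates fairly directly into Python. The version below is a sketch under simplifying assumptions: the timer is a boolean, "pass segment to IP" is replaced by appending to a list (`wire`), and ACKs are assumed to fall on segment boundaries. The class and attribute names are hypothetical.

```python
class SimpleTCPSender:
    """Simplified TCP sender: cumulative ACKs, single retransmission timer."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # of first byte -> segment data
        self.timer_running = False
        self.wire = []               # (seq, data) log standing in for IP

    def on_app_data(self, data: bytes):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((seq, data))        # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)          # smallest not-yet-ACKed seq #
            self.wire.append((seq, self.unacked[seq]))
        self.timer_running = bool(self.unacked)

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # drop every segment whose last byte is below y
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in done:
                del self.unacked[s]
            self.timer_running = bool(self.unacked)
```

Sending 8 bytes at seq 92 and then 20 bytes at seq 100 leaves NextSeqNum at 120, and a single ACK=120 acknowledges both segments.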


TCP: retransmission scenarios

Lost ACK scenario (Host A, Host B; time runs downward):
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100 (lost, X)
  Seq=92 timeout at A
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100
  SendBase = 100

Premature timeout scenario (Host A, Host B; time runs downward):
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100
  B -> A: ACK=120
  Seq=92 timeout at A
  A -> B: Seq=92, 8 bytes data
  Sendbase = 100
  SendBase = 120
  B -> A: ACK=120
  SendBase = 120

TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A, Host B; time runs downward):
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100 (lost, X)
  B -> A: ACK=120 (arrives before the timeout)
  SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver -> TCP Receiver action

1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
   -> Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

2. Arrival of in-order segment with expected seq #. One other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments

3. Arrival of out-of-order segment, higher-than-expected seq #. Gap detected
   -> Immediately send duplicate ACK, indicating seq # of next expected byte

4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of Congestion Control

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
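The ACK-handling branch with fast retransmit can be sketched in Python as below. Timers are left out, the set of outstanding segments is passed in as a plain dictionary, and ACKs are assumed to fall on segment boundaries. These are simplifications for the example, and the names are hypothetical.

```python
class FastRetransmitSender:
    """ACK handling with a duplicate-ACK counter (timers omitted)."""

    def __init__(self, send_base, segments):
        self.send_base = send_base
        self.segments = dict(segments)   # seq # -> data, sent but not yet ACKed
        self.dup_acks = 0                # dup ACKs seen for the current SendBase
        self.retransmitted = []          # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # new data acknowledged: slide SendBase and reset the counter
            self.send_base = y
            self.dup_acks = 0
            self.segments = {s: d for s, d in self.segments.items() if s >= y}
        else:
            # a duplicate ACK for an already ACKed segment
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.segments:
                self.retransmitted.append(y)     # fast retransmit


sender = FastRetransmitSender(92, {92: b"a" * 8, 100: b"b" * 20, 120: b"c" * 15})
for ack in (100, 100, 100, 100, 100):
    sender.on_ack(ack)
print(sender.send_base, sender.retransmitted)
```

The first ACK=100 advances SendBase; the third of the following duplicates triggers one resend of the segment starting at byte 100, and the fourth duplicate does not trigger another.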

TCP: GBN or Selective Repeat?

                              Basic TCP looks a lot like GBN

                              Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                              This looks a lot like Selective Repeat

                              TCP is a hybrid


                              TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
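The window computation and the sender-side check can be sketched in a few lines of Python. The receiver-side names follow the slide. The sender-side variables `last_byte_sent` and `last_byte_acked` are not on the slide and are introduced here as assumptions to express "unACKed data".

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, rcv_win, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= rcv_win


# Hypothetical numbers: 4096-byte buffer, 3000 bytes received, 1000 read by app
win = rcv_window(4096, 3000, 1000)
print(win)
print(sender_may_send(5000, 4000, win, 1000))
print(sender_may_send(5000, 4000, win, 1200))
```

As the application reads more data, LastByteRead grows and the advertised window opens up again.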

Technical Issue

Suppose RcvWindow=0 and that receiver has already ACK'd ALL packets in buffer
Sender does not transmit new packets until it hears RcvWindow>0
Receiver never sends RcvWindow>0, since it has no new ACKs to send to Sender
DEADLOCK!

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow>0 will be transmitted back to sender

Note on UDP

UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full then packets are lost


                              TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket("hostname", "port number");
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that seq # is synchronized

client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
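The sequence and acknowledgement numbers exchanged in the three steps can be traced with a short Python sketch. The dictionary representation of a segment and the function name are purely illustrative, and the ISN values in the usage lines are arbitrary.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as simple dictionaries."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": syn["seq"] + 1}          # ACKs the client's SYN
    ack = {"from": "client", "SYN": 0, "seq": client_isn + 1,
           "ack": synack["seq"] + 1}          # ACKs the server's SYN
    return [syn, synack, ack]


msgs = three_way_handshake(client_isn=1000, server_isn=5000)
for m in msgs:
    print(m)
```

Each side acknowledges the other's initial sequence number plus one, so the SYN itself consumes one sequence number even though it carries no application data.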

                              3 Transport Layer 88Comp 361 Spring 2005

                              TCP Connection Management (cont)

                              Closing a connection:

                              client closes socket: clientSocket.close();

                              Step 1: client end system sends TCP FIN control segment to server

                              Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

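                              [Figure: timeline of connection close between client and server: client sends FIN (close), server replies with ACK, server sends FIN (close), client replies with ACK and enters timed wait before closed]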

                              TCP Connection Management (cont)

                              Step 3: client receives FIN, replies with ACK
                                enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if its ACK gets lost)
                                closes down after timed wait

                              Step 4: server receives ACK. Connection closed.

                              Note: with a small modification, this can handle simultaneous FINs.

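                              [Figure: timeline of connection close between client and server: FIN, ACK, FIN, ACK; both sides pass through closing, the client through timed wait, and both end in closed]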
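
The client side of Steps 1 to 4 can be sketched as a small transition table. The state names (`FIN_WAIT_1`, `FIN_WAIT_2`, `TIME_WAIT`) are borrowed from the usual TCP state diagram and are an assumption here, since the text only names "timed wait" and "closed"; the event strings and the `run` helper are hypothetical.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "send FIN"),      # Step 1
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),          # server ACKed our FIN
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "send ACK"),     # Step 3
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "send ACK"),      # our ACK was lost: re-ACK
    ("TIME_WAIT", "timer expires"): ("CLOSED", None),          # end of timed wait
}


def run(events, state="ESTABLISHED"):
    """Feed events to the client and collect the segments it sends."""
    sent = []
    for event in events:
        state, action = CLIENT_TRANSITIONS[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent


# The server's FIN arrives twice because the client's first ACK was lost.
final_state, sent = run(
    ["close", "recv ACK", "recv FIN", "recv FIN", "timer expires"]
)
print(final_state, sent)
```

Keeping the client in timed wait is what lets it answer the retransmitted FIN in this trace.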

                              TCP Connection Management (cont)

                              [Figure: example TCP server lifecycle]

                              [Figure: example TCP client lifecycle]

                              A few special cases

                              We have not discussed what happens if both client and server decide to close down the connection at the same time.

                              It is possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.

                              Principles of Congestion Control

                              Congestion:
                                informally: "too many sources sending too much data too fast for network to handle"
                                different from flow control!
                                manifestations:
                                  lost packets (buffer overflow at routers)
                                  long delays (queuing in router buffers)
                                a top-10 problem!

                              Causes/costs of congestion: scenario 1
                                two senders, two receivers
                                one router, infinite buffers
                                no retransmission
                                send rate: 0 to C/2

                                large delays when congested
                                maximum achievable throughput

                              Causes/costs of congestion: scenario 2

                                one router, finite buffers
                                sender retransmission of lost packet

                              (a), (b) & (c): always λ_in = λ_out (goodput)
                              (a) "Magic" transmission: only send when there's space in buffer
                              (b) "perfect" retransmission, only when loss: λ'_in > λ_out
                              (c) retransmission of delayed (not lost) packet makes λ'_in larger (than perfect case) for same λ_out

                              (λ_in: original data sent; λ'_in: original data plus retransmissions; λ_out: goodput)

                              "costs" of congestion:
                                (b) and (c): more work (retrans) for given "goodput"
                                (c): unneeded retransmissions: link carries multiple copies of pkt

                              [Figure: three panels (a), (b), (c) plotting λ_out against the offered load]

                              Causes/costs of congestion: scenario 3
                                four senders
                                multihop paths
                                timeout/retransmit

                              Q: what happens as λ_in and λ'_in increase?

                              Causes/costs of congestion: scenario 3

                              Another "cost" of congestion:
                                when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

                              Approaches towards congestion control

                              Two broad approaches towards congestion control:

                              End-end congestion control:
                                no explicit feedback from network
                                congestion inferred from end-system observed loss, delay
                                approach taken by TCP

                              Network-assisted congestion control:
                                routers provide feedback to end systems
                                  single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
                                  explicit rate sender should send at

                              Case study: ATM ABR congestion control

                              RM (resource management) cells:
                                sent by sender, interspersed with data cells
                                bits in RM cell set by switches ("network-assisted")
                                  NI bit: no increase in rate (mild congestion)
                                  CI bit: severe congestion indicator
                                RM cells returned to sender by receiver, with bits intact
                                  small exception: see the EFCI bit on the next slide

                              ABR: available bit rate:
                                "elastic service"
                                if sender's path "underloaded":
                                  sender should use available bandwidth
                                if sender's path congested:
                                  sender throttled to minimum guaranteed rate

                              Case study: ATM ABR congestion control

                              two-byte ER (explicit rate) field in RM cell
                                congested switch may lower ER value in cell
                                sender's send rate is thus the minimum supportable rate on the path

                              EFCI bit in data cells: set to 1 by congested switch
                                signals congestion
                                if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
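
How the RM cell fields evolve along a path can be sketched as follows. This is a minimal illustration, not the ATM specification: the `RMCell` and `Switch` classes, their field names, and the rate values (read them as Mbps) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float         # explicit rate field
    ni: bool = False  # no-increase bit (mild congestion)
    ci: bool = False  # congestion-indication bit (severe congestion)


@dataclass
class Switch:
    supportable_rate: float
    mildly_congested: bool = False
    severely_congested: bool = False


def forward_rm_cell(cell, path):
    """Each switch may lower ER and set NI/CI; ER ends as the path minimum."""
    for switch in path:
        cell.er = min(cell.er, switch.supportable_rate)
        cell.ni = cell.ni or switch.mildly_congested
        cell.ci = cell.ci or switch.severely_congested
    return cell


def return_from_destination(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


path = [Switch(10.0), Switch(4.0, mildly_congested=True), Switch(6.0)]
cell = forward_rm_cell(RMCell(er=8.0), path)
er_after_path, ni_after_path, ci_after_path = cell.er, cell.ni, cell.ci
cell = return_from_destination(cell, preceding_data_cell_efci=True)
print(er_after_path, ni_after_path, ci_after_path, cell.ci)
```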

                              TCP Congestion Control
                                end-end control (no network assistance)
                                transmission rate limited by the congestion window size, CongWin, over segments
                                CongWin dynamically modified to reflect perceived congestion

                              [Figure: a window of CongWin bytes spanning w segments]

                              w segments, each with MSS bytes, sent in one RTT:

                              throughput = (w × MSS) / RTT bytes/sec
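
As a quick numeric check of the throughput expression, the sketch below evaluates it for made-up values. The function name `window_throughput` and the numbers (10 segments, 1460-byte MSS, 100 ms RTT) are assumptions chosen only for illustration.

```python
def window_throughput(w, mss_bytes, rtt_sec):
    """Bytes per second when w segments of mss_bytes each are sent per RTT."""
    return w * mss_bytes / rtt_sec


rate = window_throughput(w=10, mss_bytes=1460, rtt_sec=0.1)
print(f"{rate:.0f} bytes/sec, about {rate * 8 / 1e6:.2f} Mbps")
```

Doubling w doubles the rate, which is why adjusting CongWin is enough to control how fast the sender transmits.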

                              To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

                              Tools are "similar" to flow control: sender limits transmission using

                                LastByteSent - LastByteAcked ≤ CongWin

                              How does sender perceive congestion?
                                loss event = timeout or 3 duplicate ACKs
                                TCP sender reduces rate (CongWin) after loss event

                              Three mechanisms:
                                AIMD = Additive Increase, Multiplicative Decrease
                                slow start = CongWin set to 1 MSS and then grows exponentially
                                conservative after timeout events

                              TCP AIMD

                              multiplicative decrease: cut CongWin in half after loss event

                              additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

                              [Figure: congestion window (8, 16, 24 Kbytes) vs. time for a long-lived TCP connection, showing the sawtooth pattern]
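
The sawtooth can be reproduced with a few lines of simulation. This is a simplified sketch: the window is counted in whole MSS units, losses are injected at hand-picked RTT indices, and the function name `aimd_trace` and its parameters are hypothetical.

```python
def aimd_trace(rtts, loss_at, start=8):
    """Return CongWin (in MSS units) at the start of each RTT."""
    congwin = start
    trace = []
    for t in range(rtts):
        trace.append(congwin)
        if t in loss_at:
            congwin = max(1, congwin // 2)  # multiplicative decrease
        else:
            congwin += 1                    # additive increase: +1 MSS per RTT
    return trace


trace = aimd_trace(rtts=12, loss_at={4, 9})
print(trace)
```

Each loss halves the window, and the linear climb between losses produces the sawtooth shape.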

                              TCP Slow Start

                              When connection begins, CongWin = 1 MSS
                                Example: MSS = 500 bytes & RTT = 200 msec
                                initial rate = 20 kbps

                              available bandwidth may be >> MSS/RTT
                                desirable to quickly ramp up to respectable rate

                              When connection begins, increase rate exponentially fast until first loss event

                              TCP Slow Start (more)

                              When connection begins, increase rate exponentially until first loss event:
                                double CongWin every RTT
                                done by incrementing CongWin by 1 MSS for every ACK received

                              Summary: initial rate is slow but ramps up exponentially fast

                              [Figure: timeline between Host A and Host B: one segment in the first RTT, two segments in the second, four segments in the third]
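
The link between "one increment per ACK" and "doubling every RTT" can be checked with a short sketch. It assumes every segment sent in an RTT is acknowledged within that RTT and that there is no loss and no threshold; the function name `slow_start_trace` is hypothetical, and the 500-byte MSS and 200 msec RTT reuse the earlier example values.

```python
def slow_start_trace(rtts, mss_bytes=500):
    """Return CongWin in bytes at the start of each RTT during slow start."""
    congwin = mss_bytes
    trace = []
    for _ in range(rtts):
        trace.append(congwin)
        segments_in_flight = congwin // mss_bytes
        for _ in range(segments_in_flight):  # one ACK per segment sent
            congwin += mss_bytes             # +1 MSS per ACK
    return trace


trace = slow_start_trace(4)
# Initial rate: 1 MSS per RTT, in bits per second (integer arithmetic).
initial_rate_bps = 500 * 8 * 1000 // 200
print(trace, initial_rate_bps)
```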

                              So far:
                                Slow-Start ramps up exponentially
                                followed by AIMD sawtooth pattern

                              Reality (TCP Reno):
                                introduce new variable threshold
                                threshold initially very large
                                Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
                                two different types of loss events:
                                  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
                                  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

                              The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

                              Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

                              TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

                              Summary: TCP Congestion Control

                              When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

                              When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

                              When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

                              When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                              The Big Picture

                              TCP sender congestion control

                              | Event | State | TCP Sender Action | Commentary |
                              |---|---|---|---|
                              | ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
                              | ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
                              | Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS |
                              | Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
                              | Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
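
The event table translates almost line for line into a small class. This is a sketch under stated assumptions, not a real TCP stack: the class name `RenoSender` and its method names are hypothetical, the window is a float measured in the same units as MSS, and resetting the duplicate-ACK counter when new data is acknowledged is an assumption the table does not spell out.

```python
class RenoSender:
    def __init__(self, mss=1.0, threshold=64.0):
        self.mss = mss
        self.congwin = mss          # start with CongWin = 1 MSS
        self.threshold = threshold
        self.state = "SS"           # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0           # assumption: counter restarts on new data
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += self.mss * (self.mss / self.congwin)

    def on_dup_ack(self):
        """Duplicate ACK: only count it; the third one is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1.0, threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
after_acks = (sender.congwin, sender.state)
for _ in range(3):
    sender.on_dup_ack()
after_dups = (sender.congwin, sender.threshold, sender.state)
sender.on_timeout()
after_timeout = (sender.congwin, sender.threshold, sender.state)
print(after_acks, after_dups, after_timeout)
```

A Tahoe-style sender would differ only in `on_dup_ack`, where the third duplicate ACK would be handled the same way as a timeout.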

                              TCP throughput

                              What's the average throughput of TCP as a function of window size and RTT?

                                Ignore slow start.
                                Let W be the window size when loss occurs.
                                When window is W, throughput is W/RTT.
                                Just after loss, window drops to W/2, throughput to W/(2 RTT).
                                Average throughput: 0.75 W/RTT
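
The 0.75 factor follows from averaging the window over one sawtooth cycle, in which it climbs linearly from W/2 to W. A minimal numeric check, assuming an even W counted in segments and one step per RTT (the function name `average_window` is hypothetical):

```python
def average_window(W):
    """Mean window over one cycle: W/2, W/2 + 1, ..., W segments."""
    cycle = range(W // 2, W + 1)
    return sum(cycle) / len(cycle)


W = 100
avg = average_window(W)
print(avg, avg / W)
```

Dividing the average window by RTT then gives the average throughput of 0.75 W/RTT.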

                              TCP Futures

                              Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
                              Requires window size W = 83,333 in-flight segments
                              Throughput in terms of loss rate L:

                              $$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

                              ➜ L = 2·10^-10. Wow!
                              New versions of TCP for high-speed needed!
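
Both numbers in this example can be recomputed directly. The sketch below assumes the throughput relation throughput = 1.22 · MSS / (RTT · sqrt(L)) with MSS expressed in bits, and solves it for L; the variable names are illustrative only.

```python
mss_bits = 1500 * 8   # 1500-byte segments
rtt = 0.1             # 100 ms
target_bps = 10e9     # 10 Gbps

# Window needed to keep target_bps in flight for one RTT, in segments.
W = target_bps * rtt / mss_bits

# Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L.
L = (1.22 * mss_bits / (rtt * target_bps)) ** 2

print(f"W = {W:.0f} segments, L = {L:.2e}")
```

The result is a loss rate on the order of 2·10^-10, that is, roughly one loss event per five billion segments.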

                              TCP Fairness

                              Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

                              [Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

                              Why is TCP fair?

                              Two competing sessions:
                                additive increase gives slope of 1, as throughput increases
                                multiplicative decrease decreases throughput proportionally

                              [Figure: Connection 2 throughput vs. Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
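
The convergence argument can be illustrated with a toy simulation of two AIMD flows. It is a deliberately idealized sketch: both flows have the same RTT, both see a loss whenever their combined rate exceeds R, and the function name `share_bottleneck` and the starting rates are made up.

```python
def share_bottleneck(x1, x2, R, rounds):
    """Two AIMD flows on one link; both halve when the link is overloaded."""
    for _ in range(rounds):
        if x1 + x2 > R:
            x1, x2 = x1 / 2, x2 / 2  # multiplicative decrease halves the gap
        else:
            x1, x2 = x1 + 1, x2 + 1  # additive increase keeps the gap unchanged
    return x1, x2


x1, x2 = share_bottleneck(x1=80.0, x2=10.0, R=100.0, rounds=200)
print(x1, x2, abs(x1 - x2))
```

Additive increase leaves the difference between the two rates unchanged and every loss halves it, so the flows drift toward the equal-share line.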

                              Fairness (more)

                              Fairness and UDP
                                multimedia apps often do not use TCP
                                  do not want rate throttled by congestion control
                                instead use UDP:
                                  pump audio/video at constant rate, tolerate packet loss
                                current research area: how to keep UDP from congesting the Internet

                              Fairness and parallel TCP connections
                                nothing prevents an app from opening parallel connections between 2 hosts
                                Web browsers do this
                                Example: link of rate R supporting 9 connections
                                  new app asks for 1 TCP, gets rate R/10
                                  new app asks for 11 TCPs, gets more than R/2
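
The two shares in the example follow from dividing the link equally among all connections. A minimal check, assuming perfectly fair per-connection sharing (the function name `share_of_link` is hypothetical):

```python
from fractions import Fraction


def share_of_link(own_connections, other_connections):
    """Fraction of R an app gets if every connection receives an equal share."""
    return Fraction(own_connections, own_connections + other_connections)


one_connection = share_of_link(1, 9)
eleven_connections = share_of_link(11, 9)
print(one_connection, eleven_connections)
```

With 11 of the 20 connections, the new app holds slightly more than half of the link.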

                              TCP Latency Modeling

                              Notation, assumptions:
                                assume one link between client and server of rate R
                                S: MSS (bits)
                                O: object size (bits)
                                no retransmissions (no loss, no corruption)

                              Window size:
                                first assume: fixed congestion window, W segments
                                then dynamic window, modeling slow start

                              Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

                              Ignoring congestion, delay is influenced by:
                                TCP connection establishment
                                data transmission delay
                                slow start

                              Fixed Congestion Window (W)

                              Two cases:

                              1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
                                 Latency = 2RTT + O/R

                              2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
                                 Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

                              Fixed congestion window (1)

                              First case:
                              WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

                              latency = 2RTT + O/R

                              Fixed congestion window (2)

                              Second case:
                              WS/R < RTT + S/R: sender waits for ACK after sending window's worth of data

                              latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

                              where K is the number of windows that cover the object
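
The two fixed-window cases can be folded into one function. This is a sketch with assumptions: K is taken to be the number of windows of W segments needed to cover the object, computed as ceil(O / (W·S)); the function name and the example numbers are made up.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds for object size O (bits), MSS S (bits), rate R (bps)."""
    K = math.ceil(O / (W * S))       # windows needed to cover the object
    stall = S / R + RTT - W * S / R  # idle time after each window (case 2)
    if stall <= 0:                   # case 1: ACKs return before window is sent
        return 2 * RTT + O / R
    return 2 * RTT + O / R + (K - 1) * stall


# 10,000-bit object, 1,000-bit segments, 1 Mbps link, 100 ms RTT
small_window = fixed_window_latency(O=10_000, S=1_000, R=1e6, RTT=0.1, W=2)
large_window = fixed_window_latency(O=10_000, S=1_000, R=1e6, RTT=0.1, W=200)
print(small_window, large_window)
```

With W = 2 the server stalls after each of the first four windows, so the latency is dominated by waiting for ACKs rather than by transmission.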

                              TCP Latency Modeling: Slow Start (1)

                              Now suppose the window grows according to slow start (with no threshold and no loss events).

                              Will show that the delay for one object is:

                              $$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

                              where P is the number of times TCP idles at the server:

                              $$P = \min\{Q, K-1\}$$

                              - where Q is the number of times the server idles if the object were of infinite size

                              - and K is the number of windows that cover the object

                              TCP Latency Modeling: Slow Start (2)

                              [Figure: timeline (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

                              Example:
                                • O/S = 15 segments
                                • K = 4 windows
                                • Q = 2
                                • P = min{K-1, Q} = 2

                              Server idles P = 2 times

                              Delay components:
                                • 2 RTT for connection establishment and request
                                • O/R to transmit object
                                • time server idles due to slow start

                              Server idles: P = min{K-1, Q} times

                              TCP Latency Modeling (3)

                              $\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

                              $2^{k-1}\frac{S}{R}$ = time to transmit the kth window

                              $\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

                              $$\begin{aligned}
                              \text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
                              &= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
                              &= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
                              \end{aligned}$$

                              [Figure: timeline (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

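                              TCP Latency Modeling (4)

                              Recall K = number of windows that cover the object.

                              How do we calculate K?

                              $$\begin{aligned}
                              K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
                              &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
                              &= \min\{k : 2^k - 1 \ge O/S\} \\
                              &= \min\{k : k \ge \log_2(O/S + 1)\} \\
                              &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
                              \end{aligned}$$

                              Calculation of Q, the number of idles for an infinite-size object, is similar.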
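
K, Q, P and the slow-start latency can be computed together in a few lines. This is a sketch under assumptions: Q is found by counting the windows whose idle time S/R + RTT - 2^(k-1)·S/R is still positive, K is found with integer arithmetic rather than a logarithm to avoid rounding issues, and the function name and parameter values are illustrative (chosen so that O/S = 15).

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for object O bits, MSS S bits, rate R bps."""
    # K: smallest k with 2^k - 1 >= O/S
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows followed by a positive idle time (infinite object)
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


K, Q, P, latency = slow_start_latency(O=15_000, S=1_000, R=10_000, RTT=0.2)
print(K, Q, P, latency)
```

These inputs give K = 4, Q = 2 and P = 2, matching the 15-segment example from the slow-start timeline.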

                              HTTP Modeling

                              Assume Web page consists of:
                                1 base HTML page (of size O bits)
                                M images (each of size O bits)

                              Non-persistent HTTP:
                                M+1 TCP connections in series
                                Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

                              Persistent HTTP:
                                2 RTT to request and receive base HTML file
                                1 RTT to request and receive M images
                                Response time = (M+1)O/R + 3RTT + sum of idle times

                              Non-persistent HTTP with X parallel connections:
                                suppose M/X is an integer
                                1 TCP connection for base file
                                M/X sets of parallel connections for images
                                Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
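
The three response-time expressions are easy to compare numerically. The sketch below deliberately leaves out the "sum of idle times" term, so it gives lower bounds rather than full response times; the function name is hypothetical, and the example treats 5 Kbytes as 40,000 bits and uses a 1 Mbps link, both of which are assumptions.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, excluding slow-start idle times."""
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }


times = http_response_times(O=40_000, R=1e6, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

At this link rate the RTT terms dominate, which is why the persistent and parallel variants come out well ahead of plain non-persistent HTTP.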

                              HTTP Response time (in seconds)

                              RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

                              [Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

                              For low bandwidth, connection & response time are dominated by transmission time.
                              Persistent connections only give minor improvement over parallel connections.

                              HTTP Response time (in seconds)

                              RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

                              [Figure: bar chart of response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

                              For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

                              Chapter 3: Summary

                              principles behind transport layer services:
                                multiplexing, demultiplexing
                                reliable data transfer
                                flow control
                                congestion control

                              instantiation and implementation in the Internet:
                                UDP
                                TCP

                              Next:
                                leaving the network "edge" (application, transport layers)
                                into the network "core"

                              • Chapter 3 Transport Layer last revised 160305
                              • Chapter 3 outline
                              • Transport services and protocols
                              • Transport vs network layer
                              • Transport-layer protocols
                              • Chapter 3 outline
                              • Multiplexingdemultiplexing
                              • Multiplexingdemultiplexing
                              • How demultiplexing works
                              • Connectionless demultiplexing
                              • Connectionless demux (cont)
                              • Connection-oriented demux
                              • Connection-oriented demux (cont)
                              • Connection-oriented demux Threaded Web Server
                              • Chapter 3 outline
                              • UDP User Datagram Protocol [RFC 768]
                              • UDP more
                              • UDP checksum
                              • Chapter 3 outline
                              • Principles of Reliable data transfer
                              • Reliable data transfer getting started
                              • Reliable data transfer getting started
                              • Incremental Improvements
                              • Rdt10 reliable transfer over a reliable channel
                              • Rdt20 channel with bit errors
                              • rdt20 FSM specification
                              • rdt20 operation with no errors
                              • rdt20 error scenario
                              • rdt20 has a fatal flaw
                              • rdt21 sender handles garbled ACKNAKs
                              • rdt21 receiver handles garbled ACKNAKs
                              • rdt21 discussion
                              • rdt22 a NAK-free protocol
                              • rdt22 sender receiver fragments
                              • rdt30 channels with errors and loss
                              • rdt30 sender
                              • rdt30 in action
                              • rdt30 in action
                              • Performance of rdt30
                              • rdt30 stop-and-wait operation
                              • Pipelined protocols
                              • Pipelined protocols
                              • Pipelining increased utilization
                              • Go-Back-N
                              • GBN Sender
                              • GBN sender extended FSM
                              • GBN receiver extended FSM
                              • More on receiver
                              • GBN inaction
                              • Selective Repeat
                              • Selective repeat sender receiver windows
                              • Selective repeat
                              • Selective repeat in action
                              • Selective repeat dilemma
                              • Chapter 3 outline
                              • TCP Overview RFCs 793 1122 1323 2018 2581
                              • More TCP Details
                              • Even More TCP Details
                              • TCP segment structure
                              • TCP seq rsquos and ACKs
                              • TCP Round Trip Time and Timeout
                              • TCP Round Trip Time and Timeout
                              • Example RTT estimation
                              • TCP Round Trip Time and Timeout
                              • Chapter 3 outline
                              • TCP reliable data transfer
                              • TCP sender events
                              • TCP sender(simplified)
                              • TCP retransmission scenarios
                              • TCP retransmission scenarios (more)
                              • TCP ACK generation [RFC 1122 RFC 2581]
                              • More on Sender Policies
                              • Fast Retransmit
                              • Fast retransmit algorithm
                              • TCP GBN or Selective Repeat
                              • Chapter 3 outline
                              • TCP Flow Control
                              • TCP Flow Control
                              • TCP segment structure
                              • TCP Flow control how it works
                              • Technical Issue
                              • Chapter 3 outline
                              • TCP Connection Management
                              • TCP Connection Management (cont)
                              • TCP Connection Management (cont)
                              • TCP Connection Management (cont)
                              • TCP Connection Management (cont)
                              • A few special cases
                              • Chapter 3 outline
                              • Principles of Congestion Control
                              • Causes/costs of congestion: scenario 1
                              • Causes/costs of congestion: scenario 2
                              • Causes/costs of congestion: scenario 3
                              • Causes/costs of congestion: scenario 3
                              • Approaches towards congestion control
                              • Case study ATM ABR congestion control
                              • Case study ATM ABR congestion control
                              • Chapter 3 outline
                              • TCP Congestion Control
                              • TCP AIMD
                              • TCP Slow Start
                              • TCP Slow Start (more)
                              • Summary TCP Congestion Control
                              • The Big Picture
                              • TCP sender congestion control
                              • TCP throughput
                              • TCP Futures
                              • TCP Fairness
                              • Why is TCP fair
                              • Fairness (more)
                              • TCP Latency Modeling
                              • Fixed Congestion Window (W)
                              • Fixed congestion window (1)
                              • Fixed congestion window (2)
                              • TCP Latency Modeling Slow Start (1)
                              • TCP Latency Modeling Slow Start (2)
                              • TCP Latency Modeling (3)
                              • TCP Latency Modeling (4)
                              • HTTP Modeling
                              • Chapter 3 Summary

UDP: User Datagram Protocol [RFC 768]

• "no frills," "bare bones" Internet transport protocol
• "best effort" service: UDP segments may be
  • lost
  • delivered out of order to app
• connectionless:
  • no handshaking between UDP sender, receiver
  • each UDP segment handled independently of others

Why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small segment header (8 bytes)
• no congestion control: UDP can blast away as fast as desired

UDP: more

• often used for streaming multimedia apps
  • loss tolerant
  • rate sensitive
• other UDP uses (why?):
  • DNS: small delay
  • SNMP: stressful conditions
• reliable transfer over UDP: add reliability at application layer
  • application-specific error recovery

UDP segment format (each row is 32 bits wide):

  source port #  | dest port #
  length         | checksum
  application data (message)

  length: length, in bytes, of UDP segment, including header
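The segment layout can be illustrated with a short Python sketch. It is a minimal illustration, not a network stack: the helper names `build_udp_segment` and `parse_udp_segment` are made up for this example, and the checksum field is left as a placeholder value of 0 rather than computed.

```python
import struct

# "!HHHH": network byte order, four unsigned 16-bit fields
# (source port, dest port, length, checksum) = 8 bytes in total.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port: int, dst_port: int, payload: bytes,
                      checksum: int = 0) -> bytes:
    """Return header + payload. Length covers the 8-byte header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment: bytes):
    """Split a segment into (src_port, dst_port, length, checksum, payload)."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    return src_port, dst_port, length, checksum, segment[UDP_HEADER.size:length]


segment = build_udp_segment(5000, 53, b"hello")
fields = parse_udp_segment(segment)
```

With a 5-byte payload the length field is 13: 8 header bytes plus 5 data bytes.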

UDP checksum

Goal: detect "errors" (e.g., flipped bits) in transmitted segment

Sender:
• treat segment contents as sequence of 16-bit integers
• checksum: addition (1's complement sum) of segment contents
• sender puts checksum value into UDP checksum field

Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  • NO: error detected
  • YES: no error detected. But maybe errors nonetheless? More later.
• Receiver may choose to discard segment or send a warning to app in case of error
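A small Python sketch of the 16-bit checksum computation follows. The function names are illustrative. Note one detail the slide compresses: in RFC 768 the field holds the one's complement of the one's complement sum, and the sum also covers a pseudo-header, which this sketch leaves out.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian words, wrapping any carry around."""
    if len(data) % 2:
        data += b"\x00"                # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def checksum16(data: bytes) -> int:
    """Sender side: complement of the one's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def looks_intact(data: bytes, checksum: int) -> bool:
    """Receiver side: recompute and compare with the checksum field."""
    return checksum16(data) == checksum


data = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
field = checksum16(data)               # 0x220D for these eight bytes
flipped = bytes([0x01]) + data[1:]     # one bit flipped in the first byte
```

A match only means no error was detected. Two errors that cancel in the sum would still pass, which is the "maybe errors nonetheless" caveat.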

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Principles of Reliable data transfer

• important in app., transport, link layers
• top-10 list of important networking topics!
• characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started

send side:
• rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt to transfer packet over unreliable channel to receiver

receive side:
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started

We'll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
  • but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

FSM notation:
  state 1 --[ event causing state transition / actions taken on state transition ]--> state 2

state: when in this "state," next state uniquely determined by next event

Incremental Improvements

• rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission
• rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK
• rdt2.1: deals with corrupted ACKs/NAKs
• rdt2.2: like rdt2.1 but does not need NAKs
• rdt3.0: allows packets to be lost

Rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
  • no bit errors
  • no loss of packets
• separate FSMs for sender, receiver:
  • sender sends data into underlying channel
  • receiver reads data from underlying channel

sender (single state: Wait for call from above)
  rdt_send(data) / packet = make_pkt(data); udt_send(packet)

receiver (single state: Wait for call from below)
  rdt_rcv(packet) / extract(packet, data); deliver_data(data)

Rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
  • recall: UDP checksum to detect bit errors
• the question: how to recover from errors?
  • acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  • sender retransmits pkt on receipt of NAK
  • human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
  • error detection
  • receiver feedback: control msgs (ACK, NAK) rcvr -> sender

rdt2.0: FSM specification

sender
  state: Wait for call from above
    rdt_send(data) / sndpkt = make_pkt(data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK
  state: Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isNAK(rcvpkt) / udt_send(sndpkt) -> Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call from above

receiver
  state: Wait for call from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt) / udt_send(NAK) -> Wait for call from below
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) / extract(rcvpkt, data); deliver_data(data); udt_send(ACK) -> Wait for call from below
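The rdt2.0 exchange can be traced with a toy Python simulation. This is a sketch under stated assumptions: the function name `rdt20_transfer` and the `corrupt_on` parameter are invented here, corruption is modeled as a flag instead of a real checksum, and ACKs/NAKs always arrive intact (the rdt2.0 assumption).

```python
def rdt20_transfer(messages, corrupt_on):
    """Simulate rdt2.0 stop-and-wait.

    corrupt_on: set of 0-based indices, counted over every data packet
    transmission, that the channel corrupts.
    Returns (data delivered to the upper layer, feedback messages sent).
    """
    delivered, feedback, transmissions = [], [], 0
    for data in messages:                  # sender: Wait for call from above
        while True:                        # sender: Wait for ACK or NAK
            corrupted = transmissions in corrupt_on
            transmissions += 1             # udt_send(sndpkt)
            if corrupted:                  # receiver: corrupt(rcvpkt)
                feedback.append("NAK")     # udt_send(NAK)
                continue                   # sender: isNAK -> retransmit
            delivered.append(data)         # receiver: deliver_data(data)
            feedback.append("ACK")         # udt_send(ACK)
            break                          # sender: isACK -> next message
    return delivered, feedback


delivered, feedback = rdt20_transfer(["a", "b"], corrupt_on={1})
```

Here the second transmission (the first copy of "b") is corrupted, so the receiver answers NAK and the sender retransmits before moving on.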

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs with the error-free path traced: rdt_send(data) sends sndpkt, the receiver finds notcorrupt(rcvpkt), delivers the data and sends ACK, and the sender returns to "Wait for call from above."]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs with the error path traced: the receiver finds corrupt(rcvpkt) and sends NAK, the sender retransmits sndpkt on isNAK(rcvpkt), and the retransmitted packet is then delivered and ACKed.]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
• retransmit, but this might cause retransmission of correctly received pkt! Receiver won't know about duplication!

Handling duplicates:
• sender adds sequence number (0, 1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt. Duplicate packet is one with same sequence # as previous packet

stop and wait: Sender sends one packet, then waits for receiver response

Sender: whenever sender receives a control message, it sends a packet to receiver
• A valid ACK: sends next packet (if one exists) with new sequence #
• A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
• If received packet is corrupt: send NAK
• If received packet is valid and has a different sequence # than prev packet: send ACK and deliver new data up
• If received packet is valid and has same sequence # as prev packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #
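The receiver rules above can be sketched in a few lines of Python. The class name `Rdt21Receiver` and the `corrupt` flag are assumptions made for this illustration. A real receiver would decide corruption from the checksum.

```python
from dataclasses import dataclass, field


@dataclass
class Rdt21Receiver:
    expected_seq: int = 0                      # 0 or 1: the seq # awaited next
    delivered: list = field(default_factory=list)

    def on_packet(self, seq: int, data, corrupt: bool) -> str:
        """Return the control message sent back for one arriving packet."""
        if corrupt:
            return "NAK"
        if seq == self.expected_seq:
            self.delivered.append(data)        # new data: deliver it up
            self.expected_seq ^= 1             # alternate 0 <-> 1
        # A valid packet with the previous seq # is a duplicate: it is
        # ACKed again but not delivered a second time.
        return "ACK"


receiver = Rdt21Receiver()
replies = [
    receiver.on_packet(0, "a", corrupt=False),
    receiver.on_packet(0, "a", corrupt=False),   # retransmission (ACK was garbled)
    receiver.on_packet(1, "b", corrupt=True),
    receiver.on_packet(1, "b", corrupt=False),
]
```

The duplicate of packet 0 is acknowledged again, yet "a" reaches the upper layer only once.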

rdt2.1: sender, handles garbled ACK/NAKs

state: Wait for call 0 from above
  rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) / udt_send(sndpkt) -> Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) / udt_send(sndpkt) -> Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call 0 from above

rdt2.1: receiver, handles garbled ACK/NAKs

state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt) -> Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 0 from below
state: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt) -> Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 1 from below

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0, 1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  • state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is duplicate
  • state indicates whether 0 or 1 is expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
  • receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment
  state: Wait for call 0 from above
    rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> Wait for ACK 0
  state: Wait for ACK 0
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) / udt_send(sndpkt) -> Wait for ACK 0
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) / Λ -> next state

receiver FSM fragment
  transition into Wait for 0 from below:
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
  state: Wait for 0 from below
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) / udt_send(sndpkt) -> Wait for 0 from below

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until certain data or ACK lost, then retransmits
• yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
• if pkt (or ACK) just delayed (not lost):
  • retransmission will be duplicate, but use of seq. #'s already handles this
  • receiver must specify seq # of pkt being ACKed
• requires countdown timer

rdt3.0 sender

state: Wait for call 0 from above
  rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer -> Wait for ACK0
  rdt_rcv(rcvpkt) / Λ -> Wait for call 0 from above
state: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) / Λ -> Wait for ACK0
  timeout / udt_send(sndpkt); start_timer -> Wait for ACK0
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) / stop_timer -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer -> Wait for ACK1
  rdt_rcv(rcvpkt) / Λ -> Wait for call 1 from above
state: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)) / Λ -> Wait for ACK1
  timeout / udt_send(sndpkt); start_timer -> Wait for ACK1
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1) / stop_timer -> Wait for call 0 from above
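A compact Python sketch of the rdt3.0 sender follows, with events delivered as method calls instead of by a real timer. The class and attribute names are illustrative. One assumption goes beyond the FSM: `rdt_send()` returns False when called while an ACK is still outstanding, a case the FSM does not define.

```python
class Rdt30Sender:
    def __init__(self):
        self.seq = 0                  # seq # (0 or 1) of the current packet
        self.waiting = False          # True while in "Wait for ACK"
        self.sndpkt = None
        self.timer_running = False
        self.sent = []                # record of every udt_send()

    def _udt_send(self):
        self.sent.append(self.sndpkt)
        self.timer_running = True     # start_timer

    def rdt_send(self, data) -> bool:
        if self.waiting:
            return False              # assumption: refuse data while waiting
        self.sndpkt = (self.seq, data)
        self._udt_send()
        self.waiting = True
        return True

    def timeout(self):
        if self.waiting:
            self._udt_send()          # retransmit and restart the timer

    def rdt_rcv(self, ack_seq: int, corrupt: bool = False):
        if not self.waiting or corrupt or ack_seq != self.seq:
            return                    # Λ: ignore, a timeout will recover
        self.timer_running = False    # stop_timer
        self.waiting = False
        self.seq ^= 1                 # next packet uses the other seq #


sender = Rdt30Sender()
sender.rdt_send("a")
sender.timeout()        # packet or ACK lost: retransmit (0, "a")
sender.rdt_rcv(1)       # ACK for the wrong seq #: ignored
sender.rdt_rcv(0)       # correct ACK: stop timer, move on
sender.rdt_send("b")
```

Ignoring a bad or unexpected ACK, rather than retransmitting at once, is what makes timeouts the only trigger for retransmission.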

rdt3.0 in action

[Figures: sender/receiver timing diagrams]

Performance of rdt3.0

• rdt3.0 works, but performance stinks
• example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

$$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \text{ kb/pkt}}{10^9 \text{ b/sec}} = 8 \text{ microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$$

• U_sender: utilization, the fraction of time sender is busy sending
• 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
• network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline]
• first packet bit transmitted, t = 0
• last packet bit transmitted, t = L/R
• first packet bit arrives
• last packet bit arrives, send ACK
• ACK arrives, send next packet, t = RTT + L/R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
• go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight]
• first packet bit transmitted, t = 0
• last bit transmitted, t = L/R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L/R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
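The utilization arithmetic for stop-and-wait and for a pipelined sender can be checked with a few lines of Python. The function name and its `window` parameter are made up for this sketch. The numbers are the slide's example: 8000-bit packets, a 1 Gbps link and a 30 ms RTT.

```python
def sender_utilization(packet_bits: float, rate_bps: float, rtt_s: float,
                       window: int = 1) -> float:
    """Fraction of time the sender is busy when `window` packets are sent
    back-to-back per round trip (window=1 is stop-and-wait)."""
    transmit_s = packet_bits / rate_bps             # L / R
    return min(1.0, window * transmit_s / (rtt_s + transmit_s))


L_BITS = 8_000          # 1 KB packet, as in the example
R_BPS = 1e9             # 1 Gbps link
RTT_S = 0.030           # 2 x 15 ms propagation delay

u_stop_and_wait = sender_utilization(L_BITS, R_BPS, RTT_S)         # about 0.00027
u_three_in_flight = sender_utilization(L_BITS, R_BPS, RTT_S, 3)    # about 0.0008
```

By the same formula, keeping the link fully busy in this example would take on the order of 3,750 packets in flight, which is why window size matters so much on long, fast paths.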

Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to, and including, seq # n: "cumulative ACK"
  • may receive duplicate ACKs (see receiver)
• Only one timer: for oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• Called a sliding-window protocol

GBN Sender

• rdt_send() called: checks to see if window is full
  • No: send out packet
  • Yes: return data to application level
• Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer
• Timeout: resends ALL packets that have been sent but not yet acknowledged
  • This is the only event that triggers a resend

GBN: sender extended FSM

single state: Wait

initially:
  base = 1
  nextseqnum = 1

rdt_send(data):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  ...
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  Λ
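The extended FSM translates almost line for line into a small Python class. This is a sketch, not a full implementation: events arrive as method calls, `sent` stands in for udt_send(), and two guards are additions of this example and not part of the FSM (base never moves backwards on an old ACK, and a timeout with nothing outstanding does nothing).

```python
class GbnSender:
    def __init__(self, window_size: int):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}              # seq # -> packet, kept until ACKed
        self.timer_running = False
        self.sent = []                # record of every udt_send()

    def rdt_send(self, data) -> bool:
        if self.nextseqnum >= self.base + self.N:
            return False              # refuse_data: window is full
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.sent.append(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True     # start_timer for oldest unACKed pkt
        self.nextseqnum += 1
        return True

    def timeout(self):
        if self.base == self.nextseqnum:
            return                    # added guard: nothing outstanding
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(self.sndpkt[seq])    # go back N: resend all

    def rdt_rcv(self, acknum: int, corrupt: bool = False):
        if corrupt:
            return                    # Λ
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)            # cumulative ACK
        self.base = max(self.base, acknum + 1)    # added guard against old ACKs
        self.timer_running = self.base != self.nextseqnum


sender = GbnSender(window_size=3)
accepted = [sender.rdt_send(d) for d in "abcd"]   # 4th call finds window full
sender.rdt_rcv(1)                                  # ACK(1) slides the window
sender.rdt_send("d")
sender.timeout()                                   # resends packets 2, 3, 4
```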

GBN: receiver extended FSM

single state: Wait

initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

• If expected packet received: send ACK and deliver packet upstairs
• If out-of-order packet received: discard (don't buffer) -> no receiver buffering!
  • Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

• The receiver always sends ACK for last correctly received packet with highest in-order seq #
• Receiver only sends ACKs (no NAKs)
• Can generate duplicate ACKs
• need only remember expectedseqnum
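The receiver side is small enough to sketch in a dozen lines of Python. The class name and the `corrupt` flag are assumptions of this example. The return value is the ACK number sent back.

```python
class GbnReceiver:
    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0             # mirrors the initial make_pkt(0, ACK, chksum)
        self.delivered = []

    def rdt_rcv(self, seq: int, data, corrupt: bool = False) -> int:
        """Handle one arriving packet and return the ACK number to send."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)       # in order: deliver up
            self.last_ack = seq
            self.expectedseqnum += 1
        # Anything else is discarded and the last in-order packet re-ACKed.
        return self.last_ack


receiver = GbnReceiver()
acks = [
    receiver.rdt_rcv(1, "a"),
    receiver.rdt_rcv(3, "c"),     # packet 2 missing: discard, duplicate ACK(1)
    receiver.rdt_rcv(2, "b"),
    receiver.rdt_rcv(3, "c"),     # retransmitted packet 3 now in order
]
```

The only state carried between packets is expectedseqnum (plus the last ACK), which is what makes the GBN receiver so simple.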

GBN in action

• GBN is easy to code but might have performance problems
• In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data
• Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets

Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
  • Compare to GBN, which only had a timer for the base packet
• sender window
  • N consecutive seq #'s
  • again limits seq #s of sent, unACKed pkts
  • Important: Window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender
• data from above:
  • if next available seq # in window, send pkt
• timeout(n):
  • resend pkt n, restart timer
• ACK(n) in [sendbase, sendbase+N-1]:
  • mark pkt n as received
  • if n is smallest unACKed pkt, advance window base to next unACKed seq #

receiver
• pkt n in [rcvbase, rcvbase+N-1]:
  • send ACK(n)
  • out-of-order: buffer
  • in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
• pkt n in [rcvbase-N, rcvbase-1]:
  • ACK(n) (note: this is a reACK)
• otherwise:
  • ignore
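The receiver's three cases can be sketched in Python as follows. To keep the example short it uses unbounded sequence numbers starting at 0 (no wraparound), and the class and method names are invented for the illustration.

```python
class SrReceiver:
    def __init__(self, window_size: int):
        self.N = window_size
        self.rcvbase = 0              # assumption: numbering starts at 0
        self.buffer = {}              # out-of-order packets, keyed by seq #
        self.delivered = []

    def on_packet(self, n: int, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer.setdefault(n, data)
            # Deliver every in-order packet now available, sliding the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n                  # already delivered: re-ACK
        return None                   # outside both ranges: ignore


receiver = SrReceiver(window_size=4)
acks = [
    receiver.on_packet(0, "a"),
    receiver.on_packet(2, "c"),   # out of order: buffered, ACKed individually
    receiver.on_packet(1, "b"),   # fills the gap: "b" and "c" delivered
    receiver.on_packet(0, "a"),   # old packet (its ACK was lost): re-ACK
    receiver.on_packet(9, "x"),   # outside the window: ignored
]
```

The re-ACK case matters because, without it, a sender whose ACK was lost would keep retransmitting and its window would never advance.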


                                Selective repeat in action

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3
• receiver sees no difference in two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
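One way to explore the question is to check, for a given sequence-number space and window size, whether the sequence numbers a sender might retransmit can collide with the receiver's next window. The function below is a hypothetical helper written for this purpose. It considers the worst case in which the receiver got a full window but every ACK was lost.

```python
def sr_window_is_ambiguous(seq_space: int, window: int) -> bool:
    """True if a retransmitted old packet can carry the same seq # as a
    new packet the receiver is now willing to accept."""
    # Seq #s of the first window, which the sender may retransmit.
    old_window = set(range(window))
    # Receiver's window after it has accepted that whole first window.
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


slide_example = sr_window_is_ambiguous(seq_space=4, window=3)   # True
smaller_window = sr_window_is_ambiguous(seq_space=4, window=2)  # False
```

In this model the ambiguity disappears exactly when the window is at most half the sequence-number space, the usual answer given for selective repeat.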

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point:
  • one sender, one receiver
• reliable, in-order byte stream:
  • no "message boundaries"
• pipelined:
  • TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented:
  • handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled:
  • sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
    Depends upon implementation (can often be set)
    The max amount of application-layer data in a segment
    Application Data + TCP Header = TCP Segment

Three-way Handshake
    Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
    Server responds with second special TCP segment (again no payload)
    Client responds with third special segment. This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
    (i) buffers,
    (ii) variables, and
    (iii) a socket connection to the process

TCP only exists in the two end machines
    No buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

Header layout (each row is 32 bits wide):

    source port #                 | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum                      | Urg data pointer
    Options (variable length)
    application data (variable length)

Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)

TCP seq. #'s and ACKs

Seq. #'s:
    byte stream "number" of first byte in segment's data
ACKs:
    seq # of next byte expected from other side
    cumulative ACK
Q: how does the receiver handle out-of-order segments?
    A: TCP spec doesn't say; up to implementer

Simple telnet scenario (in time order):
    1. User types 'C'. Host A -> Host B: Seq=42, ACK=79, data = 'C'
    2. Host B ACKs receipt of 'C', echoes back 'C'. Host B -> Host A: Seq=79, ACK=43, data = 'C'
    3. Host A ACKs receipt of echoed 'C'. Host A -> Host B: Seq=43, ACK=80

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
    SampleRTT will vary; want estimated RTT "smoother"
        average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
    longer than RTT
        but RTT varies
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
    influence of past sample decreases exponentially fast
    typical value: α = 0.125
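A minimal Python sketch of this update follows. The function names are chosen here for illustration; `sample_weight` simply expands the recurrence to show how quickly an old sample's influence decays.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """One EWMA step: blend the newest SampleRTT into EstimatedRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def sample_weight(age, alpha=0.125):
    """Weight of a sample taken `age` updates ago (0 = newest sample)."""
    return alpha * (1 - alpha) ** age


est = 100.0                      # ms, an arbitrary starting estimate
for sample in (200.0, 200.0, 200.0):
    est = update_estimated_rtt(est, sample)
    print(round(est, 2))         # estimate drifts toward the new samples
```

With α = 0.125 the newest sample carries weight 0.125 and each older sample counts 0.875 times as much as the one after it.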

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; SampleRTT and EstimatedRTT, RTT (milliseconds, 100 to 350) versus time (seconds)]

TCP Round Trip Time and Timeout

Setting the timeout
    EstimatedRTT plus "safety margin"
        large variation in EstimatedRTT -> larger safety margin
    first, estimate how much SampleRTT deviates from EstimatedRTT:

        DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

        (typically, β = 0.25)

Then set the timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
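The two formulas can be combined into one update step, sketched below in Python. The slide does not say whether DevRTT uses EstimatedRTT before or after the new sample is folded in; this sketch assumes the value before the update, and the function name `update_timeout` is invented for the example.

```python
def update_timeout(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Return (EstimatedRTT, DevRTT, TimeoutInterval) after one new sample."""
    # Assumption: the deviation is measured against the old EstimatedRTT.
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout_interval


# One sample of 140 ms against an estimate of 100 ms and deviation of 10 ms.
print(update_timeout(100.0, 10.0, 140.0))
```

A single outlying sample moves EstimatedRTT only slightly but widens the safety margin noticeably, which is the point of the 4 * DevRTT term.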


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
    Pipelined segments
    Cumulative acks
    TCP uses single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate acks

Initially consider simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control

TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

Ack rcvd:
    If it acknowledges previously unacked segments
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
    SendBase-1: last cumulatively ack'ed byte
Example:
    SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
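The simplified sender can be turned into a small runnable model. The Python sketch below is an illustration only: the class name, the `unacked` dictionary, the `sent` log, and the boolean stand-in for the timer are all assumptions made here, and no real network I/O takes place.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified TCP sender's three events."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.timer_running = False
        self.unacked = {}   # seq # -> data of not-yet-acknowledged segments
        self.sent = []      # log of (seq #, data) handed to IP

    def on_data(self, data):
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True          # start timer
        self.sent.append((seq, data))          # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout."""
        if not self.unacked:
            return
        seq = min(self.unacked)                # smallest unacked seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True              # restart timer

    def on_ack(self, y):
        """Event: ACK received with ACK field value y."""
        if y > self.send_base:
            self.send_base = y
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]          # cumulatively acknowledged
            self.timer_running = bool(self.unacked)
```

Feeding it 8 bytes at sequence number 92, a timeout, and then ACK=100 gives one retransmission of segment 92 and leaves SendBase at 100.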

TCP: retransmission scenarios

Lost ACK scenario (Host A sender, Host B receiver):
    Host A sends Seq=92, 8 bytes data
    Host B replies ACK=100; the ACK is lost (X)
    Host A's timer times out; Host A resends Seq=92, 8 bytes data
    Host B replies ACK=100
    SendBase = 100

Premature timeout (Host A sender, Host B receiver):
    Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    Host B replies ACK=100, then ACK=120
    Seq=92 timeout expires before the ACKs arrive; Host A resends Seq=92, 8 bytes data
    ACKs arrive: SendBase = 100, then SendBase = 120
    Host B replies ACK=120 to the retransmission; SendBase = 120

TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A sender, Host B receiver):
    Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    Host B replies ACK=100; the ACK is lost (X)
    Host B replies ACK=120, which arrives before the timeout
    SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver -> TCP Receiver action

1. Event: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
   Action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

2. Event: Arrival of in-order segment with expected seq #. One other segment has ACK pending
   Action: Immediately send single cumulative ACK, ACKing both in-order segments

3. Event: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
   Action: Immediately send duplicate ACK, indicating seq # of next expected byte

4. Event: Arrival of segment that partially or completely fills gap
   Action: Immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
    Used by most TCP implementations
    If a timeout occurs, then after the retransmission the Timeout Interval is doubled
    Intervals grow exponentially with each consecutive timeout
    When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    Limited form of Congestion Control
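The doubling policy is easy to see in a toy timer object. This Python sketch is an assumption-laden illustration: the class and method names are invented, and it tracks only the interval value, not real time.

```python
class RetransmissionTimer:
    """Tracks the timeout interval under the doubling policy."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.base_interval()

    def base_interval(self):
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        """After a timeout and retransmission, double the interval."""
        self.interval *= 2

    def on_restart(self):
        """Timer restarted by new data from above or by a received ACK."""
        self.interval = self.base_interval()


timer = RetransmissionTimer(estimated_rtt=100, dev_rtt=25)
timer.on_timeout()
timer.on_timeout()
print(timer.interval)   # interval after two consecutive timeouts
timer.on_restart()
print(timer.interval)   # back to EstimatedRTT + 4 * DevRTT
```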

Fast Retransmit

Time-out period often relatively long:
    long delay before resending lost packet

Detect lost segments via duplicate ACKs
    Sender often sends many segments back-to-back
    If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
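The ACK handler with a duplicate-ACK counter can be sketched in Python as follows. The class name and the `retransmitted` log are assumptions for illustration; resetting the counter when SendBase advances is also an assumption, since the pseudocode only speaks of a count "for y".

```python
class FastRetransmitSender:
    """ACK handling with a duplicate-ACK counter (no timer modelled)."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0
        self.retransmitted = []   # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y    # new data acknowledged
            self.dup_acks = 0     # assumption: start counting afresh
        else:
            self.dup_acks += 1    # duplicate ACK for already ACKed data
            if self.dup_acks == 3:
                self.retransmitted.append(y)   # resend segment with seq # y


sender = FastRetransmitSender(send_base=92)
for ack in (100, 100, 100, 100):   # one new ACK, then three duplicates
    sender.on_ack(ack)
print(sender.retransmitted)
```

Only the third duplicate triggers a resend; further duplicates for the same y do not resend again in this sketch.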

TCP: GBN or Selective Repeat?

                                Basic TCP looks a lot like GBN

                                Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                This looks a lot like Selective Repeat

                                TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control:
    sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
    app process may be slow at reading from buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
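As a worked example of the formula and the sender-side limit, here is a small Python sketch. The function names and the byte counts are made up for illustration.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    return (last_byte_sent + nbytes) - last_byte_acked <= window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                      # 4096 - 2000 bytes buffered
print(sender_may_send(5000, 4000, window, 1000))   # 2000 unACKed bytes fit
print(sender_may_send(5000, 4000, window, 1200))   # 2200 would exceed window
```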

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender

Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
    initialize TCP variables:
        seq. #s
        buffers, flow control info (e.g. RcvWindow)
    client: connection initiator
        Socket clientSocket = new Socket("hostname", "port number");
    server: contacted by client
        Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
    Allocates buffers
    Can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

Segments exchanged:
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
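The field values in the three segments follow mechanically from the two initial sequence numbers. The Python sketch below just builds the three headers as dictionaries; the function name and the dictionary layout are assumptions for illustration, and nothing is sent on a network.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as dictionaries."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


# Arbitrary example values for the two initial sequence numbers.
for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```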

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN

[Figure: client closes and sends FIN; server replies ACK, closes and sends FIN; client replies ACK and enters timed wait; closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
    Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
    Closes down after timed wait

Step 4: server receives ACK. Connection closed

Note: with small modification, can handle simultaneous FINs

[Figure: client (closing, timed wait, closed) and server (closing, closed) exchanging FIN, ACK, FIN, ACK]

TCP Connection Management (cont)

[Figure: Example TCP server lifecycle]

[Figure: Example TCP client lifecycle]


                                A few special cases

                                Have not discussed what happens if both client and server decide to close down connection at same time

                                It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for network to handle"
    different from flow control!
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load (original data plus retransmissions).

"costs" of congestion:
    (b) and (c): more work (retrans) for given "goodput"
    (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
    when a packet is dropped, any upstream transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
    "elastic service"
    if sender's path "underloaded":
        sender should use available bandwidth
    if sender's path congested:
        sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see the EFCI bit

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    Signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: Congwin]

w segments, each with MSS bytes, sent in one RTT:

    throughput = (w * MSS) / RTT  Bytes/sec
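As a quick numeric check of the throughput formula, here is a one-function Python sketch. The function name and the example values (10 segments of 500 bytes, RTT of 200 ms) are assumptions chosen for illustration.

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_sec):
    """Throughput when w segments of MSS bytes are sent per RTT."""
    return w * mss_bytes / rtt_sec


# 10 segments of 500 bytes each, every 0.2 s: about 25,000 bytes/sec.
print(throughput_bytes_per_sec(w=10, mss_bytes=500, rtt_sec=0.2))
```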

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

    LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate acks
    TCP sender reduces rate (CongWin) after loss event

three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 and then grows exponentially
    conservative after timeout events

TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (axis marks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
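The AIMD rule can be sketched in a few lines. This is a toy model, not TCP itself: it assumes the window is counted in whole MSS units and that a loss event occurs exactly when the window reaches a fixed size (`loss_at_mss`, a hypothetical parameter).

```python
def aimd_trace(start_mss: int, loss_at_mss: int, rtts: int) -> list:
    """CongWin (in MSS) observed at each RTT under additive increase, multiplicative decrease."""
    congwin = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(congwin)
        if congwin >= loss_at_mss:
            # Assumed loss event: cut the window in half, never below 1 MSS.
            congwin = max(1, congwin // 2)
        else:
            # No loss this RTT: grow by 1 MSS.
            congwin += 1
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rtts=12))
# [4, 5, 6, 7, 8, 4, 5, 6, 7, 8, 4, 5]
```

The repeating climb and halving is the sawtooth seen for a long-lived connection.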

TCP Slow Start

When connection begins, CongWin = 1 MSS
Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments, one RTT apart]
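Why does incrementing CongWin on every ACK double it every RTT? A minimal sketch, assuming no loss and that every segment sent in one RTT is acknowledged before the next RTT begins:

```python
def slow_start_windows(rtts: int) -> list:
    """CongWin (in MSS) at the start of each RTT during slow start, with no loss."""
    congwin = 1  # connection begins with CongWin = 1 MSS
    windows = []
    for _ in range(rtts):
        windows.append(congwin)
        acks = congwin      # one ACK returns per segment sent in this RTT
        congwin += acks     # +1 MSS per ACK, so the window doubles
    return windows


print(slow_start_windows(4))  # [1, 2, 4, 8]
```

The first three values match the one, two, and four segments shown in the Host A / Host B timeline.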

So far:
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno):
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
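The event table can be read as a small state machine. The sketch below is one possible rendering, not a real TCP implementation. Assumptions made here: the window is measured in MSS units (`MSS = 1.0`), the initial threshold of 64 is an arbitrary stand-in for "very large", and the duplicate-ACK counter is reset whenever new data is acknowledged.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: windows are counted in units of one MSS


@dataclass
class RenoSender:
    congwin: float = MSS
    threshold: float = 64.0   # arbitrary "very large" initial value
    state: str = "SS"         # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += MSS * (MSS / self.congwin)

    def on_dup_ack(self) -> None:
        """Duplicate ACK: only count it; the third one is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_dup_ack()

    def on_triple_dup_ack(self) -> None:
        """Fast recovery: halve the window, stay out of slow start."""
        self.threshold = self.congwin / 2
        self.congwin = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self) -> None:
        """Timeout: fall back to slow start from 1 MSS."""
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.congwin)  # CA 5.0
```

A Tahoe-style sender would, per the summary slide, call `on_timeout()` for the triple-duplicate-ACK case as well.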

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
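The 0.75 factor follows from averaging the sawtooth. Assuming, as the slide does, that slow start is ignored and that the window climbs linearly from W/2 back up to W between consecutive loss events, the mean window is the midpoint of the two endpoints:

```latex
\[
\overline{\text{window}} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W
\qquad\Longrightarrow\qquad
\text{average throughput} = \frac{3}{4}\cdot\frac{W}{RTT}
\]
```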

TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

L = 2·10^-10 Wow
New versions of TCP for high-speed needed
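The two numbers in this example can be checked directly. The sketch below assumes the loss-rate relation throughput = 1.22 * MSS / (RTT * sqrt(L)) and solves it for L; the function names are made up for this illustration.

```python
def required_window(throughput_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """In-flight segments needed to sustain the target rate: rate * RTT / segment size."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(throughput_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


W = required_window(10e9, 0.100, 1500)
L = required_loss_rate(10e9, 0.100, 1500)
print(f"W = {W:.0f} segments, L = {L:.1e}")
```

A loss rate near 2·10^-10 means roughly one lost segment in every five billion, which is the point of the slide's "Wow".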

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
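The convergence argument can be tried numerically. This is a simplified model resting on assumptions not stated in the slide: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the link capacity, and rates are in arbitrary units.

```python
def two_flow_aimd(x1: float, x2: float, capacity: float, rounds: int):
    """Run two AIMD flows over one bottleneck and return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss seen by both flows: each halves its rate.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: each flow adds one unit (the slope-1 line in the figure).
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = two_flow_aimd(x1=1.0, x2=40.0, capacity=50.0, rounds=500)
print(f"gap between flows after 500 rounds: {abs(a - b):.2e}")
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in two, so the pair drifts toward the equal-share line.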

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
do not want rate throttled by congestion control
Instead use UDP: pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets R/2

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. W*S/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2 RTT + O/R

2. W*S/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2 RTT + O/R + (K-1)[S/R + RTT - W*S/R]

Fixed congestion window (1)

First case: W*S/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2 RTT + O/R

Fixed congestion window (2)

Second case: W*S/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2 RTT + O/R + (K-1)[S/R + RTT - W*S/R]
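Both fixed-window cases fit in one function. The sketch assumes K is the number of windows covering the object, computed here as ceil(O / (W*S)); the slides do not spell out that formula at this point, so treat it as an assumption. The sample values are illustrative.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency of one object with a fixed congestion window of W segments.

    O = object size (bits), S = MSS (bits), R = link rate (bits/sec).
    """
    K = math.ceil(O / (W * S))           # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R      # idle time after each window, if positive
    latency = 2 * RTT + O / R
    if stall > 0:
        # Second case: the server waits for an ACK after each of the first K-1 windows.
        latency += (K - 1) * stall
    return latency


print(fixed_window_latency(O=8000, S=1000, R=10000, RTT=1.0, W=4))   # stalls once
print(fixed_window_latency(O=8000, S=1000, R=10000, RTT=1.0, W=20))  # never stalls
```

With W = 4 the window drains in 0.4 s, well before the first ACK returns at 1.1 s, so the server idles; with W = 20 it never runs out of window.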

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q,\ K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (3)

\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\begin{align*}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{align*}

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\begin{align*}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2(O/S + 1) \right\rceil
\end{align*}

Calculation of Q, number of idles for infinite-size object, is similar.
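Putting the pieces together, the sketch below computes K from the closed form K = ceil(log2(O/S + 1)), obtains Q by counting the windows after which the idle time S/R + RTT - 2^(k-1) S/R is still positive, and then evaluates Latency = 2 RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R with P = min{Q, K-1}. The counting loop for Q is one reading of "number of idles for an infinite-size object", not a formula given on the slides, and the sample values are chosen so that O/S = 15 as in the example.

```python
import math


def slow_start_latency(O: float, S: float, R: float, RTT: float):
    """Return (K, Q, P, latency) for one object under slow start with no loss."""
    K = math.ceil(math.log2(O / S + 1))   # windows that cover the object

    # Q: windows after which the server would still idle for an infinite object.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:   # idle time after window k = Q + 1
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# O/S = 15 segments; S/R = 1 s and RTT = 2 s are illustrative choices.
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0))  # (4, 2, 2, 22.0)
```

These inputs reproduce K = 4, Q = 2, and P = 2 from the example slide.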

HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
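The three response-time expressions can be compared side by side. This sketch deliberately leaves out the "sum of idle times" term, so it gives only the transmission and round-trip parts of each formula. It also assumes 1 Kbyte = 1000 bytes and a 1 Mbps link for the sample call.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int) -> dict:
    """Response time in seconds for each HTTP mode, excluding slow-start idle times."""
    transmit = (M + 1) * O / R   # time to push the base page plus M images
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }


# O = 5 Kbytes (taken as 40,000 bits), RTT = 100 msec, M = 10, X = 5, R = 1 Mbps.
for mode, seconds in http_response_times(O=40_000, R=1e6, RTT=0.1, M=10, X=5).items():
    print(f"{mode}: {seconds:.2f} s")
```

Only the RTT term differs between the three modes, which is why the gap between them grows with RTT and shrinks in importance on slow links.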

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.

Chapter 3 Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"


UDP: more

often used for streaming multimedia apps:
loss tolerant
rate sensitive

other UDP uses (why?):
DNS: small delay
SNMP: stressful cond

reliable transfer over UDP: add reliability at application layer
application-specific error recovery

UDP segment format (each row is 32 bits wide):
source port | dest port
length | checksum
Application data (message)

Length: in bytes, of UDP segment, including header
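The segment format maps directly onto four 16-bit header fields followed by the payload. A minimal sketch using Python's standard `struct` module; the helper name `build_udp_segment` and the port numbers are made up for this example, and the checksum is simply left at 0 here.

```python
import struct


def build_udp_segment(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Pack source port, dest port, length, and checksum (16 bits each), then the data."""
    length = 8 + len(payload)   # length counts the 8-byte header plus the data
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


segment = build_udp_segment(5000, 53, b"hello")
src, dst, length, checksum = struct.unpack("!HHHH", segment[:8])
print(src, dst, length, checksum)  # 5000 53 13 0
```

The `!` in the format string selects network (big-endian) byte order, and each `H` is one unsigned 16-bit field.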

UDP checksum

Goal: detect "errors" (e.g., flipped bits) in transmitted segment

Sender:
treat segment contents as sequence of 16-bit integers
checksum: addition (1's complement sum) of segment contents
sender puts checksum value into UDP checksum field

Receiver:
compute checksum of received segment
check if computed checksum equals checksum field value:
NO - error detected
YES - no error detected. But maybe errors nonetheless? More later
Receiver may choose to discard segment or send a warning to app in case of error
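The sender and receiver steps can be sketched as follows. This is a simplified illustration: it checksums only the bytes it is given, pads odd-length input with a zero byte, and ignores the pseudo-header that real UDP also covers. The function names are chosen for this example.

```python
def internet_checksum(data: bytes) -> int:
    """1's complement of the 1's complement sum of the data, taken 16 bits at a time."""
    if len(data) % 2:
        data += b"\x00"                              # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]        # next 16-bit integer
        total = (total & 0xFFFF) + (total >> 16)     # wrap any carry back around
    return ~total & 0xFFFF


def receiver_accepts(data: bytes, checksum_field: int) -> bool:
    """Receiver side: recompute the checksum and compare it with the field value."""
    return internet_checksum(data) == checksum_field


payload = bytes.fromhex("0001f203f4f5f6f7")
field = internet_checksum(payload)
print(hex(field))                                          # 0x220d
print(receiver_accepts(payload, field))                    # True
print(receiver_accepts(b"\x01" + payload[1:], field))      # False (a flipped bit)
```

A matching checksum only means no error was detected; two errors that cancel in the sum would still pass.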


Principles of Reliable data transfer

important in app., transport, link layers
top-10 list of important networking topics

characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started

send side:
rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

receive side:
rdt_rcv(): called when packet arrives on rcv-side of channel
deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started

We'll:
incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
consider only unidirectional data transfer
but control info will flow in both directions
use finite state machines (FSM) to specify sender, receiver

[Figure: FSM notation; the arrow from state 1 to state 2 is labeled with the event causing the state transition, written above the actions taken on the state transition]

state: when in this "state", next state is uniquely determined by next event

Incremental Improvements

rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission

rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK

rdt2.1: deals with corrupted ACKs/NAKs

rdt2.2: like rdt2.1 but does not need NAKs

rdt3.0: allows packets to be lost

Rdt1.0: reliable transfer over a reliable channel

underlying channel perfectly reliable:
no bit errors
no loss of packets

separate FSMs for sender, receiver:
sender sends data into underlying channel
receiver reads data from underlying channel

sender (single state "Wait for call from above"):
event: rdt_send(data)
actions: packet = make_pkt(data); udt_send(packet)

receiver (single state "Wait for call from below"):
event: rdt_rcv(packet)
actions: extract(packet, data); deliver_data(data)
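The two one-state FSMs translate almost line for line into code. In this sketch the perfectly reliable channel is modeled as an in-memory list; `ReliableChannel` and the dictionary packet layout are assumptions made for illustration, while the function names follow the slide.

```python
def make_pkt(data):
    """Wrap application data in a packet (a plain dict in this sketch)."""
    return {"data": data}


def extract(packet):
    """Pull the application data back out of a packet."""
    return packet["data"]


class ReliableChannel:
    """Stand-in for a channel with no bit errors and no loss."""

    def __init__(self):
        self.queue = []

    def udt_send(self, packet):
        self.queue.append(packet)


def rdt_send(channel, data):
    """Sender, 'Wait for call from above': make a packet and send it."""
    packet = make_pkt(data)
    channel.udt_send(packet)


def rdt_rcv(packet, delivered):
    """Receiver, 'Wait for call from below': extract the data and deliver it."""
    data = extract(packet)
    delivered.append(data)   # plays the role of deliver_data(data)


channel, delivered = ReliableChannel(), []
rdt_send(channel, b"first")
rdt_send(channel, b"second")
for pkt in channel.queue:
    rdt_rcv(pkt, delivered)
print(delivered)  # [b'first', b'second']
```

Because the channel never damages or drops anything, neither side needs feedback, sequence numbers, or timers.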

rdt2.0: channel with bit errors

underlying channel may flip bits in packet
  recall: UDP checksum to detect bit errors
the question: how to recover from errors?
  acknowledgements (ACKs): receiver explicitly tells sender that pkt was received OK
  negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  sender retransmits pkt on receipt of NAK
  human scenarios using ACKs, NAKs?
new mechanisms in rdt2.0 (beyond rdt1.0):
  error detection
  receiver feedback: control msgs (ACK, NAK) rcvr -> sender
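The error-detection step can be illustrated with a 16-bit ones' complement checksum of the kind UDP uses. This is a minimal sketch: the function names `internet_checksum` and `notcorrupt` are chosen here for illustration (the latter mirrors the FSM predicate), and the checksum covers only the payload, not a real UDP pseudo-header.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # wrap any carry back around
    return ~total & 0xFFFF


def notcorrupt(data: bytes, checksum: int) -> bool:
    """Receiver-side check: recompute the checksum and compare."""
    return internet_checksum(data) == checksum


payload = b"hello world"
chk = internet_checksum(payload)
flipped = bytes([payload[0] ^ 0x01]) + payload[1:]   # flip one bit in transit
print(notcorrupt(payload, chk), notcorrupt(flipped, chk))
```

A receiver in the rdt2.0 style would answer ACK when `notcorrupt` is true and NAK otherwise.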

rdt2.0: FSM specification

sender
  state "Wait for call from above"
    rdt_send(data)
      -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK"
  state "Wait for ACK or NAK"
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      -> udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      -> Λ (no action); go to "Wait for call from above"

receiver
  state "Wait for call from below"
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      -> udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      -> extract(rcvpkt, data); deliver_data(data); udt_send(ACK)

rdt2.0: operation with no errors

[Figure: the rdt2.0 FSM repeated, tracing the error-free path: the sender sends sndpkt, the receiver finds it notcorrupt, delivers the data and sends ACK, and the sender returns to "Wait for call from above".]

rdt2.0: error scenario

[Figure: the rdt2.0 FSM repeated, tracing the error path: the receiver finds the packet corrupt and sends NAK, and the sender retransmits sndpkt while staying in "Wait for ACK or NAK".]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK is corrupted?
  sender doesn't know what happened at receiver!
  can't just retransmit: possible duplicate. But receiver is waiting!
What to do?
  sender ACKs/NAKs receiver's ACK/NAK? What if sender's ACK/NAK is corrupted?
  retransmit, but this might cause retransmission of a correctly received pkt; receiver won't know about the duplication!
Handling duplicates:
  sender adds sequence number (0, 1) to each pkt
  sender retransmits current pkt if ACK/NAK garbled
  receiver discards (doesn't deliver up) duplicate pkt
  duplicate packet is one with the same sequence # as the previous packet
stop and wait
  sender sends one packet, then waits for receiver response

Sender: whenever the sender receives a control message, it sends a packet to the receiver
  a valid ACK: sends next packet (if one exists) with new sequence #
  a NAK or corrupt response: resends old packet
Receiver: sends ACK/NAK to sender
  if received packet is corrupt: send NAK
  if received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
  if received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmission (duplicate): send ACK
Note: ACK/NAK do not contain a sequence #
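The receiver rules above can be sketched as a small state machine. This is an illustrative sketch, not the slides' code: the class name `Rdt21Receiver`, the `corrupt` flag passed in by the caller, and the string replies are all assumptions made for the example.

```python
class Rdt21Receiver:
    """Alternating-bit receiver; its ACK/NAK replies carry no sequence #."""

    def __init__(self):
        self.expected_seq = 0     # sequence bit of the next new packet
        self.delivered = []       # data passed up to the application

    def on_packet(self, seq, data, corrupt=False):
        if corrupt:
            return "NAK"                      # ask the sender to retransmit
        if seq == self.expected_seq:
            self.delivered.append(data)       # new data: deliver it up
            self.expected_seq ^= 1            # now wait for the other bit
        # A valid duplicate is ACKed again but not delivered a second time.
        return "ACK"


rcv = Rdt21Receiver()
replies = [
    rcv.on_packet(0, "a"),                 # new packet
    rcv.on_packet(0, "a"),                 # duplicate (its ACK was garbled)
    rcv.on_packet(1, "b", corrupt=True),   # damaged in transit
    rcv.on_packet(1, "b"),                 # retransmission arrives intact
]
print(replies, rcv.delivered)
```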

rdt2.1: sender, handles garbled ACK/NAKs

state "Wait for call 0 from above"
  rdt_send(data)
    -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 0"
state "Wait for ACK or NAK 0"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    -> udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    -> Λ; go to "Wait for call 1 from above"
state "Wait for call 1 from above"
  rdt_send(data)
    -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 1"
state "Wait for ACK or NAK 1"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    -> udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    -> Λ; go to "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

state "Wait for 0 from below"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 1 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); stay
state "Wait for 1 from below"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 0 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); stay

rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq #'s (0, 1) will suffice. Why?
  must check if received ACK/NAK is corrupted
  twice as many states
    state must "remember" whether "current" pkt has 0 or 1 seq #
Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s are included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment
  state "Wait for call 0 from above"
    rdt_send(data)
      -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK 0"
  state "Wait for ACK 0"
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
      -> udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
      -> Λ; leave "Wait for ACK 0"

receiver FSM fragment
  transition entering "Wait for 0 from below"
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
  state "Wait for 0 from below"
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      -> udt_send(sndpkt); stay

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq #, ACKs, retransmissions will be of help, but not enough
Q: how to deal with loss?
  sender waits until it is certain that data or ACK was lost, then retransmits
  yuck: drawbacks?
Approach: sender waits a "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions are triggered only by timeouts)
  if pkt (or ACK) is just delayed (not lost):
    retransmission will be a duplicate, but use of seq #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer
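The timeout-and-retransmit approach can be traced with a small deterministic simulation. This is a sketch under stated assumptions: `run_rdt30` and its `drop_data` / `drop_ack` parameters are hypothetical names, losses are scripted by transmission index rather than random, a lost packet or ACK is treated as an immediate timeout, and bit corruption is not modeled.

```python
def run_rdt30(messages, drop_data=frozenset(), drop_ack=frozenset()):
    """Stop-and-wait with a 1-bit sequence number over a lossy channel.

    drop_data / drop_ack hold 0-based indices of data transmissions whose
    packet (or whose ACK) the channel loses.
    """
    delivered = []      # what the receiver passes up, in order
    expected = 0        # receiver: sequence bit it is waiting for
    seq = 0             # sender: sequence bit of the current packet
    attempt = 0         # total data transmissions so far
    for msg in messages:
        while True:
            this_attempt = attempt
            attempt += 1
            if this_attempt in drop_data:
                continue                 # data lost: timer expires, retransmit
            if seq == expected:          # new packet: deliver, flip expected bit
                delivered.append(msg)
                expected ^= 1
            # The receiver ACKs the bit it saw, also for duplicates.
            if this_attempt in drop_ack:
                continue                 # ACK lost: timer expires, retransmit
            break                        # ACK for the current packet arrived
        seq ^= 1
    return delivered, attempt


delivered, transmissions = run_rdt30(["a", "b", "c"], drop_data={1}, drop_ack={3})
print(delivered, transmissions)
```

In this run the second message needs two transmissions because its first copy is lost, and the third needs two because its ACK is lost; the sequence bit lets the receiver discard the resulting duplicate.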

rdt3.0 sender

state "Wait for call 0 from above"
  rdt_send(data)
    -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK0"
  rdt_rcv(rcvpkt)
    -> Λ; stay
state "Wait for ACK0"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    -> Λ; stay
  timeout
    -> udt_send(sndpkt); start_timer; stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    -> stop_timer; go to "Wait for call 1 from above"
state "Wait for call 1 from above"
  rdt_send(data)
    -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK1"
  rdt_rcv(rcvpkt)
    -> Λ; stay
state "Wait for ACK1"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    -> Λ; stay
  timeout
    -> udt_send(sndpkt); start_timer; stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    -> stop_timer; go to "Wait for call 0 from above"

rdt3.0 in action

[Figures: rdt3.0 sender/receiver timing diagrams, two slides.]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet

$$ T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec} $$

$$ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 $$

U_sender: utilization, the fraction of time the sender is busy sending
1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline]
  first packet bit transmitted, t = 0
  last packet bit transmitted, t = L/R
  first packet bit arrives
  last packet bit arrives, send ACK
  ACK arrives, send next packet, t = RTT + L/R

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait
Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data
Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols
Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight]
  first packet bit transmitted, t = 0
  last bit transmitted, t = L/R
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  ACK arrives, send next packet, t = RTT + L/R

$$ U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008 $$

Increases utilization by a factor of 3!
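The utilization arithmetic for stop-and-wait and for a pipelined sender can be checked with a few lines of code. This is a minimal sketch; the function name `sender_utilization` and its `window` parameter (number of packets sent back to back per round trip) are assumptions made for illustration.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy: window * (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps            # L / R, in seconds
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


# 1 KB packet (8000 bits), 1 Gbps link, RTT = 2 * 15 ms
u_stop_and_wait = sender_utilization(8000, 1e9, 0.030)
u_three_in_flight = sender_utilization(8000, 1e9, 0.030, window=3)
print(round(u_stop_and_wait, 5), round(u_three_in_flight, 4))
```

The `min(1.0, ...)` cap reflects that the sender cannot be busy more than all of the time once the window is large enough to fill the pipe.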

Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to and including seq # n, a "cumulative ACK"
    may receive duplicate ACKs (see receiver)
  only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  called a sliding-window protocol

GBN: sender

rdt_send() called: checks to see if window is full
  no: send out packet
  yes: return data to application level
Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer (or stops it if nothing remains unacknowledged)
Timeout: resends ALL packets that have been sent but not yet acknowledged
  this is the only event that triggers a resend

GBN: sender extended FSM

initially:
  base = 1
  nextseqnum = 1

state "Wait"
  rdt_send(data)
    -> if (nextseqnum < base+N) {
         sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
         udt_send(sndpkt[nextseqnum])
         if (base == nextseqnum)
           start_timer
         nextseqnum++
       }
       else
         refuse_data(data)
  timeout
    -> start_timer
       udt_send(sndpkt[base])
       udt_send(sndpkt[base+1])
       ...
       udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    -> base = getacknum(rcvpkt)+1
       if (base == nextseqnum)
         stop_timer
       else
         start_timer
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    -> Λ
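The sender FSM above translates fairly directly into code. The following is a sketch, not the slides' implementation: the class `GbnSender`, the `send` callback standing in for udt_send, the boolean `timer_running` standing in for a real timer, and the use of unbounded integers instead of k-bit sequence numbers are all assumptions made for illustration.

```python
class GbnSender:
    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send            # hypothetical stand-in for udt_send
        self.base = 1               # oldest unacknowledged seq #
        self.nextseqnum = 1         # next seq # to use
        self.sndpkt = {}            # copies of sent, unACKed packets
        self.timer_running = False  # stand-in for start_timer / stop_timer

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse_data
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK; max() is an added guard so a stale ACK
        # cannot move the window backwards.
        new_base = max(self.base, acknum + 1)
        for seq in range(self.base, new_base):
            self.sndpkt.pop(seq, None)
        self.base = new_base
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(self.sndpkt[seq])     # go back N: resend all unACKed


sent = []
snd = GbnSender(window_size=3, send=sent.append)
accepted = [snd.rdt_send(d) for d in "abcd"]   # the 4th call hits a full window
snd.on_ack(1)                                  # window slides by one
snd.rdt_send("d")
sent.clear()
snd.on_timeout()                               # resends packets 2, 3, 4
print(accepted, [seq for seq, _ in sent])
```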

GBN: receiver extended FSM

initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

state "Wait"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    -> extract(rcvpkt, data)
       deliver_data(data)
       sndpkt = make_pkt(expectedseqnum, ACK, chksum)
       udt_send(sndpkt)
       expectedseqnum++
  default
    -> udt_send(sndpkt)

If expected packet received:
  send ACK and deliver packet upstairs
If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends ACK for the last correctly received packet with the highest in-order seq #
  receiver only sends ACKs (no NAKs)
  can generate duplicate ACKs
  need only remember expectedseqnum
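The receiver behaviour can be sketched in a few lines. This is an illustrative sketch: the class name `GbnReceiver`, the `corrupt` flag supplied by the caller, and returning the ACK number instead of sending a packet are assumptions made for the example.

```python
class GbnReceiver:
    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0         # matches the initial make_pkt(0, ACK, chksum)
        self.delivered = []

    def on_packet(self, seq, data, corrupt=False):
        """Return the ACK number sent in response to this packet."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)       # in order: deliver upstairs
            self.last_ack = seq
            self.expectedseqnum += 1
        # Anything else is discarded (no buffering) and the last ACK repeated.
        return self.last_ack


rcv = GbnReceiver()
acks = [rcv.on_packet(1, "a"), rcv.on_packet(3, "c"),
        rcv.on_packet(2, "b"), rcv.on_packet(3, "c")]
print(acks, rcv.delivered)
```

Packet 3 arriving early is dropped and triggers a duplicate ACK for 1, which is the behaviour the sender's cumulative-ACK logic relies on.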

GBN in action

GBN is easy to code but might have performance problems.
In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data!
The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK was not received
  sender keeps a timer for each unACKed pkt
  compare to GBN, which only had a timer for the base packet
sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: sender and receiver windows over the sequence-number space]

Selective repeat

sender
  data from above:
    if next available seq # is in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N]:
    mark pkt n as received
    if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)
  otherwise:
    ignore
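The receiver side of selective repeat can be sketched as follows. This is an illustrative sketch under simplifying assumptions: the class `SrReceiver` is a hypothetical name, sequence numbers are unbounded integers starting at 0 (no wraparound), and the method returns the ACK number instead of sending a packet.

```python
class SrReceiver:
    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0          # smallest seq # not yet received
        self.buffer = {}          # out-of-order packets awaiting delivery
        self.delivered = []

    def on_packet(self, n, data):
        """Return the seq # ACKed, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver every consecutive packet starting at rcvbase.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n              # already delivered: re-ACK only
        return None               # outside both ranges: ignore


rcv = SrReceiver(window_size=4)
rcv.on_packet(0, "a")
rcv.on_packet(2, "c")             # buffered, packet 1 still missing
rcv.on_packet(3, "d")             # buffered
before = list(rcv.delivered)
rcv.on_packet(1, "b")             # fills the gap: b, c, d delivered together
print(before, rcv.delivered, rcv.rcvbase)
```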

Selective repeat in action

[Figure: selective repeat sender/receiver timing diagram]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3
  receiver sees no difference in the two scenarios!
  incorrectly passes duplicate data as new in (a)
Q: what is the relationship between seq # size and window size?
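One way to explore the question is to check when the receiver's new window can contain sequence numbers that the sender might still retransmit. The sketch below assumes the worst case in which the receiver has accepted a full window but every ACK was lost; the function name `sr_windows_overlap` is hypothetical. The standard textbook answer, which the brute-force check agrees with, is that the window size must be at most half the sequence-number space.

```python
def sr_windows_overlap(seq_space, window):
    """True if old (retransmittable) and new seq #s can be confused."""
    # Sender still holds packets 0..window-1 (all their ACKs were lost).
    old = {n % seq_space for n in range(0, window)}
    # Receiver has moved on and now accepts the next `window` numbers.
    new = {n % seq_space for n in range(window, 2 * window)}
    return bool(old & new)


print(sr_windows_overlap(4, 3))   # the slide's example: ambiguous
print(sr_windows_overlap(4, 2))   # window of 2 with 4 seq #'s: safe
```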


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver
point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP details

Maximum Segment Size (MSS)
  depends upon implementation (can often be set)
  the max amount of application-layer data in a segment
  application data + TCP header = TCP segment
Three-way handshake
  client sends special TCP segment to server requesting connection; no payload (application data) in this segment
  server responds with second special TCP segment (again no payload)
  client responds with third special segment
    this can contain payload

Even more TCP details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process
TCP only exists in the two end machines
  no buffers or variables are allocated to the connection in any of the network elements between the client and server

TCP segment structure

segment layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; it is up to the implementer

simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
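The numbers in the telnet scenario follow from one rule: the ACK field is the sequence number of the next byte expected, that is, the received segment's sequence number plus its data length. A minimal sketch, with the helper name `ack_for` and the dictionary representation of a segment being assumptions made for illustration:

```python
def ack_for(seq: int, data: bytes) -> int:
    """ACK value a host sends after receiving `data` starting at byte `seq`."""
    return seq + len(data)


# Host A starts at byte 42, Host B at byte 79.
seg1 = {"seq": 42, "ack": 79, "data": b"C"}                      # A -> B
seg2 = {"seq": seg1["ack"],                                      # B -> A
        "ack": ack_for(seg1["seq"], seg1["data"]),
        "data": b"C"}
seg3 = {"seq": seg2["ack"],                                      # A -> B
        "ack": ack_for(seg2["seq"], seg2["data"]),
        "data": b""}
print(seg2["seq"], seg2["ack"], seg3["seq"], seg3["ack"])
```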

TCP round trip time and timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT
Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

TCP round trip time and timeout

$$ \text{EstimatedRTT} = (1 - \alpha) \cdot \text{EstimatedRTT} + \alpha \cdot \text{SampleRTT} $$

exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; y-axis RTT (milliseconds), 100 to 350; x-axis time (seconds); series: SampleRTT and Estimated RTT]

TCP round trip time and timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first, estimate how much SampleRTT deviates from EstimatedRTT:

$$ \text{DevRTT} = (1 - \beta) \cdot \text{DevRTT} + \beta \cdot |\text{SampleRTT} - \text{EstimatedRTT}| $$

  (typically, β = 0.25)

Then set the timeout interval:

$$ \text{TimeoutInterval} = \text{EstimatedRTT} + 4 \cdot \text{DevRTT} $$
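The three formulas can be combined into a small estimator. This is a sketch under stated assumptions: the class name `RttEstimator` is hypothetical, the first sample seeds EstimatedRTT with DevRTT starting at 0 (the slides do not say how to initialize), and DevRTT is updated before EstimatedRTT so that the deviation is measured against the previous estimate (the ordering RFC 6298 uses).

```python
class RttEstimator:
    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with first sample
        self.dev_rtt = 0.0                  # assumption: no deviation known yet

    def update(self, sample_rtt):
        # Deviation first, against the estimate held before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)     # milliseconds
est.update(200.0)                          # one slow round trip
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval())
```

A single 200 ms sample moves the estimate only from 100 ms to 112.5 ms, but the deviation term widens the timeout considerably, which is the "safety margin" at work.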


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
pipelined segments
cumulative acks
TCP uses a single retransmission timer
retransmissions are triggered by:
  timeout events
  duplicate acks
initially consider a simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
ack rcvd:
  if it acknowledges previously unacked segments:
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
          start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

  event: ACK received, with ACK field value of y
      if (y > SendBase) {
          SendBase = y
          if (there are currently not-yet-acknowledged segments)
              start timer
      }
} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
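The three sender events can be modeled as a small Python class. This is a sketch only: the class name `SimplifiedTcpSender`, the `sent` log standing in for "pass segment to IP", and the boolean `timer_running` standing in for a real timer are all hypothetical simplifications, not part of any TCP stack.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified sender: no dup-ACK handling, no flow or congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq number -> data still awaiting an ACK
        self.timer_running = False  # stand-in for the single retransmission timer
        self.sent = []              # log of (seq, length) handed to IP

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, len(data)))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)  # oldest not-yet-acknowledged segment
            self.sent.append((seq, len(self.unacked[seq])))
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:       # ACK covers new data
            self.send_base = y
            covered = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in covered:
                del self.unacked[s]
            # Restart the timer only if something is still outstanding.
            self.timer_running = bool(self.unacked)


if __name__ == "__main__":
    sender = SimplifiedTcpSender(92)
    sender.on_app_data(b"x" * 8)
    sender.on_app_data(b"y" * 20)
    sender.on_ack(100)
    print(sender.send_base, sorted(sender.unacked))
```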

TCP retransmission scenarios

[Figure: two Host A / Host B timelines.]

Lost ACK scenario: A sends Seq=92, 8 bytes data; B's ACK=100 is lost (X); the timer for Seq=92 expires and A resends Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.

Premature timeout: A sends Seq=92, 8 bytes data and Seq=100, 20 bytes data; B replies ACK=100 and ACK=120; the Seq=92 timeout fires before the ACKs arrive, so A resends Seq=92, 8 bytes data; B replies ACK=120; SendBase = 100, then SendBase = 120.

TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline.]

Cumulative ACK scenario: A sends Seq=92, 8 bytes data and Seq=100, 20 bytes data; B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted; SendBase = 120.

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap.
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If timeout occurs, then after retransmission, Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When timer is restarted because of (i) new data from above or (ii) ACK received, Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
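The doubling rule is easy to see numerically. The helper below is a hypothetical sketch (the name `backoff_intervals` is not from any TCP implementation); it lists the interval in force after each consecutive timeout, starting from the value computed from EstimatedRTT and DevRTT.

```python
def backoff_intervals(base_timeout, consecutive_timeouts):
    """Timeout interval before the first timeout and after each consecutive one."""
    # Each consecutive timeout doubles the previous interval.
    return [base_timeout * 2 ** k for k in range(consecutive_timeouts + 1)]


if __name__ == "__main__":
    print(backoff_intervals(1.0, 3))
```

Once new data arrives from above or an ACK is received, the sender would go back to the base value rather than continue the sequence.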


                                  Fast Retransmit

                                  Time-out period often relatively long

                                  long delay before resending lost packet

                                  Detect lost segments via duplicate ACKs

Sender often sends many segments back-to-back
If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:

fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
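The ACK-handling branch can be written as a small Python function. This is a sketch under assumptions: the `state` dictionary layout and the function name `on_ack` are hypothetical, and instead of actually resending a segment the function simply returns the sequence number that should be fast-retransmitted.

```python
def on_ack(state, y):
    """Process an ACK with field value y.

    Returns the sequence number to fast-retransmit, or None.
    state holds 'send_base' and 'dup_acks' (a per-ACK-value counter).
    """
    if y > state["send_base"]:
        # New data acknowledged: advance SendBase, forget old duplicate counts.
        state["send_base"] = y
        state["dup_acks"] = {}
        return None
    # Duplicate ACK for already ACKed data.
    count = state["dup_acks"].get(y, 0) + 1
    state["dup_acks"][y] = count
    # Exactly on the third duplicate, resend the segment starting at y.
    return y if count == 3 else None


if __name__ == "__main__":
    state = {"send_base": 100, "dup_acks": {}}
    print([on_ack(state, 100) for _ in range(3)])
```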


                                  TCP GBN or Selective Repeat

                                  Basic TCP looks a lot like GBN

                                  Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                  This looks a lot like Selective Repeat

                                  TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer

speed-matching service: matching the send rate to the receiving app's drain rate

app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, 32 bits wide]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flag bits U A P R S F, Receive window
  checksum, Urg data pointer
  Options (variable length)
  application data (variable length)

Annotations:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
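Both sides of the mechanism reduce to one line of arithmetic each, as the Python sketch below shows. The function names are hypothetical, and the sender-side check assumes the usual bookkeeping variables LastByteSent and LastByteAcked, which this slide does not name.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    return (last_byte_sent + nbytes) - last_byte_acked <= window


if __name__ == "__main__":
    window = rcv_window(4096, 3000, 1000)
    print(window, sender_may_send(5000, 4000, window, 1000))
```

Here the application has read 1000 of the 3000 bytes received, so 2000 bytes occupy a 4096-byte buffer and 2096 bytes of spare room are advertised.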

Technical Issue

Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to sender
DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender

Note on UDP

UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket("hostname", "port number");
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that client is synchronized

[Figure: three-way handshake between client and server]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
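The sequence and acknowledgement numbers in the three segments follow mechanically from the two initial sequence numbers. The Python sketch below builds them as plain dictionaries; the function name and the dictionary representation are hypothetical, and no real sockets or packets are involved.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as dictionaries of header fields."""
    # Step 1: client picks its initial sequence number.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server picks its own ISN and acknowledges client_isn + 1.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client acknowledges server_isn + 1 with SYN cleared.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(1000, 5000):
        print(segment)
```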

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client: close, sends FIN; server: ACK, close, sends FIN; client: ACK, timed wait, closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: same closing timeline, annotated with states. Client: closing, timed wait, closed; server: closing, closed.]

TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                  A few special cases

                                  Have not discussed what happens if both client and server decide to close down connection at same time

                                  It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always λ_in = λ_out (goodput), where λ'_in denotes original plus retransmitted data
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'_in > λ_out
(c) retransmission of delayed (not lost) packet makes λ'_in larger (than perfect case) for same λ_out

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, (a), (b) and (c), of λ_out versus offered load]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λ_in and λ'_in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when packet dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: window of Congwin segments]

w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT  Bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate acks
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
  also known as congestion avoidance

[Figure: sawtooth plot of congestion window (ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]

TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
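The 20 kbps figure is just the window-per-RTT throughput with a window of one segment. A short Python check, using a hypothetical helper name and converting bytes to bits:

```python
def window_throughput_bps(w_segments, mss_bytes, rtt_seconds):
    """Throughput in bits/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    # One 500-byte segment every 200 ms.
    print(window_throughput_bps(1, 500, 0.2))
```

With CongWin = 1 MSS the sender pushes 4000 bits every 0.2 s, which is the 20 kbps initial rate quoted in the example.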

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline; A sends one segment in the first RTT, two segments in the second, four segments in the third]
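Why "one increment per ACK" means "doubling per RTT" can be checked with a tiny simulation. This sketch assumes an idealized connection in which every segment sent in one RTT is ACKed within that RTT and nothing is lost; the function name is hypothetical and the window is counted in units of MSS.

```python
def slow_start_windows(rtts):
    """CongWin (in MSS) at the start of each RTT under idealized slow start."""
    cong_win = 1
    history = [cong_win]
    for _ in range(rtts):
        acks_this_rtt = cong_win      # one ACK per segment sent in this RTT
        for _ in range(acks_this_rtt):
            cong_win += 1             # increment CongWin by 1 MSS per ACK
        history.append(cong_win)
    return history


if __name__ == "__main__":
    print(slow_start_windows(3))
```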

So Far
  Slow-Start ramps up exponentially
  Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
    • Timeout: set threshold=CongWin/2, CongWin=1 MSS, and switch to Slow-Start

Reason for treating 3 dup ACKs differently than timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 MSS after a loss event.

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                  The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
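The event/state table maps directly onto a small state machine. The Python class below is a sketch under assumptions: the class and method names are hypothetical, the default MSS and initial Threshold are arbitrary illustrative values, and the triple-duplicate-ACK event is detected by counting calls to `on_duplicate_ack`. It tracks only CongWin, Threshold and the state, not any actual data transfer.

```python
class RenoCongestionControl:
    """Toy model of the sender congestion-control table (CongWin and Threshold in bytes)."""

    def __init__(self, mss=1460, threshold=64 * 1024):
        self.mss = mss              # assumed segment size
        self.cong_win = mss         # slow start begins at 1 MSS
        self.threshold = threshold  # assumed initial value ("very large")
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        self.dup_acks += 1
        if self.dup_acks == 3:      # loss event detected by triple duplicate ACK
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


if __name__ == "__main__":
    cc = RenoCongestionControl(mss=1000, threshold=4000)
    for _ in range(4):
        cc.on_new_ack()
    print(cc.cong_win, cc.state)
```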

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
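The 0.75 factor comes from averaging a window that climbs linearly from W/2 back to W. A quick numeric check in Python, assuming an even W counted in segments and one window value per RTT (the helper name is hypothetical):

```python
def average_window(w):
    """Average window over one sawtooth cycle, growing by 1 per RTT from w/2 to w."""
    windows = list(range(w // 2, w + 1))
    return sum(windows) / len(windows)


if __name__ == "__main__":
    print(average_window(10))
```

Dividing the average window by RTT gives the average throughput, so the result is 0.75 W/RTT.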

TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

  throughput = (1.22 * MSS) / (RTT * sqrt(L))

  -> L = 2*10^-10   Wow!
New versions of TCP for high-speed needed!
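Both numbers on this slide can be reproduced with a few lines of Python. The sketch assumes the loss-rate relation throughput = 1.22 * MSS / (RTT * sqrt(L)) and simply solves it for L; the function names are hypothetical.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Loss rate L solving throughput = 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


if __name__ == "__main__":
    print(int(required_window(10e9, 0.1, 1500)))
    print(required_loss_rate(10e9, 0.1, 1500))
```

For 1500-byte segments, a 100 ms RTT and a 10 Gbps target, the window comes out at roughly 83,333 segments and the tolerable loss rate at roughly 2*10^-10.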

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
- Additive increase gives a slope of 1 as throughput increases.
- Multiplicative decrease reduces throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
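The convergence argument can be tried numerically. The following is a minimal sketch under simplifying assumptions that are not in the slides: both sessions see every loss at the same time, each adds 1 unit per round, and both halve when their combined rate exceeds the link capacity. The function name and the starting rates are made up for illustration.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Run synchronized AIMD for two flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves both rates,
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase keeps the gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: flow 1 has 10 units, flow 2 has 80.
r1, r2 = aimd_two_flows(10.0, 80.0, 100.0, 1000)
print(r1, r2)
```

Because additive increase preserves the difference between the two rates and each halving cuts it in two, the rates drift toward the equal-share line regardless of where they start.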

Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
- Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss.
- Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts.
- Web browsers do this.
- Example: link of rate R supporting 9 connections.
  - New app asks for 1 TCP connection, gets rate R/10.
  - New app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections).
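The arithmetic behind this example is a per-connection split of the link. A small sketch, assuming each TCP connection receives an equal share of R (the function name is hypothetical):

```python
from fractions import Fraction


def bandwidth_share(app_connections, other_connections):
    """Fraction of the link rate R an app gets if every connection gets an equal share."""
    return Fraction(app_connections, app_connections + other_connections)


print(bandwidth_share(1, 9))    # one new connection next to 9 existing ones
print(bandwidth_share(11, 9))   # eleven new connections next to 9 existing ones
```

With 11 connections the app holds 11 of 20 equal shares, which is where the "about R/2" figure comes from.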

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Notation, assumptions:
- Assume one link between client and server, of rate R.
- S: MSS (bits)
- O: object size (bits)
- No retransmissions (no loss, no corruption).

Window size:
- First assume a fixed congestion window of W segments.
- Then a dynamic window, modeling slow start.

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data is sent.
   Latency = 2 RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data is sent.
   Latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
   (K is the number of windows that cover the object.)
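The two cases can be folded into one function. This is a sketch rather than a definitive model: the function and parameter names are hypothetical, and it assumes K = ceil(O / (W·S)), i.e. the number of full-size windows needed to cover the object.

```python
import math


def fixed_window_latency(obj_bits, seg_bits, rate_bps, rtt_s, window_segments):
    """Latency of fetching one object with a fixed congestion window of W segments."""
    seg_time = seg_bits / rate_bps                 # S/R
    window_time = window_segments * seg_time       # WS/R
    base = 2 * rtt_s + obj_bits / rate_bps         # 2 RTT + O/R
    stall = seg_time + rtt_s - window_time         # S/R + RTT - WS/R
    if stall <= 0:
        # Case 1: ACKs return before the window is exhausted, so the server never idles.
        return base
    # Case 2: the server idles once after each of the first K-1 windows.
    k = math.ceil(obj_bits / (window_segments * seg_bits))
    return base + (k - 1) * stall


# Toy numbers chosen for easy arithmetic: O = 8, S = 1, R = 1, RTT = 3.
print(fixed_window_latency(8, 1, 1, 3, 2))   # small window: server stalls
print(fixed_window_latency(8, 1, 1, 3, 8))   # large window: no stalls
```

In the first call, K = 4 and each stall lasts 2 time units, so the three stalls add 6 to the base latency of 14.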

Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data is sent.

latency = 2 RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

We will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit the object
- time the server idles due to slow start

Server idles P = min{K-1, Q} times.

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

[Figure: timing diagram with time at client and time at server. The client initiates the TCP connection (one RTT) and requests the object; the server sends a first window = S/R, second window = 2S/R, third window = 4S/R and fourth window = 8S/R; transmission completes and the object is delivered.]

TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: slow-start timing diagram with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, the number of idles for an infinite-size object, is similar.
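The slow-start model can be checked by summing the idle times directly instead of using the closed form. The sketch below makes a few assumptions that are not in the slides: the function names are hypothetical, K is found by a loop rather than a logarithm, and the toy values S/R = 1 and RTT = 2 are chosen only so that the example with O/S = 15 yields K = 4 and P = 2.

```python
def windows_to_cover(obj_bits, seg_bits):
    """Smallest k with (2^k - 1) * S >= O, i.e. K = ceil(log2(O/S + 1))."""
    k = 1
    while (2 ** k - 1) * seg_bits < obj_bits:
        k += 1
    return k


def slow_start_latency(obj_bits, seg_bits, rate_bps, rtt_s):
    """Return (latency, K, P) for one object fetched under slow start with no loss."""
    seg_time = seg_bits / rate_bps                      # S/R
    k_windows = windows_to_cover(obj_bits, seg_bits)    # K
    idle_total = 0.0
    stalls = 0                                          # P, counted directly
    for k in range(1, k_windows):                       # idling is possible after windows 1..K-1
        idle = seg_time + rtt_s - 2 ** (k - 1) * seg_time
        if idle > 0:
            idle_total += idle
            stalls += 1
    latency = 2 * rtt_s + obj_bits / rate_bps + idle_total
    return latency, k_windows, stalls


latency, k_windows, stalls = slow_start_latency(15, 1, 1, 2.0)
print(latency, k_windows, stalls)
```

For these values the closed form gives the same result: 2·2 + 15 + 2·(2 + 1) − (2² − 1)·1 = 22.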

HTTP Modeling

Assume a Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X is an integer.
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
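The three response-time formulas differ only in how many round trips they pay for. The sketch below evaluates them side by side while leaving out the "sum of idle times" term, so it gives a lower bound rather than the full model. The function name is hypothetical, and treating 5 Kbytes as 40,000 bits is an assumption made here for the usage example.

```python
def http_response_times(obj_bits, rate_bps, rtt_s, num_images, parallel):
    """Return (non_persistent, persistent, parallel_non_persistent) in seconds.

    Slow-start idle times are deliberately omitted from all three values.
    """
    transmit = (num_images + 1) * obj_bits / rate_bps             # (M+1) O/R
    non_persistent = transmit + (num_images + 1) * 2 * rtt_s      # (M+1) 2RTT
    persistent = transmit + 3 * rtt_s                             # 3 RTT
    parallel_np = transmit + (num_images / parallel + 1) * 2 * rtt_s  # (M/X + 1) 2RTT
    return non_persistent, persistent, parallel_np


# RTT = 100 msec, O = 5 Kbytes (taken as 40,000 bits), M = 10, X = 5, R = 1 Mbps.
for name, seconds in zip(
    ("non-persistent", "persistent", "parallel non-persistent"),
    http_response_times(40_000, 1_000_000, 0.1, 10, 5),
):
    print(f"{name}: {seconds:.2f} s")
```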

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.

Chapter 3: Summary

Principles behind transport layer services:
- multiplexing, demultiplexing
- reliable data transfer
- flow control
- congestion control

Instantiation and implementation in the Internet:
- UDP
- TCP

Next:
- leaving the network "edge" (application, transport layers)
- into the network "core"


UDP checksum

Goal: detect "errors" (e.g., flipped bits) in the transmitted segment.

Sender:
- treat segment contents as a sequence of 16-bit integers
- checksum: addition (1's complement sum) of segment contents
- sender puts checksum value into UDP checksum field

Receiver:
- compute checksum of received segment
- check if computed checksum equals checksum field value:
  - NO: error detected
  - YES: no error detected. But maybe errors nonetheless? More later.
- Receiver may choose to discard the segment or send a warning to the app in case of error.
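The sender and receiver steps can be sketched as follows. Two points go beyond the slide and are stated here as assumptions: the checksum field is taken to be the bitwise complement of the 1's complement sum (as in the Internet checksum), and the sketch covers only the bytes it is given, without the pseudo-header that real UDP also includes. All function names are hypothetical.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian words, wrapping any carry back into the sum."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length data for the computation
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def checksum16(data: bytes) -> int:
    """Sender side: complement of the 1's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def looks_intact(data: bytes, checksum_field: int) -> bool:
    """Receiver side: recompute the checksum and compare it with the field."""
    return checksum16(data) == checksum_field


segment = bytes.fromhex("0001f203f4f5f6f7")
field = checksum16(segment)
corrupted = bytes([segment[0] ^ 0x01]) + segment[1:]   # flip one bit
print(hex(field), looks_intact(segment, field), looks_intact(corrupted, field))
```

A matching checksum only means no error was detected: two bit flips that cancel in the sum would pass unnoticed, which is the "maybe errors nonetheless" caveat.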

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    - segment structure
    - reliable data transfer
    - flow control
    - connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Principles of Reliable data transfer

- important in app., transport, link layers
- top-10 list of important networking topics!
- characteristics of the unreliable channel will determine the complexity of the reliable data transfer protocol (rdt)

Reliable data transfer: getting started

Send side:
- rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer.
- udt_send(): called by rdt to transfer packet over unreliable channel to receiver.

Receive side:
- rdt_rcv(): called when packet arrives on rcv-side of channel.
- deliver_data(): called by rdt to deliver data to upper layer.

Reliable data transfer: getting started

We'll:
- incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
- consider only unidirectional data transfer
  - but control info will flow in both directions!
- use finite state machines (FSM) to specify sender, receiver

FSM notation: an arrow from state 1 to state 2 is labeled with the event causing the state transition, followed by the actions taken on the state transition.

State: when in this "state", the next state is uniquely determined by the next event.

Incremental Improvements

- rdt1.0: assumes every packet sent arrives, and no errors are introduced in transmission.
- rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces the concept of ACK and NAK.
- rdt2.1: deals with corrupted ACKs/NAKs.
- rdt2.2: like rdt2.1 but does not need NAKs.
- rdt3.0: allows packets to be lost.

Rdt1.0: reliable transfer over a reliable channel

- underlying channel perfectly reliable
  - no bit errors
  - no loss of packets
- separate FSMs for sender, receiver:
  - sender sends data into underlying channel
  - receiver reads data from underlying channel

Sender FSM, single state "Wait for call from above":
  event:   rdt_send(data)
  actions: packet = make_pkt(data)
           udt_send(packet)

Receiver FSM, single state "Wait for call from below":
  event:   rdt_rcv(packet)
  actions: extract(packet, data)
           deliver_data(data)

Rdt2.0: channel with bit errors

- underlying channel may flip bits in packet
  - recall: UDP checksum to detect bit errors
- the question: how to recover from errors?
  - acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  - sender retransmits pkt on receipt of NAK
  - human scenarios using ACKs, NAKs?
- new mechanisms in rdt2.0 (beyond rdt1.0):
  - error detection
  - receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification

Sender

State "Wait for call from above":
  event:   rdt_send(data)
  actions: sndpkt = make_pkt(data, checksum)
           udt_send(sndpkt)
  next:    "Wait for ACK or NAK"

State "Wait for ACK or NAK":
  event:   rdt_rcv(rcvpkt) && isNAK(rcvpkt)
  actions: udt_send(sndpkt)
  next:    "Wait for ACK or NAK"

  event:   rdt_rcv(rcvpkt) && isACK(rcvpkt)
  actions: Λ (no action)
  next:    "Wait for call from above"

Receiver

State "Wait for call from below":
  event:   rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  actions: udt_send(NAK)

  event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  actions: extract(rcvpkt, data)
           deliver_data(data)
           udt_send(ACK)
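The rdt2.0 state machines can be exercised with a tiny simulation. This is a sketch under assumptions that are labeled here rather than taken from the slides: corruption is scripted by a list of booleans (one per transmission attempt), only data packets are ever corrupted, and ACKs and NAKs always arrive intact. The function name is hypothetical.

```python
def rdt20_transfer(messages, corrupt_pattern):
    """Simulate rdt2.0 stop-and-wait over a channel that corrupts data packets only.

    corrupt_pattern yields one boolean per transmission attempt (True means the
    packet arrives corrupted); once it is exhausted, nothing is corrupted.
    """
    pattern = iter(corrupt_pattern)
    delivered = []        # data handed to the receiver's upper layer
    transmissions = 0     # number of udt_send(sndpkt) calls made by the sender
    for data in messages:                 # sender: rdt_send(data)
        while True:                       # sender state: wait for ACK or NAK
            transmissions += 1            # sender: udt_send(sndpkt)
            if next(pattern, False):      # receiver: corrupt(rcvpkt)
                reply = "NAK"
            else:                         # receiver: notcorrupt(rcvpkt)
                delivered.append(data)    # receiver: deliver_data(data)
                reply = "ACK"
            if reply == "ACK":            # sender: back to waiting for a call from above
                break
            # sender: on NAK, loop again and retransmit the same packet
    return delivered, transmissions


delivered, transmissions = rdt20_transfer(["a", "b"], [True, False, True, True, False])
print(delivered, transmissions)
```

The first message needs two attempts and the second needs three, so five packets are sent to deliver two messages.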

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs, tracing the error-free path. The sender makes and sends sndpkt, the receiver finds it notcorrupt, delivers the data and sends ACK, and the sender returns to "Wait for call from above".]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs, tracing the error path. The receiver finds the packet corrupt and sends NAK; the sender retransmits sndpkt on receiving the NAK and stays in "Wait for ACK or NAK".]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK is corrupted?
- sender doesn't know what happened at receiver!
- can't just retransmit: possible duplicate. But receiver is waiting!

What to do?
- sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK is corrupted?
- retransmit, but this might cause retransmission of a correctly received pkt! Receiver won't know about duplication!

Handling duplicates:
- sender adds a sequence number (0, 1) to each pkt
- sender retransmits current pkt if ACK/NAK garbled
- receiver discards (doesn't deliver up) duplicate pkt
- a duplicate packet is one with the same sequence number as the previous packet

Stop and wait: sender sends one packet, then waits for receiver response.

Sender: whenever the sender receives a control message, it sends a packet to the receiver.
- A valid ACK: sends the next packet (if one exists) with a new sequence number.
- A NAK or corrupt response: resends the old packet.

Receiver: sends ACK/NAK to sender.
- If the received packet is corrupt: send NAK.
- If the received packet is valid and has a different sequence number from the previous packet: send ACK and deliver new data up.
- If the received packet is valid and has the same sequence number as the previous packet, i.e., is a retransmission or duplicate: send ACK.

Note: ACK/NAK do not contain a sequence number.
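These sender and receiver rules can be traced with a short simulation. It is a sketch, not the protocol's reference behavior: corruption is scripted with two lists of booleans (one for data packets, one for ACK/NAK replies), nothing is ever lost, and the function and parameter names are hypothetical.

```python
def rdt21_transfer(messages, data_corrupt, ack_corrupt):
    """Simulate stop-and-wait with a 1-bit sequence number and garbled ACK/NAKs.

    data_corrupt / ack_corrupt yield one boolean per data packet / per reply
    (True means corrupted); once exhausted, nothing further is corrupted.
    """
    data_iter, ack_iter = iter(data_corrupt), iter(ack_corrupt)
    delivered = []
    seq = 0             # sender state: sequence number of the current packet
    expected_seq = 0    # receiver state: sequence number it is waiting for
    for data in messages:
        while True:
            # Sender transmits (seq, data); the receiver reacts.
            if next(data_iter, False):
                reply = "NAK"                 # corrupt packet
            elif seq == expected_seq:
                delivered.append(data)        # new packet: deliver it up
                expected_seq ^= 1
                reply = "ACK"
            else:
                reply = "ACK"                 # duplicate: ACK again, do not deliver
            # The reply travels back and may itself be garbled.
            if next(ack_iter, False):
                continue                      # garbled reply: resend the old packet
            if reply == "ACK":
                seq ^= 1                      # valid ACK: move on with a new sequence number
                break
            # valid NAK: loop again and resend the old packet
    return delivered


result = rdt21_transfer(["a", "b", "c"],
                        data_corrupt=[True, False, False],
                        ack_corrupt=[False, True])
print(result)
```

In this run the ACK for "a" is garbled, so the sender retransmits "a"; the receiver recognizes the repeated sequence number, ACKs again and does not deliver the duplicate.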

rdt2.1: sender, handles garbled ACK/NAKs

State "Wait for call 0 from above":
  event:   rdt_send(data)
  actions: sndpkt = make_pkt(0, data, checksum)
           udt_send(sndpkt)
  next:    "Wait for ACK or NAK 0"

State "Wait for ACK or NAK 0":
  event:   rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
  actions: udt_send(sndpkt)
  next:    "Wait for ACK or NAK 0"

  event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
  actions: Λ (no action)
  next:    "Wait for call 1 from above"

State "Wait for call 1 from above":
  event:   rdt_send(data)
  actions: sndpkt = make_pkt(1, data, checksum)
           udt_send(sndpkt)
  next:    "Wait for ACK or NAK 1"

State "Wait for ACK or NAK 1":
  event:   rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
  actions: udt_send(sndpkt)
  next:    "Wait for ACK or NAK 1"

  event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
  actions: Λ (no action)
  next:    "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

State "Wait for 0 from below":
  event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
  actions: extract(rcvpkt, data)
           deliver_data(data)
           sndpkt = make_pkt(ACK, chksum)
           udt_send(sndpkt)
  next:    "Wait for 1 from below"

  event:   rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  actions: sndpkt = make_pkt(NAK, chksum)
           udt_send(sndpkt)
  next:    "Wait for 0 from below"

  event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
  actions: sndpkt = make_pkt(ACK, chksum)
           udt_send(sndpkt)
  next:    "Wait for 0 from below"

State "Wait for 1 from below":
  event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
  actions: extract(rcvpkt, data)
           deliver_data(data)
           sndpkt = make_pkt(ACK, chksum)
           udt_send(sndpkt)
  next:    "Wait for 0 from below"

  event:   rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  actions: sndpkt = make_pkt(NAK, chksum)
           udt_send(sndpkt)
  next:    "Wait for 1 from below"

  event:   rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
  actions: sndpkt = make_pkt(ACK, chksum)
           udt_send(sndpkt)
  next:    "Wait for 1 from below"

rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq #'s (0, 1) will suffice. Why?
  must check if received ACK/NAK is corrupted
  twice as many states
    state must "remember" whether "current" pkt has seq # 0 or 1

Receiver:
  must check if received packet is a duplicate
    state indicates whether 0 or 1 is the expected pkt seq #
  note: receiver cannot know whether its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

[Sender FSM fragment; transitions written as event -> action]

State "Wait for call 0 from above"
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK 0"

State "Wait for ACK 0"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) -> udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) -> Λ

[Receiver FSM fragment; transitions written as event -> action]

Transition into "Wait for 0 from below"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

State "Wait for 0 from below"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) -> udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits a "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions are triggered only by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer
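The timeout-driven approach can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the class name `Rdt30Sender`, the `channel_send` callback, and the explicit `on_timeout` / `on_ack` methods (called by the test harness instead of a real countdown timer) are all assumptions made for the example.

```python
class Rdt30Sender:
    """Alternating-bit (stop-and-wait) sender: one packet in flight at a time."""

    def __init__(self, channel_send):
        self.channel_send = channel_send  # hypothetical callable: channel_send(seq, data)
        self.seq = 0                      # sequence number of the next new packet (0 or 1)
        self.outstanding = None           # (seq, data) awaiting its ACK, or None
        self.timer_running = False

    def rdt_send(self, data):
        if self.outstanding is not None:
            return False                  # still waiting for an ACK: refuse new data
        self.outstanding = (self.seq, data)
        self.channel_send(*self.outstanding)
        self.timer_running = True         # start_timer
        return True

    def on_timeout(self):
        if self.outstanding is not None:
            self.channel_send(*self.outstanding)  # retransmit the current packet
            self.timer_running = True             # restart the timer

    def on_ack(self, ack_seq, corrupt=False):
        # Corrupt or wrong-numbered ACKs are ignored; the timeout recovers.
        if self.outstanding is None or corrupt or ack_seq != self.seq:
            return
        self.outstanding = None
        self.timer_running = False        # stop_timer
        self.seq ^= 1                     # alternate between 0 and 1


sent = []
s = Rdt30Sender(lambda seq, data: sent.append((seq, data)))
s.rdt_send("a")            # packet 0 goes out
s.on_timeout()             # no ACK in time: packet 0 is retransmitted
s.on_ack(1)                # ACK for the wrong seq #: ignored
s.on_ack(0)                # correct ACK: sender moves on to seq # 1
s.rdt_send("b")            # packet 1 goes out
refused = s.rdt_send("c")  # refused while packet 1 is unacknowledged
print(sent, refused)
```

A real sender would arm an actual timer; driving `on_timeout` by hand keeps the sketch deterministic.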

rdt3.0 sender

[Sender FSM; transitions written as event -> action]

State "Wait for call 0 from above"
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK0"
  rdt_rcv(rcvpkt) -> Λ

State "Wait for ACK0"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) -> Λ
  timeout -> udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) -> stop_timer; go to "Wait for call 1 from above"

State "Wait for call 1 from above"
  rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK1"
  rdt_rcv(rcvpkt) -> Λ

State "Wait for ACK1"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)) -> Λ
  timeout -> udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1) -> stop_timer; go to "Wait for call 0 from above"

rdt3.0 in action

[Figures: rdt3.0 sender/receiver timelines]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms end-to-end prop. delay, 1 KB packet:

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \mu\text{sec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U_sender: utilization, the fraction of time the sender is busy sending
1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
network protocol limits use of physical resources!
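The arithmetic of this example can be checked with a short Python sketch. The function name `stop_and_wait_utilization` is made up for illustration, and RTT is taken as 30 ms (twice the 15 ms propagation delay), as the slide's numbers imply.

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time a stop-and-wait sender spends transmitting."""
    t_transmit = packet_bits / rate_bps           # L / R, in seconds
    return t_transmit / (rtt_s + t_transmit)      # (L/R) / (RTT + L/R)


L = 8000        # 1 KB packet, in bits
R = 1e9         # 1 Gbps link
RTT = 0.030     # 2 x 15 ms propagation delay

u = stop_and_wait_utilization(L, R, RTT)
throughput_bytes_per_s = (L / 8) / (RTT + L / R)  # one packet per RTT + L/R

print(f"U_sender   = {u:.5f}")
print(f"throughput = {throughput_bytes_per_s / 1000:.1f} kB/sec")
```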

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline. First packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, receiver sends ACK; ACK arrives, sender sends next packet at t = RTT + L/R.]

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Advantage: much better bandwidth utilization than stop-and-wait
Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data
  Two generic approaches to solving this:
  • Go-Back-N protocols
  • selective repeat protocols
Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight. First packet bit transmitted at t = 0; last bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, receiver sends ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, sender sends next packet at t = RTT + L/R.]

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Utilization increases by a factor of 3!
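The three-packet case generalizes to a window of any size. The Python sketch below assumes each of the N in-flight packets costs L/R of sending time per RTT + L/R cycle, and caps the result at 1 because a sender cannot be busy more than all the time. The function name `pipelined_utilization` is hypothetical.

```python
def pipelined_utilization(n_packets, packet_bits, rate_bps, rtt_s):
    """Sender utilization with n_packets in flight per RTT + L/R cycle."""
    t_transmit = packet_bits / rate_bps
    return min(1.0, n_packets * t_transmit / (rtt_s + t_transmit))


# Same link as the stop-and-wait example: 1 KB packets, 1 Gbps, RTT = 30 ms.
u1 = pipelined_utilization(1, 8000, 1e9, 0.030)
u3 = pipelined_utilization(3, 8000, 1e9, 0.030)
u_big = pipelined_utilization(10**6, 8000, 1e9, 0.030)

print(f"N=1: {u1:.5f}  N=3: {u3:.5f}  ratio: {u3 / u1:.1f}")
```

With these numbers, a window of a few thousand packets would be needed before the cap is reached, which is why window size matters so much on links with a large bandwidth-delay product.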

Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)
  only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  called a sliding-window protocol

GBN: Sender

rdt_send() called: checks whether the window is full
  No: send out packet
  Yes: return data to application level
Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer
Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend
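The three sender events can be sketched in Python as follows. This is an illustrative sketch only: the class `GbnSender`, the `udt_send(seq, data)` callback, and the boolean `timer_running` flag standing in for a real timer are assumptions. So is the guard that skips ACK numbers below `base`, which are duplicates and would leave `base` unchanged anyway.

```python
class GbnSender:
    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send   # hypothetical callable: udt_send(seq, data)
        self.base = 1              # oldest unacknowledged seq #
        self.nextseqnum = 1        # next seq # to use
        self.sndpkt = {}           # seq # -> data, kept until acknowledged
        self.timer_running = False

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False           # window full: hand data back to the application
        self.sndpkt[self.nextseqnum] = data
        self.udt_send(self.nextseqnum, data)
        if self.base == self.nextseqnum:
            self.timer_running = True          # start timer for the oldest packet
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        if acknum < self.base:
            return                 # duplicate ACK: nothing new is acknowledged
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)         # cumulative: everything up to acknum
        self.base = acknum + 1
        # Stop the timer if nothing is outstanding, otherwise restart it.
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(seq, self.sndpkt[seq])   # resend ALL unacknowledged packets


wire = []
s = GbnSender(3, lambda seq, data: wire.append(seq))
accepted = [s.rdt_send(d) for d in "abcd"]  # fourth call finds the window full
s.on_ack(1)                                 # packet 1 acknowledged, window slides
s.on_timeout()                              # packets 2 and 3 are sent again
print(accepted, wire)
```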

GBN: sender extended FSM

[Extended FSM with a single state "Wait"; each event is followed by its action]

initially:
  base = 1
  nextseqnum = 1

rdt_send(data):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  ...
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  Λ

GBN: receiver extended FSM

[Extended FSM with a single state "Wait"; each event is followed by its action]

initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

If expected packet received: send ACK and deliver packet upstairs
If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  need only remember expectedseqnum
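A minimal Python sketch of this receiver behaviour is shown below. The class name `GbnReceiver`, the `corrupt` flag, and returning the ACK number instead of sending a packet are illustrative assumptions.

```python
class GbnReceiver:
    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0          # highest in-order seq # received so far
        self.delivered = []        # data handed to the upper layer, in order

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number sent back."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)
            self.last_ack = seq
            self.expectedseqnum += 1
        # Anything else is discarded; the last in-order ACK is repeated.
        return self.last_ack


r = GbnReceiver()
acks = [
    r.rdt_rcv(1, "a"),   # in order: delivered
    r.rdt_rcv(3, "c"),   # out of order: discarded, duplicate ACK for 1
    r.rdt_rcv(2, "b"),   # in order: delivered
    r.rdt_rcv(3, "c"),   # retransmission, now in order: delivered
]
print(acks, r.delivered)
```

The only state carried between packets is `expectedseqnum` (plus the last ACK to repeat), which is what makes the GBN receiver so simple.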

GBN in action

[Figure: Go-Back-N sender/receiver timeline]

GBN is easy to code but might have performance problems
In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of a huge amount of data
Selective Repeat allows the receiver to buffer data and forces retransmission of only the required packets

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  compare to GBN, which only had a timer for the base packet
sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: sender and receiver views of the sequence-number space]

Selective repeat

sender
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)
  otherwise:
    ignore
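The receiver rules can be sketched in Python as follows. This sketch assumes unbounded integer sequence numbers (no wraparound), and the class name `SrReceiver` and its return convention (the ACK number, or `None` when the packet is ignored) are made up for illustration.

```python
class SrReceiver:
    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0           # next seq # not yet delivered
        self.buffer = {}           # seq # -> data, received but not yet delivered
        self.delivered = []        # data handed to the upper layer, in order

    def rdt_rcv(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver any run of in-order packets and advance the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n               # already delivered: re-ACK so the sender can advance
        return None                # outside both ranges: ignore


r = SrReceiver(4)
acks = [
    r.rdt_rcv(0, "a"),   # in order: delivered at once
    r.rdt_rcv(2, "c"),   # out of order: buffered
    r.rdt_rcv(1, "b"),   # fills the gap: "b" and "c" delivered
    r.rdt_rcv(0, "a"),   # duplicate below the window: re-ACKed
    r.rdt_rcv(9, "x"),   # beyond the window: ignored
]
print(acks, r.delivered)
```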

Selective repeat in action

[Figure: selective repeat sender/receiver timeline]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3
  receiver sees no difference in the two scenarios!
  incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
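One way to explore the question is to compare the seq #'s the sender might still retransmit (if every ACK was lost) with the seq #'s the receiver now treats as new. The Python sketch below does that; the function name `sr_is_ambiguous` is hypothetical, and the check models only this worst case. It suggests the commonly stated rule that the window should be at most half the sequence-number space.

```python
def sr_is_ambiguous(seq_space, window):
    """True if a retransmitted old packet could be mistaken for a new one."""
    # Seq #'s the sender may retransmit if none of its ACKs arrived.
    old = {i % seq_space for i in range(0, window)}
    # Seq #'s the receiver accepts as new after receiving that whole window.
    new = {i % seq_space for i in range(window, 2 * window)}
    return bool(old & new)


print(sr_is_ambiguous(4, 3))   # the slide's example: seq #'s 0..3, window 3
print(sr_is_ambiguous(4, 2))   # window reduced to half the seq # space
```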

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the application reads data through its socket door.]

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The maximum amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way Handshake
  Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment
  Server responds with a second special TCP segment (again no payload)
  Client responds with a third special segment. This one can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server,
  (i) buffers,
  (ii) variables, and
  (iii) a socket connection to the process

TCP only exists in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Annotations:
  sequence and acknowledgement numbers: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; up to implementer

[Figure: simple telnet scenario, time running downward]
  User at Host A types 'C':                     A -> B  Seq=42, ACK=79, data = 'C'
  Host B ACKs receipt of 'C', echoes back 'C':  B -> A  Seq=79, ACK=43, data = 'C'
  Host A ACKs receipt of echoed 'C':            A -> B  Seq=43, ACK=80

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT and EstimatedRTT.]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|
    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4*DevRTT
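The estimator formulas can be combined into a small Python sketch. Several details here are assumptions rather than anything the slides specify: the class name `RttEstimator`, seeding EstimatedRTT with the first sample and DevRTT with 0, and updating DevRTT before EstimatedRTT so that the deviation is measured against the previous estimate.

```python
class RttEstimator:
    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with the first SampleRTT
        self.dev_rtt = 0.0                  # assumption: no deviation observed yet

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation is taken against the previous EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)               # all times in milliseconds
timeouts = [est.update(s) for s in (120.0, 80.0)]
print(est.estimated_rtt, est.dev_rtt, timeouts)
```

Note how the second sample lowers EstimatedRTT slightly but raises the timeout, because the swing from 120 ms to 80 ms increases DevRTT.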

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
  pipelined segments
  cumulative ACKs
  TCP uses a single retransmission timer
Retransmissions are triggered by:
  timeout events
  duplicate ACKs
Initially consider a simplified TCP sender:
  ignore duplicate ACKs
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
ACK rcvd:
  if it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked

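TCP: retransmission scenarios

[Figure: lost ACK scenario, time running downward]
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Seq=92 timeout at Host A
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host B -> Host A: ACK=100
  SendBase = 100

[Figure: premature timeout scenario, time running downward]
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100
  Host B -> Host A: ACK=120
  Seq=92 timeout at Host A, before ACK=100 arrives
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Sendbase = 100, then SendBase = 120 as the two ACKs arrive
  Host B -> Host A: ACK=120
  SendBase = 120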

TCP retransmission scenarios (more)

[Figure: cumulative ACK scenario, time running downward]
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Host B -> Host A: ACK=120, arriving before the timeout
  SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  A limited form of congestion control

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {
        /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            /* fast retransmit */
            resend segment with sequence number y
        }
    }
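The ACK-handling event above can be mirrored in a small Python sketch. The class `FastRetransmitSender` and the `retransmitted` list are hypothetical names used only to make the behaviour observable; no real segments are sent.

```python
class FastRetransmitSender:
    """Toy model of the sender-side ACK handler (illustrative only)."""

    def __init__(self, send_base=0):
        self.send_base = send_base   # SendBase: oldest unacknowledged byte
        self.dup_acks = {}           # ACK value y -> number of duplicates seen
        self.retransmitted = []      # sequence numbers resent by fast retransmit
        self.timer_running = False

    def on_ack(self, y, unacked_remaining=True):
        if y > self.send_base:
            # New data acknowledged: slide SendBase forward.
            self.send_base = y
            # Restart the timer only if something is still unacknowledged.
            self.timer_running = unacked_remaining
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_acks[y] = self.dup_acks.get(y, 0) + 1
            if self.dup_acks[y] == 3:
                # Fast retransmit: resend before the timer expires.
                self.retransmitted.append(y)
```

One design point worth noting: the retransmission fires exactly when the duplicate count reaches 3, so a fourth or fifth duplicate for the same y does not trigger another resend in this sketch.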

TCP: GBN or Selective Repeat?

- Basic TCP looks a lot like GBN.
- Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range.
  - This looks a lot like Selective Repeat.
- TCP is a hybrid.


TCP Flow Control

- Sender should not overwhelm receiver's capacity to receive data.
- If necessary, sender should slow down transmission rate to accommodate receiver's rate.
- Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

- flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
- receive side of TCP connection has a receive buffer
- speed-matching service: matching the send rate to the receiving app's drain rate
- app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom]
- source port #, dest port #
- sequence number
- acknowledgement number
- head len, not used, flag bits U A P R S F, Receive window
- checksum, Urg data pointer
- Options (variable length)
- application data (variable length)

Notes on the fields:
- sequence and acknowledgement numbers: counting by bytes of data (not segments)
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
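To make the layout concrete, the fixed 20-byte part of the header can be unpacked with Python's `struct` module. This is a minimal sketch: `parse_tcp_header` is a hypothetical helper, options and checksum verification are ignored, and the bit positions used for the flags are the standard TCP ones rather than anything stated on the slide.

```python
import struct

# Flag bit masks, lowest bit first (standard TCP header layout).
FLAG_BITS = [("FIN", 0x01), ("SYN", 0x02), ("RST", 0x04),
             ("PSH", 0x08), ("ACK", 0x10), ("URG", 0x20)]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (network byte order)."""
    (src_port, dst_port, seq, ack, off_flags,
     rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4   # head len is counted in 32-bit words
    flags = [name for name, mask in FLAG_BITS if off_flags & mask]
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "rcv_window": rcv_window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],    # application data follows the header
    }
```

For example, a segment built with `struct.pack("!HHIIHHHH", 1234, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)` would decode as a SYN with a 20-byte header and no data.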

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments.)

- spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including value of RcvWindow in segments.
- Sender limits unACKed data to RcvWindow:
  - guarantees receive buffer doesn't overflow
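The receiver-side computation and the sender-side check can be written out as two small functions. This is a sketch using the slide's variable names; the function names themselves (`rcv_window`, `sender_may_send`) are made up for the example.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    unacked_after_send = (last_byte_sent + nbytes) - last_byte_acked
    return unacked_after_send <= advertised_window
```

With a 4096-byte buffer, LastByteRcvd = 3000 and LastByteRead = 1000, the advertised window would be 2096 bytes; a sender with 1000 unACKed bytes could then send another 1000 bytes but not 1200.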

Technical Issue

- Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer.
- Sender does not transmit new packets until it hears RcvWindow > 0.
- Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
- DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.

Note on UDP

- UDP has no flow control.
- UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
- initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
- server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
- specifies client_isn, the initial seq #
- No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
- ACKs received SYN
- allocates buffers
- Replies with client_isn+1 in ACK field to signal synchronization
- Specifies server_isn
- No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
- Allocates buffers
- Can include application data
- SYN=0 signals that connection is established
- server_isn+1 signals that seq # is synchronized

[Figure: three-way handshake between client and server]
- client to server: Connection request (SYN=1, seq=client_isn)
- server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
- client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
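The three segments of the handshake, and how each acknowledgement number is derived from the other side's initial sequence number, can be sketched as follows. The function `three_way_handshake` and the dictionary representation of a segment are illustrative assumptions; nothing is actually sent on a network.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as simple dictionaries."""
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server replies with SYNACK; ack = client_isn + 1.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client replies with SYN=0; ack = server_isn + 1.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]
```

For instance, with client_isn = 42 and server_isn = 79, the SYNACK would carry ack = 43 and the final ACK would carry seq = 43, ack = 80.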

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server.

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline for closing a connection: client sends FIN; server replies with ACK and then sends its own FIN; client replies with ACK and enters timed wait; both sides end in closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
- Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost).
- Closes down after timed wait.

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: same closing timeline, with both sides in "closing" after sending FIN, the client in timed wait, and both ending in closed.]

TCP Connection Management (cont)

[Figure: example TCP client lifecycle]

[Figure: example TCP server lifecycle]


                                    A few special cases

                                    Have not discussed what happens if both client and server decide to close down connection at same time

                                    It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for network to handle"
- different from flow control!
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem!

Causes/costs of congestion: scenario 1

- two senders, two receivers
- one router, infinite buffers
- no retransmission
- Send rate: 0 to C/2

Observations:
- large delays when congested
- maximum achievable throughput

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet

Notation: λin is the rate of original data, λ'in the offered load (original data plus retransmissions), and λout the delivered rate.

- (a), (b) & (c): always λin = λout (goodput)
- (a) Magic transmission: only send when there's space in buffer
- (b) "perfect" retransmission, only when loss: λ'in > λout
- (c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
- (b) and (c): more work (retransmissions) for given "goodput"
- (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: throughput plots for cases (a), (b) and (c)]

Causes/costs of congestion: scenario 3

- four senders
- multihop paths
- timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3 (cont)

Another "cost" of congestion:
- when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
- "elastic service"
- if sender's path is "underloaded":
  - sender should use available bandwidth
- if sender's path is congested:
  - sender throttled to minimum guaranteed rate

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate is thus the minimum supportable rate on path
- EFCI bit in data cells: set to 1 by congested switch
  - Signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

- end-end control (no network assistance)
- transmission rate limited by congestion window size, CongWin, over segments
- CongWin dynamically modified to reflect perceived congestion
- w segments, each with MSS bytes, sent in one RTT:

  throughput = (w × MSS) / RTT bytes/sec

[Figure: a congestion window of CongWin over a stream of segments]

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 and then grows exponentially
- conservative after timeout events

TCP AIMD

- multiplicative decrease: cut CongWin in half after loss event
- additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window versus time for a long-lived TCP connection; sawtooth pattern, y-axis ticks at 8, 16 and 24 Kbytes.]

TCP Slow Start

- When connection begins, CongWin = 1 MSS
  - Example: MSS = 500 bytes & RTT = 200 msec
  - initial rate = 20 kbps
- available bandwidth may be >> MSS/RTT
  - desirable to quickly ramp up to respectable rate
- When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

- When connection begins, increase rate exponentially until first loss event:
  - double CongWin every RTT
  - done by incrementing CongWin for every ACK received
- Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline: one segment in the first RTT, then two segments, then four segments.]
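A short Python sketch can reproduce both the 20 kbps initial rate from the example and the doubling that results from adding one MSS per ACK. The function names are hypothetical, and the model assumes every segment in a window is ACKed within one RTT and that no loss occurs.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Rate when CongWin = 1 MSS: one MSS per RTT, in bits per second."""
    return mss_bytes * 8 / rtt_seconds


def slow_start_windows(rtts, mss=1):
    """CongWin (in units of mss) at the start of each RTT during slow start."""
    cong_win = mss
    history = [cong_win]
    for _ in range(rtts):
        acks = cong_win // mss       # one ACK per segment sent in this RTT
        cong_win += acks * mss       # +1 MSS per ACK, so the window doubles
        history.append(cong_win)
    return history
```

Under these assumptions, `slow_start_windows(3)` yields windows of 1, 2, 4 and 8 segments, matching the one, two, four segment pattern in the figure.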

So Far
- Slow-Start ramps up exponentially
- Followed by AIMD sawtooth pattern

Reality (TCP Reno)
- Introduce new variable: threshold
- threshold initially very large
- Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
- Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start, with CongWin = 1, after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

- When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
- When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                    The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
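The event/action table translates almost line for line into a small state machine. The sketch below is a simplified model: the class name `RenoSender` is hypothetical, CongWin is tracked as a float in units of MSS, and duplicate-ACK counting is left out because it does not change CongWin or Threshold.

```python
class RenoSender:
    """Simplified model of the sender's congestion-control state."""

    def __init__(self, mss=1.0, threshold=float("inf")):
        self.mss = mss
        self.cong_win = mss          # connection starts with CongWin = 1 MSS
        self.threshold = threshold   # initially very large
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_triple_dup_ack(self):
        """Loss detected by triple duplicate ACK (fast recovery)."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
        self.state = "CA"

    def on_timeout(self):
        """Loss detected by timeout: back to slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
```

A Tahoe-style sender could be approximated in this model by having `on_triple_dup_ack` call `on_timeout` instead.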

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
- Ignore slow start.
- Let W be the window size when loss occurs.
- When window is W, throughput is W/RTT.
- Just after loss, window drops to W/2, throughput to W/(2·RTT).
- Average throughput: 0.75 W/RTT
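The 0.75 factor follows from averaging the two endpoints, assuming the window grows linearly from W/2 back to W between loss events (the additive-increase phase). A short worked form:

```latex
\[
\text{average throughput}
  \approx \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
\]
```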

TCP Futures

- Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

- This gives L = 2·10^-10. Wow!
- New versions of TCP for high-speed needed!
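The two numbers in this example can be checked with a few lines of Python. The helper names are hypothetical; the loss rate is obtained by solving throughput = 1.22·MSS/(RTT·√L) for L, with MSS converted to bits.

```python
def required_window_segments(target_bps, mss_bytes, rtt_seconds):
    """Segments that must be in flight to sustain target_bps."""
    return target_bps * rtt_seconds / (mss_bytes * 8)


def required_loss_rate(target_bps, mss_bytes, rtt_seconds):
    """Loss rate L that yields target_bps under the 1.22*MSS/(RTT*sqrt(L)) model."""
    return (1.22 * mss_bytes * 8 / (rtt_seconds * target_bps)) ** 2
```

For 1500-byte segments, a 100 ms RTT and a 10 Gbps target, the first function gives roughly 83,333 segments and the second a loss rate of about 2.1·10^-10, in line with the rounded values on the slide.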

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
- Additive increase gives slope of 1, as throughput increases.
- Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 1 throughput (x-axis, 0 to R) versus Connection 2 throughput (y-axis, 0 to R), with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]

Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP:
  - do not want rate throttled by congestion control
- Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts.
- Web browsers do this.
- Example: link of rate R supporting 9 connections:
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
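The parallel-connection example reduces to simple arithmetic if each TCP connection is assumed to receive an equal share of the bottleneck link (the idealised fairness goal). The function `app_share` below is a hypothetical helper that returns the fraction of R an application obtains.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R won by an app opening new_connections,
    assuming every connection gets an equal share of the link."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)
```

With 9 existing connections, one new connection gets 1/10 of R, while 11 new connections get 11/20 of R, slightly more than half.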

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases (K is the number of windows that cover the object):

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent.
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent.
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
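Both cases can be folded into one function. This is a sketch under stated assumptions: `fixed_window_latency` is a hypothetical name, K is computed as the number of W-segment windows needed to cover the object (O/(W·S), rounded up), and the stall term is applied only when it is positive.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec),
    RTT: round-trip time (sec), W: window size (segments).
    """
    K = math.ceil(O / (W * S))           # windows that cover the object
    stall = S / R + RTT - W * S / R      # idle time per window, if positive
    if stall <= 0:
        # Case 1: ACKs return before the window is exhausted; no stalling.
        return 2 * RTT + O / R
    # Case 2: server stalls after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * stall
```

For example, with S = 8000 bits, R = 80 kbps, RTT = 0.2 s and a 16-segment object, W = 4 falls in case 1 (about 2.0 s), while W = 2 falls in case 2 with K = 8 (about 2.7 s).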

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]
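The delay components can be added up directly and compared with the closed-form expression Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R. In this sketch the function names are hypothetical, windows are assumed to hold 1, 2, 4, ... segments, and the idle time after the kth window is taken to be S/R + RTT - 2^(k-1)·S/R when that quantity is positive and zero otherwise.

```python
def slow_start_latency(O, S, R, RTT):
    """Sum the delay components; returns (latency, K, P)."""
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K, covered = 0, 0
    while covered < O:
        covered += 2 ** K * S
        K += 1
    # Idle time after each of the first K-1 windows (no idling after the last).
    idle = [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)]
    P = sum(1 for t in idle if t > 0)    # number of times the server idles
    return 2 * RTT + O / R + sum(idle), K, P


def closed_form_latency(O, S, R, RTT, P):
    """Closed-form latency for a given number of idle periods P."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
```

Choosing S = 8000 bits, R = 80 kbps and RTT = 0.2 s (so that S/R < RTT ≤ 3S/R) with O/S = 15 reproduces the example: K = 4, P = 2, and both functions give a latency of about 2.2 s.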

TCP Latency Modeling (3)

$S/R + RTT$ = time from when the server starts to send a segment until the server receives the acknowledgement

$2^{k-1}\,S/R$ = time to transmit the $k$th window

$\left[S/R + RTT - 2^{k-1}\,S/R\right]^{+}$ = idle time after the $k$th window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timeline of slow start: connection setup and object request, then windows taking S/R, 2S/R, 4S/R and 8S/R to transmit, until the object is delivered. Axes: time at client, time at server.]
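The delay expression on this slide can be checked numerically. The sketch below is only an illustration: the function and parameter names are made up for this example, and K and P are supplied by the caller rather than derived. It sums the idle time [S/R + RTT - 2^(k-1) S/R]^+ over the K-1 gaps between windows and compares the result with the closed form that uses P.

```python
def idle_after_window(k, S, R, rtt):
    """Idle time after the k-th window: [S/R + RTT - 2^(k-1) * S/R]^+ ."""
    return max(0.0, S / R + rtt - 2 ** (k - 1) * S / R)


def latency_by_summation(O, S, R, rtt, K):
    """2 RTT + O/R + the idle time in each of the K-1 gaps between windows."""
    idle = sum(idle_after_window(k, S, R, rtt) for k in range(1, K))
    return 2 * rtt + O / R + idle


def latency_closed_form(O, S, R, rtt, P):
    """O/R + 2 RTT + P (RTT + S/R) - (2^P - 1) S/R."""
    return O / R + 2 * rtt + P * (rtt + S / R) - (2 ** P - 1) * S / R


# Slide example: O/S = 15 segments, K = 4, Q = 2, P = 2.
# S/R = 1 s and RTT = 2 s are assumed values, chosen so that the server
# idles after the first two windows only.
S, R, rtt = 1000, 1000, 2.0
O = 15 * S
by_sum = latency_by_summation(O, S, R, rtt, K=4)
closed = latency_closed_form(O, S, R, rtt, P=2)
print(by_sum, closed)
```

With these assumed values the two functions agree, which is what the last step of the derivation claims.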

TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.
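As a quick sanity check of the derivation of K, the sketch below computes K twice: once from the closed form and once by counting windows of 1, 2, 4, ... segments until the object is covered. The function names and the segment size of 536 bytes are assumptions made for the example only.

```python
import math


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_counting(O, S):
    """Smallest k with 2^0 S + 2^1 S + ... + 2^(k-1) S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S
        k += 1
    return k


S = 536  # assumed segment size; any positive value works
for segments in (1, 2, 3, 15, 16):
    O = segments * S
    assert windows_closed_form(O, S) == windows_by_counting(O, S)
```

The floating-point logarithm is adequate for small examples like these; for very large objects the counting version is the safer of the two.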

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
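The three response-time formulas can be compared side by side with a few lines of code. This is a sketch under stated assumptions: the function names are invented for the example, and the slow-start idle times are passed in as a single `idle` argument (defaulting to zero) rather than modeled.

```python
def non_persistent(M, O, R, rtt, idle=0.0):
    """M+1 TCP connections in series: (M+1)O/R + (M+1)2RTT + idle."""
    return (M + 1) * O / R + (M + 1) * 2 * rtt + idle


def persistent(M, O, R, rtt, idle=0.0):
    """One connection: (M+1)O/R + 3RTT + idle."""
    return (M + 1) * O / R + 3 * rtt + idle


def parallel_non_persistent(M, X, O, R, rtt, idle=0.0):
    """X parallel connections: (M+1)O/R + (M/X + 1)2RTT + idle."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * rtt + idle


# Assumed round numbers: O/R = 1 s, RTT = 0.5 s, M = 10 images, X = 5.
M, X, O, R, rtt = 10, 5, 1000, 1000, 0.5
results = (
    non_persistent(M, O, R, rtt),
    persistent(M, O, R, rtt),
    parallel_non_persistent(M, X, O, R, rtt),
)
print(results)
```

Because the transmission term (M+1)O/R is common to all three, the variants differ only in how many round trips they pay for.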

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time. Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.

Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                    • Chapter 3: Transport Layer (last revised 16/03/05)
                                    • Chapter 3 outline
                                    • Transport services and protocols
                                    • Transport vs network layer
                                    • Transport-layer protocols
                                    • Chapter 3 outline
                                    • Multiplexing/demultiplexing
                                    • Multiplexing/demultiplexing
                                    • How demultiplexing works
                                    • Connectionless demultiplexing
                                    • Connectionless demux (cont)
                                    • Connection-oriented demux
                                    • Connection-oriented demux (cont)
                                    • Connection-oriented demux Threaded Web Server
                                    • Chapter 3 outline
                                    • UDP User Datagram Protocol [RFC 768]
                                    • UDP more
                                    • UDP checksum
                                    • Chapter 3 outline
                                    • Principles of Reliable data transfer
                                    • Reliable data transfer getting started
                                    • Reliable data transfer getting started
                                    • Incremental Improvements
                                    • rdt1.0: reliable transfer over a reliable channel
                                    • rdt2.0: channel with bit errors
                                    • rdt2.0: FSM specification
                                    • rdt2.0: operation with no errors
                                    • rdt2.0: error scenario
                                    • rdt2.0 has a fatal flaw!
                                    • rdt2.1: sender, handles garbled ACK/NAKs
                                    • rdt2.1: receiver, handles garbled ACK/NAKs
                                    • rdt2.1: discussion
                                    • rdt2.2: a NAK-free protocol
                                    • rdt2.2: sender, receiver fragments
                                    • rdt3.0: channels with errors and loss
                                    • rdt3.0 sender
                                    • rdt3.0 in action
                                    • rdt3.0 in action
                                    • Performance of rdt3.0
                                    • rdt3.0: stop-and-wait operation
                                    • Pipelined protocols
                                    • Pipelined protocols
                                    • Pipelining increased utilization
                                    • Go-Back-N
                                    • GBN Sender
                                    • GBN sender extended FSM
                                    • GBN receiver extended FSM
                                    • More on receiver
                                    • GBN in action
                                    • Selective Repeat
                                    • Selective repeat sender receiver windows
                                    • Selective repeat
                                    • Selective repeat in action
                                    • Selective repeat dilemma
                                    • Chapter 3 outline
                                    • TCP Overview RFCs 793 1122 1323 2018 2581
                                    • More TCP Details
                                    • Even More TCP Details
                                    • TCP segment structure
                                    • TCP seq. #'s and ACKs
                                    • TCP Round Trip Time and Timeout
                                    • TCP Round Trip Time and Timeout
                                    • Example RTT estimation
                                    • TCP Round Trip Time and Timeout
                                    • Chapter 3 outline
                                    • TCP reliable data transfer
                                    • TCP sender events
                                    • TCP sender(simplified)
                                    • TCP retransmission scenarios
                                    • TCP retransmission scenarios (more)
                                    • TCP ACK generation [RFC 1122 RFC 2581]
                                    • More on Sender Policies
                                    • Fast Retransmit
                                    • Fast retransmit algorithm
                                    • TCP GBN or Selective Repeat
                                    • Chapter 3 outline
                                    • TCP Flow Control
                                    • TCP Flow Control
                                    • TCP segment structure
                                    • TCP Flow control how it works
                                    • Technical Issue
                                    • Chapter 3 outline
                                    • TCP Connection Management
                                    • TCP Connection Management (cont)
                                    • TCP Connection Management (cont)
                                    • TCP Connection Management (cont)
                                    • TCP Connection Management (cont)
                                    • A few special cases
                                    • Chapter 3 outline
                                    • Principles of Congestion Control
                                    • Causes/costs of congestion: scenario 1
                                    • Causes/costs of congestion: scenario 2
                                    • Causes/costs of congestion: scenario 3
                                    • Causes/costs of congestion: scenario 3
                                    • Approaches towards congestion control
                                    • Case study ATM ABR congestion control
                                    • Case study ATM ABR congestion control
                                    • Chapter 3 outline
                                    • TCP Congestion Control
                                    • TCP AIMD
                                    • TCP Slow Start
                                    • TCP Slow Start (more)
                                    • Summary TCP Congestion Control
                                    • The Big Picture
                                    • TCP sender congestion control
                                    • TCP throughput
                                    • TCP Futures
                                    • TCP Fairness
                                    • Why is TCP fair
                                    • Fairness (more)
                                    • TCP Latency Modeling
                                    • Fixed Congestion Window (W)
                                    • Fixed congestion window (1)
                                    • Fixed congestion window (2)
                                    • TCP Latency Modeling Slow Start (1)
                                    • TCP Latency Modeling Slow Start (2)
                                    • TCP Latency Modeling (3)
                                    • TCP Latency Modeling (4)
                                    • HTTP Modeling
                                    • Chapter 3 Summary

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Principles of Reliable data transfer

• important in app., transport, link layers
• top-10 list of important networking topics!
• characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started

[Figure: rdt send side and receive side, connected by an unreliable channel.]

• rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started

We'll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
  - but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

[Figure: FSM notation. An arrow from state 1 to state 2 is labeled with the event causing the state transition and the actions taken on the state transition.]

state: when in this "state", next state is uniquely determined by next event

Incremental Improvements

• rdt1.0: assumes every packet sent arrives, and no errors are introduced in transmission
• rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK
• rdt2.1: deals with corrupted ACKs/NAKs
• rdt2.2: like rdt2.1 but does not need NAKs
• rdt3.0: allows packets to be lost

rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
  - no bit errors
  - no loss of packets
• separate FSMs for sender, receiver:
  - sender sends data into underlying channel
  - receiver reads data from underlying channel

Sender FSM (event -> actions):
• State "Wait for call from above"
  - rdt_send(data) -> packet = make_pkt(data); udt_send(packet)

Receiver FSM (event -> actions):
• State "Wait for call from below"
  - rdt_rcv(packet) -> extract(packet, data); deliver_data(data)
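The rdt1.0 state machines are small enough to transcribe almost literally. The following is a minimal sketch, assuming a queue stands in for the perfectly reliable channel and a list stands in for the receiver's upper layer; neither of these is part of the slides.

```python
from collections import deque

channel = deque()   # assumed stand-in for the reliable channel: no errors, no loss
delivered = []      # assumed stand-in for the receiver's upper layer


def make_pkt(data):
    return {"data": data}


def udt_send(packet):
    channel.append(packet)


def rdt_send(data):
    """Sender, single state 'Wait for call from above'."""
    udt_send(make_pkt(data))


def rdt_rcv(packet):
    """Receiver, single state 'Wait for call from below'."""
    data = packet["data"]     # extract(packet, data)
    delivered.append(data)    # deliver_data(data)


for message in ["a", "b", "c"]:
    rdt_send(message)
while channel:
    rdt_rcv(channel.popleft())
```

Since the channel never misbehaves, neither side needs any feedback, which is why each FSM has exactly one state.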


rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
  - recall: UDP checksum to detect bit errors
• the question: how to recover from errors:
  - acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  - sender retransmits pkt on receipt of NAK
  - human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
  - error detection
  - receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification

Sender FSM (event -> actions):
• State "Wait for call from above"
  - rdt_send(data) -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt); next state "Wait for ACK or NAK"
• State "Wait for ACK or NAK"
  - rdt_rcv(rcvpkt) && isNAK(rcvpkt) -> udt_send(sndpkt); stay in this state
  - rdt_rcv(rcvpkt) && isACK(rcvpkt) -> Λ (no action); next state "Wait for call from above"

Receiver FSM (event -> actions):
• State "Wait for call from below"
  - rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> udt_send(NAK)
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt, data); deliver_data(data); udt_send(ACK)
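To make the ACK/NAK loop concrete, here is a small sketch of rdt2.0 in which the first few transmissions of a packet are corrupted. Several things are assumptions made for the example: the toy checksum (sum of bytes mod 256) replaces the UDP checksum, `flip_first_bit` is a hypothetical channel-error model, and the feedback channel is assumed error-free, which is exactly the assumption the later slides drop.

```python
def checksum(data: bytes) -> int:
    """Toy checksum (sum of bytes mod 256); a stand-in for the UDP checksum."""
    return sum(data) % 256


def make_pkt(data: bytes) -> dict:
    return {"data": data, "checksum": checksum(data)}


def corrupt(pkt: dict) -> bool:
    return checksum(pkt["data"]) != pkt["checksum"]


def flip_first_bit(pkt: dict) -> dict:
    """Hypothetical channel error: flip one bit of the (non-empty) payload."""
    damaged = bytes([pkt["data"][0] ^ 0x01]) + pkt["data"][1:]
    return {"data": damaged, "checksum": pkt["checksum"]}


def receiver(pkt: dict, delivered: list) -> str:
    """'Wait for call from below': NAK if corrupt, else deliver and ACK."""
    if corrupt(pkt):
        return "NAK"
    delivered.append(pkt["data"])
    return "ACK"


def send(data: bytes, errors: int, delivered: list) -> int:
    """Send one message; the first `errors` transmissions are corrupted.
    Returns the number of transmissions needed."""
    sndpkt = make_pkt(data)
    transmissions = 0
    while True:
        transmissions += 1
        on_wire = flip_first_bit(sndpkt) if transmissions <= errors else sndpkt
        if receiver(on_wire, delivered) == "ACK":  # feedback assumed error-free
            return transmissions
        # NAK: stay in "Wait for ACK or NAK" and retransmit sndpkt


delivered = []
attempts = send(b"hello", errors=2, delivered=delivered)
```

Two corrupted copies draw two NAKs, and the third transmission is delivered exactly once.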

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs, tracing the error-free path. The sender builds sndpkt = make_pkt(data, checksum), sends it and waits for ACK or NAK; the receiver finds the packet not corrupt, extracts and delivers the data and sends ACK; on the ACK the sender returns to "Wait for call from above".]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs, tracing the error path. The receiver finds the packet corrupt and sends NAK; on the NAK the sender retransmits sndpkt and stays in "Wait for ACK or NAK".]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
• retransmit, but this might cause retransmission of correctly received pkt! Receiver won't know about duplication!

Handling duplicates:
• sender adds sequence number (0,1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt
• duplicate packet is one with same sequence # as previous packet

stop and wait: sender sends one packet, then waits for receiver response

Sender: whenever sender receives a control message, it sends a packet to receiver
• a valid ACK: sends next packet (if one exists) with new sequence #
• a NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
• if received packet is corrupt: send NAK
• if received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
• if received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #
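The sender and receiver rules above can be exercised with a short simulation. This is a sketch only: the class and function names are invented, data packets are never corrupted here, and a set of transmission indices (`garbled_feedback`) is an assumed way of scripting which ACKs arrive garbled at the sender.

```python
class Receiver:
    def __init__(self):
        self.expected = 0      # sequence # of the next new packet
        self.delivered = []

    def on_packet(self, seq, data, is_corrupt):
        if is_corrupt:
            return "NAK"
        if seq == self.expected:               # new data: deliver it up
            self.delivered.append(data)
            self.expected = 1 - self.expected
        # otherwise a duplicate: ACK again but do not deliver
        return "ACK"


def transfer(messages, garbled_feedback):
    """Stop-and-wait sender with a 1-bit sequence number.
    `garbled_feedback` holds the 0-based indices (counted over all
    transmissions) whose ACK/NAK arrives corrupted at the sender."""
    rcv = Receiver()
    seq, sent = 0, 0
    for data in messages:
        while True:
            feedback = rcv.on_packet(seq, data, is_corrupt=False)
            garbled = sent in garbled_feedback
            sent += 1
            if feedback == "ACK" and not garbled:
                break                          # valid ACK: move to next packet
            # NAK or garbled response: resend the same packet
        seq = 1 - seq
    return rcv.delivered, sent


delivered, sent = transfer(["a", "b", "c"], garbled_feedback={1, 2})
```

Packet "b" is transmitted three times because its first two ACKs are garbled, yet the receiver delivers it once: the repeated sequence number marks the later copies as duplicates.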

rdt2.1: sender, handles garbled ACK/NAKs

Sender FSM (event -> actions):
• State "Wait for call 0 from above"
  - rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); next state "Wait for ACK or NAK 0"
• State "Wait for ACK or NAK 0"
  - rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) -> udt_send(sndpkt); stay in this state
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> Λ; next state "Wait for call 1 from above"
• State "Wait for call 1 from above"
  - rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); next state "Wait for ACK or NAK 1"
• State "Wait for ACK or NAK 1"
  - rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) -> udt_send(sndpkt); stay in this state
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> Λ; next state "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

Receiver FSM (event -> actions):
• State "Wait for 0 from below"
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); next state "Wait for 1 from below"
  - rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
• State "Wait for 1 from below"
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); next state "Wait for 0 from below"
  - rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  - state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is duplicate
  - state indicates whether 0 or 1 is expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
  - receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s were included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt
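A sketch of the NAK-free idea: the receiver always answers with the sequence number of the last packet it received correctly, and the sender treats an ACK for the wrong number as it would a NAK. The names are illustrative, and starting `last_ok` at 1 (so that a corrupted first packet draws ACK 1) is an assumption of this example.

```python
class NakFreeReceiver:
    def __init__(self):
        self.expected = 0
        self.last_ok = 1       # assumed initial value: ACK sent before any good pkt
        self.delivered = []

    def on_packet(self, seq, data, is_corrupt):
        """Return the sequence # carried by the ACK."""
        if not is_corrupt and seq == self.expected:
            self.delivered.append(data)
            self.last_ok = seq
            self.expected = 1 - seq
        return self.last_ok    # corrupt or duplicate: re-ACK last good pkt


def send_one(rcv, seq, data, corrupt_first):
    """Send until an ACK carrying `seq` arrives; the first `corrupt_first`
    transmissions are corrupted. Returns the transmissions used."""
    transmissions = 0
    while True:
        is_corrupt = transmissions < corrupt_first
        transmissions += 1
        if rcv.on_packet(seq, data, is_corrupt) == seq:
            return transmissions
        # ACK for the other sequence #: same action as a NAK, retransmit


rcv = NakFreeReceiver()
first = send_one(rcv, 0, "a", corrupt_first=1)
second = send_one(rcv, 1, "b", corrupt_first=0)
```

Here the corrupted first copy of packet 0 is answered with ACK 1, which the sender reads as "packet 0 not received" and retransmits.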

rdt2.2: sender, receiver fragments

Sender FSM fragment (event -> actions):
• State "Wait for call 0 from above"
  - rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); next state "Wait for ACK 0"
• State "Wait for ACK 0"
  - rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) -> udt_send(sndpkt); stay in this state
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) -> Λ

Receiver FSM fragment (event -> actions):
• Transition into "Wait for 0 from below"
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
• State "Wait for 0 from below"
  - rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) -> udt_send(sndpkt); stay in this state

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until certain data or ACK lost, then retransmits
• yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
• if pkt (or ACK) just delayed (not lost):
  - retransmission will be duplicate, but use of seq. #'s already handles this
  - receiver must specify seq # of pkt being ACKed
• requires countdown timer
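The timeout approach can be illustrated without real timers by scripting the fate of each transmission. In the sketch below, which is an assumption-laden simplification rather than the slides' FSM, a lost packet and a lost ACK both appear to the sender as a timeout; the `script` list and the function name are invented for the example.

```python
def rdt3_transfer(messages, script):
    """Alternating-bit sender with a retransmission timer, event-driven.
    `script` lists the fate of each transmission in order:
    "ok", "lose_pkt" (data lost) or "lose_ack" (ACK lost).
    Transmissions beyond the end of the script succeed."""
    fates = iter(script)
    expected, delivered = 0, []      # receiver state
    seq, timeouts = 0, 0             # sender state
    for data in messages:
        while True:
            fate = next(fates, "ok")         # udt_send(sndpkt); start_timer
            if fate != "lose_pkt":           # packet reaches the receiver
                if seq == expected:
                    delivered.append(data)
                    expected = 1 - expected
                # a duplicate is re-ACKed but not delivered again
            if fate == "ok":                 # ACK for seq arrives: stop_timer
                break
            timeouts += 1                    # timeout: retransmit same packet
        seq = 1 - seq
    return delivered, timeouts


delivered, timeouts = rdt3_transfer(
    ["a", "b"], script=["lose_pkt", "ok", "lose_ack", "ok"])
```

The lost ACK for "b" causes a retransmission that the receiver recognizes as a duplicate by its sequence number, so each message is still delivered exactly once.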

rdt3.0 sender

Sender FSM (event -> actions):
• State "Wait for call 0 from above"
  - rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; next state "Wait for ACK0"
  - rdt_rcv(rcvpkt) -> Λ
• State "Wait for ACK0"
  - rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) -> Λ
  - timeout -> udt_send(sndpkt); start_timer
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) -> stop_timer; next state "Wait for call 1 from above"
• State "Wait for call 1 from above"
  - rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; next state "Wait for ACK1"
  - rdt_rcv(rcvpkt) -> Λ
• State "Wait for ACK1"
  - rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)) -> Λ
  - timeout -> udt_send(sndpkt); start_timer
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1) -> stop_timer; next state "Wait for call 0 from above"

rdt3.0 in action

[Figure: rdt3.0 in action, sender/receiver timing diagrams spread over two slides.]

                                      3 Transport Layer 40Comp 361 Spring 2005

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1 KB packet

\[ T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec} \]

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \]

U_sender: utilization, the fraction of time the sender is busy sending
1 KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline. First packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, receiver sends ACK; ACK arrives, sender sends next packet at t = RTT + L/R.]
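The arithmetic above can be checked with a few lines of Python. This is only a sketch using the slide's numbers (8000-bit packet, 1 Gbps, RTT = 30 ms); the function name `sender_utilization` is an illustrative choice, and ACK transmission time is ignored, as on the slide.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time a stop-and-wait sender spends transmitting."""
    t_transmit = packet_bits / rate_bps          # L / R
    return t_transmit / (rtt_s + t_transmit)     # (L/R) / (RTT + L/R)


L = 8000        # 1 KB packet, counted as 8 kb as on the slide
R = 1e9         # 1 Gbps
RTT = 0.030     # 2 x 15 ms propagation delay

u = sender_utilization(L, R, RTT)
throughput_kBps = L / (RTT + L / R) / 8 / 1000

print(f"T_transmit = {L / R * 1e6:.0f} microsec")
print(f"U_sender   = {u:.5f}")
print(f"throughput = {throughput_kBps:.1f} kB/sec")
```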


Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• Selective Repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets sent back-to-back. First packet bit transmitted at t = 0; last bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L/R.]

\[ U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008 \]

Increases utilization by a factor of 3!
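The factor of 3 generalizes. As a sketch under the same assumptions as the slide (ACK transmission time negligible, no loss), and writing N for the number of packets sent back-to-back per round trip (a symbol introduced here for illustration), the utilization and the window that would keep this 1 Gbps link fully busy are:

\[ U_{sender} = \min\left(1,\ \frac{N \cdot L/R}{RTT + L/R}\right), \qquad U_{sender} = 1 \iff N \ge 1 + \frac{RTT}{L/R} = 1 + \frac{30\ \text{ms}}{0.008\ \text{ms}} = 3751 \]

With N = 3 this reproduces 0.024 / 30.008 = 0.0008.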


Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed

  ACK(n): ACKs all pkts up to, and including, seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)

  Only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol

GBN: Sender

rdt_send() called: checks whether the window is full.
  No: send out the packet.
  Yes: return the data to the application level.

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates the window accordingly and restarts the timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.

This is the only event that triggers a resend.

GBN: sender extended FSM

Single state: Wait

  initially
    base=1
    nextseqnum=1

  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)

  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer

  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ
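The extended FSM maps almost line for line onto a small Python class. The following is a hedged sketch: the names `GBNSender`, `on_ack`, and the `udt_send` callback are illustrative, sequence numbers are unbounded integers (no k-bit wraparound), the timer is just a boolean flag, and `max()` is added so that a stale ACK cannot move `base` backwards, which the FSM itself does not spell out.

```python
class GBNSender:
    """Go-Back-N sender with a window of N packets and a single timer."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send      # hypothetical unreliable-channel callback
        self.base = 1                 # oldest unacknowledged sequence number
        self.nextseqnum = 1           # next sequence number to use
        self.sndpkt = {}              # seq -> packet, kept until acknowledged
        self.timer_running = False

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False              # window full: refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True # start_timer for the first packet in flight
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is received.
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)
        # stop_timer if nothing is outstanding, otherwise (re)start it
        self.timer_running = self.base != self.nextseqnum

    def timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])   # go back N: resend the whole window


sent = []
gbn = GBNSender(3, sent.append)
accepted = [gbn.rdt_send(d) for d in "abcd"]  # the fourth call is refused
gbn.on_ack(1)                                 # window slides to [2, 4]
gbn.timeout()                                 # packets 2 and 3 are resent
```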

GBN: receiver extended FSM

Single state: Wait

  initially
    expectedseqnum=1
    sndpkt = make_pkt(0,ACK,chksum)

  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++

  default
    udt_send(sndpkt)

If the expected packet is received:
  send ACK and deliver the packet upstairs

If an out-of-order packet is received:
  discard it (don't buffer) -> no receiver buffering!
  re-ACK the pkt with the highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  Need only remember expectedseqnum
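The receiver side is small enough to sketch in full. This is an illustrative Python version, assuming hypothetical `deliver_data` and `udt_send` callbacks and a `corrupt` flag in place of a real checksum; an ACK is represented simply by its sequence number, with 0 meaning "nothing received yet", as in the FSM's initial `make_pkt(0,ACK,chksum)`.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, only expectedseqnum is remembered."""

    def __init__(self, deliver_data, udt_send):
        self.deliver_data = deliver_data   # hypothetical upcall to the application
        self.udt_send = udt_send           # hypothetical channel for ACKs
        self.expectedseqnum = 1
        self.last_ack = 0                  # highest in-order seq # received so far

    def rdt_rcv(self, seqnum, data, corrupt=False):
        if not corrupt and seqnum == self.expectedseqnum:
            self.deliver_data(data)        # in order: pass upstairs
            self.last_ack = seqnum
            self.expectedseqnum += 1
        # In every case (including the default branch) ACK the highest
        # in-order packet; out-of-order or corrupt packets are discarded.
        self.udt_send(self.last_ack)


delivered, acks = [], []
rcv = GBNReceiver(delivered.append, acks.append)
rcv.rdt_rcv(1, "a")
rcv.rdt_rcv(3, "c")   # packet 2 is missing: discard 3, send duplicate ACK 1
rcv.rdt_rcv(2, "b")
```

Packet 3 has to be retransmitted by the sender even though it arrived intact, which is the cost of keeping the receiver this simple.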

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.


Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender

  data from above:
    if next available seq # in window, send pkt

  timeout(n):
    resend pkt n, restart timer

  ACK(n) in [sendbase,sendbase+N]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

  pkt n in [rcvbase, rcvbase+N-1]
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

  pkt n in [rcvbase-N,rcvbase-1]
    ACK(n) (note: this is a reACK)

  otherwise:
    ignore
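The receiver rules above can be sketched as follows. This is an illustration under stated assumptions: the class name `SRReceiver` and the `deliver_data` / `send_ack` callbacks are hypothetical, sequence numbers are unbounded integers starting at 0 (no wraparound), and corruption checks are omitted.

```python
class SRReceiver:
    """Selective Repeat receiver: ACK each packet, buffer out-of-order ones."""

    def __init__(self, window_size, deliver_data, send_ack):
        self.N = window_size
        self.deliver_data = deliver_data   # hypothetical upcall to the application
        self.send_ack = send_ack           # hypothetical channel for ACKs
        self.rcvbase = 0                   # next in-order seq # not yet delivered
        self.buffer = {}                   # seq -> data for out-of-order packets

    def on_packet(self, n, data):
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.send_ack(n)               # individual ACK
            self.buffer[n] = data
            # Deliver every consecutive buffered packet and advance the window.
            while self.rcvbase in self.buffer:
                self.deliver_data(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
        elif self.rcvbase - self.N <= n < self.rcvbase:
            self.send_ack(n)               # already delivered: reACK
        # otherwise: ignore


delivered, acks = [], []
sr = SRReceiver(4, delivered.append, acks.append)
for seq in (0, 2, 3, 1, 0):                # 1 arrives late, 0 is a duplicate
    sr.on_packet(seq, f"pkt{seq}")
```

The reACK branch matters: if the original ACK was lost, the sender keeps retransmitting that packet until it hears an ACK, so staying silent would stall the sender's window.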


                                      Selective repeat in action

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
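The slide leaves the question open; the standard answer, stated here as a known result and not taken from the slide, is that the window size must be at most half the sequence-number space. A small sketch makes the dilemma concrete by checking the worst case, in which the receiver has accepted a full window but every ACK was lost. The function names are illustrative.

```python
def window(base, size, seq_space):
    """Set of sequence numbers covered by a window starting at base."""
    return {(base + i) % seq_space for i in range(size)}


def is_ambiguous(seq_space, window_size):
    """True if a retransmitted old packet can look like a new one.

    Worst case: the receiver got a whole window and advanced by window_size,
    while the sender (all ACKs lost) still retransmits from its old window.
    """
    sender_old = window(0, window_size, seq_space)
    receiver_new = window(window_size, window_size, seq_space)
    return bool(sender_old & receiver_new)


print(is_ambiguous(4, 3))   # slide's example: seq #'s 0..3, window size 3
print(is_ambiguous(4, 2))   # window size 2 with the same seq #'s
```

In the slide's example the receiver's new window is {3, 0, 1}, so a retransmitted packet 0 is accepted as new data. With a window of 2, the old and new windows are disjoint.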


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment
    This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server,
  (i) buffers,
  (ii) variables, and
  (iii) a socket connection to the process

TCP only exists in the two end machines
  No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Header layout (each row is 32 bits wide):
  source port #                | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum                     | Urg data pointer
  Options (variable length)
  application data (variable length)

Annotations:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments!)

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'    (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'    (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80                (host ACKs receipt of echoed 'C')
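The numbers in the telnet scenario follow from one rule: the ACK field carries the sequence number of the next byte expected, that is, the received segment's sequence number plus the length of its data. A minimal sketch (the helper `ack_for` and the dictionary layout are illustrative, not a real TCP API):

```python
def ack_for(seq, data):
    """ACK number = sequence number of the next byte expected."""
    return seq + len(data)


# Host A types 'C': A's next byte is 42, and A expects byte 79 from B.
seg1 = {"seq": 42, "ack": 79, "data": b"C"}

# Host B echoes 'C': it sends from the byte A asked for and ACKs A's byte.
seg2 = {"seq": seg1["ack"], "ack": ack_for(seg1["seq"], seg1["data"]), "data": b"C"}

# Host A ACKs the echo; this segment carries no data.
seg3 = {"seq": seg2["ack"], "ack": ack_for(seg2["seq"], seg2["data"]), "data": b""}

for seg in (seg1, seg2, seg3):
    print(seg)
```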

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

\[ EstimatedRTT = (1 - \alpha) \cdot EstimatedRTT + \alpha \cdot SampleRTT \]

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; two curves: SampleRTT and EstimatedRTT.]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

\[ DevRTT = (1 - \beta) \cdot DevRTT + \beta \cdot |SampleRTT - EstimatedRTT| \]

  (typically, β = 0.25)

Then set timeout interval:

\[ TimeoutInterval = EstimatedRTT + 4 \cdot DevRTT \]
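The three formulas combine into a small estimator. The sketch below is illustrative only: the class name `RttEstimator` is made up, and two details the slides do not specify are assumptions here, namely the initial values (EstimatedRTT set to the first sample, DevRTT set to 0) and the update order (EstimatedRTT first, then DevRTT using the new estimate).

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout, following the slide formulas."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with first sample
        self.dev_rtt = 0.0                  # assumption: no deviation seen yet

    def update(self, sample_rtt):
        # EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        # DevRTT = (1 - beta) * DevRTT + beta * |SampleRTT - EstimatedRTT|
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))

    @property
    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)      # milliseconds
est.update(200.0)                           # one unusually slow sample
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval)
```

A single 200 ms sample moves the estimate only from 100 ms to 112.5 ms, but it widens the safety margin considerably, which is the behaviour the slides describe.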


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
  Pipelined segments
  Cumulative acks
  TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum

  loop (forever) {
    switch(event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }

  } /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
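The three events of the simplified sender can be sketched as methods on a Python class. This is a hedged illustration, not a real TCP implementation: the names `SimpleTcpSender` and `send_to_ip` are hypothetical, the timer is a boolean flag, and, like the pseudocode, it ignores duplicate ACKs, flow control, and congestion control.

```python
class SimpleTcpSender:
    """Simplified TCP sender: cumulative ACKs and one retransmission timer."""

    def __init__(self, initial_seq_num, send_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.send_to_ip = send_to_ip   # hypothetical callback into the IP layer
        self.unacked = {}              # seq # of first byte -> segment data
        self.timer_running = False

    def on_app_data(self, data):
        self.unacked[self.next_seq_num] = data
        if not self.timer_running:
            self.timer_running = True                  # start timer
        self.send_to_ip((self.next_seq_num, data))
        self.next_seq_num += len(data)                 # seq #s count bytes

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)                    # oldest unacked segment
            self.send_to_ip((seq, self.unacked[seq]))
            self.timer_running = True                  # restart timer

    def on_ack(self, y):
        if y > self.send_base:                         # ACK covers new data
            self.send_base = y
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)    # run only if data outstanding


sent = []
tcp = SimpleTcpSender(92, sent.append)
tcp.on_app_data(b"A" * 8)      # Seq=92, 8 bytes
tcp.on_app_data(b"B" * 20)     # Seq=100, 20 bytes
tcp.on_ack(120)                # one cumulative ACK covers both segments
```

Unlike Go-Back-N, a timeout here resends only the single oldest unacknowledged segment.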

TCP: retransmission scenarios

lost ACK scenario (Host A sends, Host B receives; time runs downward)
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100 (lost, X)
  Seq=92 timeout at A
  A -> B: Seq=92, 8 bytes data (retransmission)
  B -> A: ACK=100
  SendBase = 100

premature timeout scenario (Host A sends, Host B receives; time runs downward)
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100
  B -> A: ACK=120
  Seq=92 timeout at A, before the ACKs arrive
  A -> B: Seq=92, 8 bytes data (retransmission)
  SendBase = 100, then SendBase = 120 as the two ACKs arrive
  B -> A: ACK=120
  SendBase = 120

TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A sends, Host B receives; time runs downward)
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100 (lost, X)
  B -> A: ACK=120 (arrives before the timeout)
  SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Arrival of in-order segment with expected seq #. One other segment has ACK pending.
   -> Immediately send single cumulative ACK, ACKing both in-order segments.

3. Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   -> Immediately send duplicate ACK, indicating seq # of next expected byte.

4. Arrival of segment that partially or completely fills gap.
   -> Immediately send ACK, provided that segment starts at lower end of gap.

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after the retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
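The doubling policy can be written as a single update rule. This is a sketch with made-up names (`next_timeout_interval` and the event labels `"timeout"`, `"new_data"`, `"ack"`); real implementations typically also cap the interval, which the slide does not discuss and this sketch omits.

```python
def next_timeout_interval(current, event, estimated_rtt, dev_rtt):
    """Return the timeout interval to use when the timer is next started."""
    if event == "timeout":
        return 2 * current                  # back off: double on every timeout
    if event in ("new_data", "ack"):
        return estimated_rtt + 4 * dev_rtt  # reset from the RTT estimators
    raise ValueError(f"unknown event: {event}")


interval = 1.0                              # seconds, illustrative starting value
history = []
for event in ("timeout", "timeout", "timeout", "ack"):
    interval = next_timeout_interval(interval, event, estimated_rtt=0.2, dev_rtt=0.05)
    history.append(interval)
print(history)
```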

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
      increment count of dup ACKs received for y
      if (count of dup ACKs received for y = 3) {
        resend segment with sequence number y    /* fast retransmit */
      }
    }
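The duplicate-ACK branch can be sketched in isolation. In this illustration the class name `FastRetransmitSender` and the `resend` callback are hypothetical, the per-value counter follows the pseudocode's "count of dup ACKs received for y", and timer handling is left out to keep the focus on the counting.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK."""

    def __init__(self, send_base, resend):
        self.send_base = send_base
        self.resend = resend            # hypothetical callback: resend(seq)
        self.dup_acks = Counter()       # ACK value y -> duplicates seen

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y          # new data acknowledged
        else:
            self.dup_acks[y] += 1       # duplicate ACK for already ACKed data
            if self.dup_acks[y] == 3:
                self.resend(y)          # fast retransmit, before the timer expires


resent = []
frs = FastRetransmitSender(send_base=100, resend=resent.append)
for _ in range(4):
    frs.on_ack(100)                     # receiver keeps asking for byte 100
frs.on_ack(120)                         # the gap is filled
```

Testing for equality with 3, as the pseudocode does, means the fourth duplicate does not trigger a second retransmission.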

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  This looks a lot like Selective Repeat

TCP is a hybrid


TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate the receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
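Both sides of the rule fit in a few lines. This sketch uses the slide's receiver variables; the sender-side names `last_byte_sent` and `last_byte_acked` do not appear on the slide and are introduced here as assumptions to express "unACKed data".

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def may_send(nbytes, last_byte_sent, last_byte_acked, advertised_window):
    """Sender check: unACKed data after sending nbytes must fit in RcvWindow."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= advertised_window


# Receiver: 4096-byte buffer, 3000 bytes received, application has read 1000.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender: 1000 bytes already in flight.
print(window)
print(may_send(1000, last_byte_sent=5000, last_byte_acked=4000, advertised_window=window))
print(may_send(1200, last_byte_sent=5000, last_byte_acked=4000, advertised_window=window))
```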

Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender

Note on UDP

• UDP has no flow control
• UDP appends packets to receiving socket's buffer. If buffer is full then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
• initialize TCP variables:
    • seq. #s
    • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• Replies with client_isn+1 in ACK field to signal synchronization
• Specifies server_isn
• No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
• Allocates buffers
• Can include application data
• SYN=0 signals that connection established
• server_isn+1 signals that client is synchronized

[Figure: three-way handshake between client and server]
• client to server: Connection request (SYN=1, seq=client_isn)
• server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
• client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
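The three segments of the handshake can be written out as plain records. This is only a sketch: the function name and the dictionary layout are hypothetical, and the initial sequence numbers in the usage lines are arbitrary. The modulo reflects that TCP sequence numbers are 32-bit values.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers wrap around at 32 bits


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK control segments as dictionaries."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"from": "client", "SYN": 0,
           "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (server_isn + 1) % SEQ_SPACE}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```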

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close timeline. client sends FIN (close); server replies with ACK, then sends FIN (close); client replies with ACK and enters timed wait; closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
• Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
• Closes down after timed-wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: connection close timeline. client: closing, FIN sent, ACK received, FIN received, ACK sent, timed wait, closed; server: ACK sent, closing, FIN sent, ACK received, closed]
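The client side of the four closing steps can be sketched as a small transition table. The state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) come from the TCP specification rather than from these slides, and the event strings and the `run` helper are hypothetical names for this example.

```python
# (current state, event) -> next state, client side only.
CLIENT_CLOSE = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",    # app closes socket, client sends FIN
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",  # server ACKs the client's FIN
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",   # server's FIN arrives, client sends ACK
    ("TIME_WAIT", "recv FIN"): "TIME_WAIT",    # retransmitted FIN: ACK again, keep waiting
    ("TIME_WAIT", "timer expires"): "CLOSED",  # closes down after timed wait
}


def run(events, state="ESTABLISHED"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = CLIENT_CLOSE[(state, event)]
    return state


print(run(["close", "recv ACK", "recv FIN", "timer expires"]))  # CLOSED
```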

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                      A few special cases

                                      Have not discussed what happens if both client and server decide to close down connection at same time

                                      It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control
• manifestations:
    • lost packets (buffer overflow at routers)
    • long delays (queuing in router buffers)
• a top-10 problem

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• Send rate: 0 to C/2

• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)

(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

(Here $\lambda'_{in}$ is the offered load including retransmissions.)

"costs" of congestion:
• (b) and (c): more work (retrans) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots labelled (a), (b), (c)]

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when packet dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
    • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded":
    • sender should use available bandwidth
• if sender's path congested:
    • sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
    • NI bit: no increase in rate (mild congestion)
    • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
    • small exception: see next page

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
    • congested switch may lower ER value in cell
    • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
    • Signals congestion
    • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion

[Figure: Congwin, the window of segments in flight]

• w segments, each with MSS bytes, sent in one RTT:

    throughput = (w * MSS) / RTT  Bytes/sec

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

    LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

three mechanisms:
• AIMD = Additive Increase Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
• also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]

TCP Slow Start

• When connection begins, CongWin = 1 MSS
    • Example: MSS = 500 bytes & RTT = 200 msec
    • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
    • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
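The 20 kbps figure is the window throughput w * MSS / RTT with w = 1, converted to bits. A minimal check, with a hypothetical helper name and the RTT passed in milliseconds to keep the arithmetic exact:

```python
def window_rate_bps(window_segments, mss_bytes, rtt_ms):
    """Sending rate in bits/sec when `window_segments` segments go out per RTT."""
    return window_segments * mss_bytes * 8 * 1000 / rtt_ms


print(window_rate_bps(1, 500, 200))  # 20000.0 bits/sec, i.e. 20 kbps
```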

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
    • double CongWin every RTT
    • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. Host A sends one segment in the first RTT, two segments in the next, then four segments]
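A short sketch of why incrementing CongWin by one MSS per ACK doubles it every RTT. It assumes, for simplicity, that every segment sent in a round is ACKed within that round and that no loss occurs; the function name is hypothetical.

```python
def slow_start_rounds(mss, rounds):
    """Return CongWin (in bytes) at the start of each RTT during slow start."""
    congwin = mss
    history = [congwin]
    for _ in range(rounds):
        acks = congwin // mss      # one ACK per segment sent in this RTT
        for _ in range(acks):
            congwin += mss         # increment CongWin for every ACK received
        history.append(congwin)
    return history


print([w // 500 for w in slow_start_rounds(500, 4)])  # [1, 2, 4, 8, 16]
```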

So Far
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when reaches threshold and then switches to AIMD
• Two different types of loss events
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

Reason for treating 3 dup ACKs differently than timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase, window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.
• When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                      The Big Picture

TCP sender congestion control

Event | State | TCP Sender Action | Commentary
ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; If (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; Set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; Set state to "Slow Start" | Enter slow start
Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
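The event table maps fairly directly onto a small state holder. The sketch below is a simplified model, not a real TCP implementation: the class name `RenoSender` is hypothetical, windows are measured in units of one MSS, and the default threshold of 64 MSS is an arbitrary stand-in for "initially very large".

```python
from dataclasses import dataclass

MSS = 1.0  # work in units of one MSS


@dataclass
class RenoSender:
    congwin: float = MSS
    threshold: float = 64.0   # assumed stand-in for "initially very large"
    state: str = "SS"         # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += MSS * (MSS / self.congwin)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=8.0)
for _ in range(8):
    sender.on_new_ack()
print(sender.state, sender.congwin)  # CA 9.0
```

Feeding the same object three duplicate ACKs halves the window and keeps it in congestion avoidance, while a timeout sends it back to slow start with a window of 1 MSS.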

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
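The 0.75 factor is the mean of a window that climbs linearly from W/2 back to W between losses. A quick numerical check, assuming an idealized sawtooth (the function name and sample count are arbitrary choices for this sketch):

```python
def average_window(w_max, steps=10_000):
    """Average of a window rising linearly from w_max/2 to w_max."""
    low = w_max / 2
    total = 0.0
    for i in range(steps):
        # sample the ramp at the midpoint of each small interval
        total += low + (w_max - low) * (i + 0.5) / steps
    return total / steps


ratio = average_window(100.0) / 100.0
print(round(ratio, 6))  # 0.75
```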

TCP Futures

• Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

    $\text{throughput} = \dfrac{1.22 \cdot MSS}{RTT \sqrt{L}}$

• $L = 2 \cdot 10^{-10}$  Wow
• New versions of TCP for high-speed needed
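Both numbers on this slide can be reproduced from the example parameters. The sketch below solves throughput = 1.22 MSS / (RTT sqrt(L)) for L; the variable names are chosen for this example.

```python
MSS_BITS = 1500 * 8   # 1500-byte segments
RTT = 0.1             # 100 ms, in seconds
TARGET_BPS = 10e9     # 10 Gbps

# Window needed to keep the pipe full: one window per RTT.
window_segments = TARGET_BPS * RTT / MSS_BITS

# Rearranging the loss-rate formula: L = (1.22 * MSS / (RTT * throughput))^2
loss_rate = (1.22 * MSS_BITS / (RTT * TARGET_BPS)) ** 2

print(int(window_segments))  # 83333
print(loss_rate)             # roughly 2.1e-10
```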

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, up to R) versus Connection 1 throughput (horizontal axis, up to R), with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
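The convergence argument can be tried numerically. This is a toy model under stated assumptions: both sessions see the same RTT, both detect loss in the same round whenever their combined rate exceeds the link capacity, and rates are in arbitrary units. The function name is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # loss: both decrease window by factor of 2
            x1, x2 = x1 / 2, x2 / 2
        else:
            # congestion avoidance: both increase by 1 unit per round
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = aimd_two_flows(80.0, 10.0, capacity=100.0, rounds=2000)
print(abs(a - b) < 1e-6)  # True: the two rates have converged
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts the gap in half, so the gap shrinks toward zero over repeated loss events.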

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
    • do not want rate throttled by congestion control
• Instead use UDP:
    • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
    • new app asks for 1 TCP, gets rate R/10
    • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
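The shares in this example follow from assuming the link is split equally per connection. A small check with exact fractions (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(app_connections, other_connections):
    """Fraction of link rate R the app gets if every connection gets an equal share."""
    return Fraction(app_connections, app_connections + other_connections)


print(app_share(1, 9))    # 1/10
print(app_share(11, 9))   # 11/20, a little over half of R
```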

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

    Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

    Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

    (K is the number of windows that cover the object)

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

    latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

    latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
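Both fixed-window cases fit in one function if the per-window stall is clamped at zero. This sketch takes K as the number of windows that cover the object, computed here as ceil(O / (W S)); that expression and the example numbers are assumptions for illustration.

```python
import math


def fixed_window_latency(O, S, R, W, RTT):
    """Latency for an O-bit object with a fixed window of W segments of S bits."""
    K = math.ceil(O / (W * S))            # windows that cover the object
    stall = S / R + RTT - W * S / R       # server idle time after each window
    return 2 * RTT + O / R + (K - 1) * max(0.0, stall)


# O = 1 Mbit, S = 10 kbit, R = 1 Mbps, RTT = 100 ms (made-up values)
print(fixed_window_latency(1_000_000, 10_000, 1_000_000, W=20, RTT=0.1))  # case 1, about 1.2 s
print(fixed_window_latency(1_000_000, 10_000, 1_000_000, W=4, RTT=0.1))   # case 2, about 2.88 s
```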

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server. Client initiates TCP connection (one RTT), then requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; complete transmission, object delivered]

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timeline with time at client and time at server. Client initiates TCP connection (one RTT), then requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; complete transmission, object delivered]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.
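The slow-start latency model can be evaluated directly. In this sketch K and Q are found by simple loops rather than logarithms, which sidesteps floating-point rounding at exact powers of two; the function name and the example values (chosen so that O/S = 15 and Q = 2, as in the slide's example) are assumptions.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for an O-bit object under the slow-start model."""
    # K: smallest k such that the first k windows (1 + 2 + ... + 2^(k-1) segments) cover O
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would idle for an infinite object,
    # i.e. windows k with S/R + RTT - 2^(k-1) * S/R > 0
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# S/R = 1 s, RTT = 2 s, O/S = 15 segments (made-up values)
print(slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2))  # (4, 2, 2, 22.0)
```

Here the server idles 2 s after the first window and 1 s after the second, so the latency is 2 RTT (4 s) plus O/R (15 s) plus 3 s of idle time.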

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
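The three response-time formulas can be compared side by side. In this sketch the slow-start idle time is passed in as a parameter and left at zero in the usage lines, so the printed values are lower bounds rather than the full model; the function names, the 1 Mbps rate and the choice of 1 Kbyte = 1000 bytes are assumptions.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """M+1 TCP connections in series."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """One connection: 2 RTT for the base file, 1 RTT for the M images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """One connection for the base file, then M/X sets of X parallel connections."""
    return (M + 1) * O / R + (M / X + 1) * 2 * RTT + idle


O_BITS = 5 * 1000 * 8   # 5 Kbytes, taking 1 Kbyte = 1000 bytes
args = dict(O=O_BITS, R=1_000_000, RTT=0.1)
print(non_persistent(M=10, **args))                # about 2.64 s
print(persistent(M=10, **args))                    # about 0.74 s
print(parallel_non_persistent(M=10, X=5, **args))  # about 1.04 s
```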

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (axis 0 to 20 seconds) at link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (axis 0 to 70 s) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3: Summary
principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary

Principles of Reliable data transfer
• important in app., transport, link layers
• top-10 list of important networking topics!
• characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started

send side:
• rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

receive side:
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started
We'll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
  • but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

FSM notation: state1 --[event / actions]--> state2
• event: event causing state transition
• actions: actions taken on state transition
• state: when in this "state", next state uniquely determined by next event

Incremental Improvements

• rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission
• rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK
• rdt2.1: deals with corrupted ACKs/NAKs
• rdt2.2: like rdt2.1 but does not need NAKs
• rdt3.0: allows packets to be lost

rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
  • no bit errors
  • no loss of packets
• separate FSMs for sender, receiver:
  • sender sends data into underlying channel
  • receiver reads data from underlying channel

sender (single state "Wait for call from above"):
• rdt_send(data) / packet = make_pkt(data); udt_send(packet)

receiver (single state "Wait for call from below"):
• rdt_rcv(packet) / extract(packet, data); deliver_data(data)

rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
  • recall: UDP checksum to detect bit errors
• the question: how to recover from errors?
  • acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  • sender retransmits pkt on receipt of NAK
  • human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
  • error detection
  • receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification

sender:
• state "Wait for call from above"
  • rdt_send(data) / sndpkt = make_pkt(data, checksum); udt_send(sndpkt) -> "Wait for ACK or NAK"
• state "Wait for ACK or NAK"
  • rdt_rcv(rcvpkt) && isNAK(rcvpkt) / udt_send(sndpkt) -> "Wait for ACK or NAK"
  • rdt_rcv(rcvpkt) && isACK(rcvpkt) / Λ -> "Wait for call from above"

receiver:
• state "Wait for call from below"
  • rdt_rcv(rcvpkt) && corrupt(rcvpkt) / udt_send(NAK)
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) / extract(rcvpkt, data); deliver_data(data); udt_send(ACK)

(Λ: no action)
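The rdt2.0 exchange can be traced with a small simulation. This is a minimal sketch under stated assumptions: the byte-sum `checksum` is a stand-in (not the Internet checksum), `flip_first_bit` and `rdt20_transfer` are hypothetical helpers, the channel garbles exactly the transmissions listed in `corrupt_attempts`, and ACKs/NAKs are never corrupted (the rdt2.0 assumption).

```python
def checksum(data: bytes) -> int:
    # Stand-in checksum for illustration only: sum of bytes modulo 2**16.
    return sum(data) % 65536


def make_pkt(data: bytes):
    return (data, checksum(data))


def corrupt(pkt) -> bool:
    data, chk = pkt
    return checksum(data) != chk


def flip_first_bit(pkt):
    # Channel error: flip the lowest bit of the first data byte.
    data, chk = pkt
    return (bytes([data[0] ^ 0x01]) + data[1:], chk)


def rdt20_transfer(messages, corrupt_attempts=frozenset()):
    """Send non-empty byte strings with stop-and-wait ACK/NAK.

    corrupt_attempts holds the 0-based indices of transmissions the
    channel garbles. Returns (delivered data, number of transmissions).
    """
    delivered = []
    attempt = 0
    for data in messages:
        sndpkt = make_pkt(data)                  # sender: rdt_send(data)
        while True:
            garbled = attempt in corrupt_attempts
            rcvpkt = flip_first_bit(sndpkt) if garbled else sndpkt
            attempt += 1
            if corrupt(rcvpkt):                  # receiver: udt_send(NAK)
                continue                         # sender: NAK -> retransmit
            delivered.append(rcvpkt[0])          # receiver: deliver, send ACK
            break                                # sender: ACK -> next message
    return delivered, attempt


delivered, transmissions = rdt20_transfer([b"hello", b"world"], {0, 2})
print(delivered, transmissions)  # both messages delivered, 4 transmissions
```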

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs, repeated to trace the case where the packet arrives uncorrupted and the receiver replies with an ACK.]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs, repeated to trace the case where the packet arrives corrupted, the receiver replies with a NAK, and the sender retransmits.]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
• retransmit, but this might cause retransmission of correctly received pkt!
• Receiver won't know about duplication!

Handling duplicates:
• sender adds sequence number (0,1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt
• Duplicate packet is one with same sequence # as previous packet

stop and wait: Sender sends one packet, then waits for receiver response

Sender: whenever sender receives control message it sends a packet to receiver
• A valid ACK: sends next packet (if exists) with new sequence #
• A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
• If received packet is corrupt: send NAK
• If received packet is valid and has different sequence # from prev packet: send ACK and deliver new data up
• If received packet is valid and has same sequence # as prev packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #
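The sender and receiver rules above can be sketched as a short simulation. The function name `rdt21_transfer` and the scripted channel (sets of transmission indices whose data packet or whose ACK/NAK gets garbled) are assumptions made for this example; checksums are not computed, garbling is simply looked up in the script.

```python
def rdt21_transfer(messages, data_garbled=frozenset(),
                   feedback_garbled=frozenset()):
    """Stop-and-wait with a 1-bit sequence number and unnumbered ACK/NAK.

    data_garbled / feedback_garbled: 0-based transmission indices at which
    the data packet / the returning ACK or NAK is corrupted.
    Returns (data delivered to the upper layer, number of transmissions).
    """
    delivered = []
    expected = 0      # receiver state: sequence number it is waiting for
    seq = 0           # sender state: sequence number of the current packet
    attempt = 0
    for data in messages:
        while True:
            # Receiver side.
            if attempt in data_garbled:
                feedback = "NAK"              # corrupt packet: send NAK
            elif seq == expected:
                delivered.append(data)        # new data: deliver and ACK
                expected ^= 1
                feedback = "ACK"
            else:
                feedback = "ACK"              # duplicate: ACK, do not deliver
            # Sender side.
            feedback_ok = attempt not in feedback_garbled
            attempt += 1
            if feedback == "ACK" and feedback_ok:
                seq ^= 1                      # valid ACK: move to next packet
                break
            # NAK or garbled response: loop and resend the same packet.
    return delivered, attempt


delivered, transmissions = rdt21_transfer(
    ["a", "b", "c"], data_garbled={1}, feedback_garbled={0, 3})
print(delivered, transmissions)  # ['a', 'b', 'c'] after 6 transmissions
```

In this run the ACKs for transmissions 0 and 3 are garbled, so the sender resends packets the receiver already has; the sequence bit is what lets the receiver re-ACK them without delivering them twice.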

rdt2.1: sender, handles garbled ACK/NAKs

• state "Wait for call 0 from above"
  • rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> "Wait for ACK or NAK 0"
• state "Wait for ACK or NAK 0"
  • rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) / udt_send(sndpkt)
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> "Wait for call 1 from above"
• state "Wait for call 1 from above"
  • rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt) -> "Wait for ACK or NAK 1"
• state "Wait for ACK or NAK 1"
  • rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)) / udt_send(sndpkt)
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

• state "Wait for 0 from below"
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> "Wait for 1 from below"
  • rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
• state "Wait for 1 from below"
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> "Wait for 0 from below"
  • rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  • state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is duplicate
  • state indicates whether 0 or 1 is expected pkt seq #
• note: receiver cannot know if its last ACK/NAK received OK at sender

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
  • receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment:
• state "Wait for call 0 from above"
  • rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> "Wait for ACK 0"
• state "Wait for ACK 0"
  • rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) / udt_send(sndpkt)
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) / Λ

receiver FSM fragment:
• rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt) -> "Wait for 0 from below"
• state "Wait for 0 from below"
  • rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) / udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until certain data or ACK lost, then retransmits
• yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
• if pkt (or ACK) just delayed (not lost):
  • retransmission will be duplicate, but use of seq. #'s already handles this
  • receiver must specify seq # of pkt being ACKed
• requires countdown timer
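The timeout-driven approach can be sketched as follows. This is an illustrative simulation only: `rdt30_transfer`, `lost_data` and `lost_acks` are hypothetical names, loss is scripted by transmission index, a missing ACK is treated as a timer expiry, and bit errors and premature timeouts are not modelled.

```python
def rdt30_transfer(messages, lost_data=frozenset(), lost_acks=frozenset()):
    """Alternating-bit stop-and-wait over a channel that can lose packets.

    lost_data / lost_acks: 0-based transmission indices at which the data
    packet / the returning ACK is dropped by the channel.
    Returns (delivered data, number of transmissions, number of timeouts).
    """
    delivered = []
    expected = 0       # receiver: next sequence number to deliver
    seq = 0            # sender: sequence number of the current packet
    attempt = 0
    timeouts = 0
    for data in messages:
        while True:
            ack = None                         # sender starts its timer here
            if attempt not in lost_data:       # packet reaches the receiver
                if seq == expected:
                    delivered.append(data)     # new packet: deliver upward
                    expected ^= 1
                # The receiver ACKs the sequence number it just received,
                # whether the packet was new or a duplicate.
                if attempt not in lost_acks:
                    ack = seq
            attempt += 1
            if ack == seq:
                seq ^= 1                       # ACK for current pkt: stop timer
                break
            timeouts += 1                      # no ACK in time: retransmit
    return delivered, attempt, timeouts


print(rdt30_transfer(["x", "y"], lost_data={0}, lost_acks={2}))
# (['x', 'y'], 4, 2): one lost packet and one lost ACK, each costing a timeout
```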

rdt3.0 sender

• state "Wait for call 0 from above"
  • rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer -> "Wait for ACK0"
  • rdt_rcv(rcvpkt) / Λ
• state "Wait for ACK0"
  • rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)) / Λ
  • timeout / udt_send(sndpkt); start_timer
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0) / stop_timer -> "Wait for call 1 from above"
• state "Wait for call 1 from above"
  • rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer -> "Wait for ACK1"
  • rdt_rcv(rcvpkt) / Λ
• state "Wait for ACK1"
  • rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)) / Λ
  • timeout / udt_send(sndpkt); start_timer
  • rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1) / stop_timer -> "Wait for call 0 from above"

rdt3.0 in action

[Figures (two slides) not reproduced in this transcript.]

Performance of rdt3.0

• rdt3.0 works, but performance stinks
• example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

\[
T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}}
             = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}
\]

\[
U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \quad \text{(times in msec)}
\]

• U_sender: utilization, the fraction of time sender busy sending
• 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
• network protocol limits use of physical resources!
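The arithmetic on this slide can be checked directly. The snippet below is a small sketch; `stop_and_wait_utilization` is a name chosen for this example, and a 1KB packet is taken as 8000 bits to match the 8 kb/pkt figure.

```python
def stop_and_wait_utilization(L_bits, R_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = L_bits / R_bps
    return t_transmit / (rtt_s + t_transmit)


L = 8000        # packet length in bits (1KB packet)
R = 1e9         # link rate: 1 Gbps
RTT = 0.030     # 2 x 15 ms propagation delay

utilization = stop_and_wait_utilization(L, R, RTT)
throughput_bytes_per_s = (L / 8) / (RTT + L / R)   # one packet per cycle

print(round(utilization, 5))          # 0.00027
print(round(throughput_bytes_per_s))  # about 33 thousand bytes per second
```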

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram]
• first packet bit transmitted, t = 0
• last packet bit transmitted, t = L/R
• first packet bit arrives
• last packet bit arrives, send ACK
• ACK arrives, send next packet, t = RTT + L/R

\[
U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027
\]

Pipelined protocols
Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
• go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight]
• first packet bit transmitted, t = 0
• last bit transmitted, t = L/R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L/R

\[
U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008
\]

Increase utilization by a factor of 3!
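The same formula generalises to a window of n packets in flight. In the sketch below, `pipelined_utilization` is a hypothetical helper, and capping the result at 1.0 (the link cannot be more than fully busy) is an assumption added here rather than something stated on the slide.

```python
def pipelined_utilization(n, L_bits, R_bps, rtt_s):
    """Sender utilization with n packets sent per RTT + L/R cycle."""
    t_transmit = L_bits / R_bps
    return min(1.0, n * t_transmit / (rtt_s + t_transmit))


L, R, RTT = 8000, 1e9, 0.030    # same example: 1KB packet, 1 Gbps, 30 ms RTT

u1 = pipelined_utilization(1, L, R, RTT)   # stop-and-wait
u3 = pipelined_utilization(3, L, R, RTT)   # three packets in flight

print(round(u3, 4))        # 0.0008
print(round(u3 / u1, 6))   # 3.0, the factor-of-3 improvement
```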

Go-Back-N
Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
  • may receive duplicate ACKs (see receiver)
• Only one timer: for oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• Called a sliding-window protocol

GBN Sender

rdt_send() called: checks whether the window is full
  No: send out the packet
  Yes: return the data to the application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates the window accordingly and restarts the timer (or stops it when no packets remain outstanding).

Timeout: resends ALL packets that have been sent but not yet acknowledged.
  This is the only event that triggers a resend.

GBN: sender extended FSM

[Figure: FSM with a single state, "Wait". Its transitions are listed below.]

Initially:
  base = 1
  nextseqnum = 1

rdt_send(data):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  ...
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  Λ (no action)
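The sender FSM can be sketched in Python as follows. This is a minimal illustration, not a full implementation: the `send` callback stands in for udt_send, the timer is reduced to a boolean flag, checksums are omitted, and the `max()` guard on `base` is an added safety assumption that the FSM itself does not contain.

```python
class GBNSender:
    """Go-Back-N sender sketch; channel and timer are simplified stand-ins."""

    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send            # hypothetical stand-in for udt_send
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> packet, kept until ACKed
        self.timer_running = False  # stand-in for start_timer / stop_timer

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse_data
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is confirmed.
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)   # assumption: never move base backwards
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        # Go back N: resend every packet that was sent but not yet ACKed.
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(self.sndpkt[seq])
```

For example, with a window of 3, a fourth call to `rdt_send` is refused until an ACK slides the window forward.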

GBN: receiver extended FSM

[Figure: FSM with a single state, "Wait". Its transitions are listed below.]

Initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

If the expected packet is received:
  send ACK and deliver the packet to the upper layer

If an out-of-order packet is received:
  discard it (don't buffer), so no receiver buffering is needed
  re-ACK the pkt with the highest in-order seq #; this may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  Need only remember expectedseqnum
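The receiver side is small enough to sketch in a few lines. In this illustration the `deliver` callback stands in for deliver_data, corruption is passed in as a flag rather than computed from a checksum, and the method returns the ACK number instead of sending a packet; all of these are simplifying assumptions.

```python
class GBNReceiver:
    """Go-Back-N receiver sketch: no buffering, only expectedseqnum is remembered."""

    def __init__(self, deliver):
        self.expectedseqnum = 1
        self.deliver = deliver   # hypothetical stand-in for deliver_data
        self.last_ack = 0        # mirrors the initial make_pkt(0, ACK, chksum)

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number to send back."""
        if not corrupt and seq == self.expectedseqnum:
            self.deliver(data)
            self.last_ack = seq
            self.expectedseqnum += 1
        # Otherwise the packet is discarded and the previous ACK is repeated,
        # which is what produces duplicate ACKs at the sender.
        return self.last_ack
```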

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission only of the required packets.

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which an ACK was not received
  sender keeps a timer for each unACKed pkt
  compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #'s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: sender and receiver views of the sequence-number space.]

Selective repeat

sender

data from above:
  if the next available seq # is in the window, send pkt

timeout(n):
  resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n is the smallest unACKed pkt, advance window base to the next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)

otherwise:
  ignore
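The receiver rules above can be sketched as follows. This is an illustration under simplifying assumptions: sequence numbers are unbounded integers (no wraparound), `deliver` is a hypothetical callback for passing data to the upper layer, and the method returns the seq # to ACK rather than sending a packet.

```python
class SRReceiver:
    """Selective Repeat receiver sketch with an out-of-order buffer."""

    def __init__(self, window_size, deliver):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}         # seq # -> data received out of order
        self.deliver = deliver   # hypothetical upper-layer callback

    def on_packet(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver every in-order packet now available and advance the window.
            while self.rcvbase in self.buffer:
                self.deliver(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n             # already delivered: re-ACK so the sender can move on
        return None              # outside both ranges: ignore
```

The re-ACK branch matters because the sender's window only advances once it has seen an ACK for its oldest packet, even if the receiver delivered that packet long ago.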


                                        Selective repeat in action

Selective repeat: dilemma

Example: seq #'s 0, 1, 2, 3; window size = 3

receiver sees no difference in the two scenarios
  incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
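One way to explore the question is to compare the sender's oldest possible window with the receiver's window after it has advanced past that whole window. The helper below is a hypothetical sketch: if the two sets of sequence numbers overlap, a retransmitted old packet is indistinguishable from a new one. It suggests that the window may be at most half the sequence-number space.

```python
def windows_overlap(seq_space, window):
    """True if an old retransmission can land inside the receiver's new window."""
    # Sender still holds packets 0..window-1 unACKed (all ACKs were lost).
    old = {n % seq_space for n in range(0, window)}
    # Receiver got them all and now expects the next `window` sequence numbers.
    new = {n % seq_space for n in range(window, 2 * window)}
    return bool(old & new)


print(windows_overlap(4, 3))   # the example above: seq #'s 0..3, window size 3
print(windows_overlap(4, 2))   # window reduced to half the seq # space
```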


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

[Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads data through its socket door.]

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) initializes sender and receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

More TCP Details

Maximum Segment Size (MSS)
  Depends upon the implementation (can often be set)
  The maximum amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way handshake
  Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
  Server responds with a second special TCP segment (again, no payload).
  Client responds with a third special segment. This one can contain payload.

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables and
  (iii) a socket connection to the process

TCP only exists in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Rows from top to bottom:]

  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Annotations:
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  Receive window: # bytes rcvr is willing to accept
  checksum: Internet checksum (as in UDP)
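To make the layout concrete, the fixed 20-byte part of the header can be decoded with Python's standard `struct` module. This is an illustrative sketch: the function name, the returned dictionary keys and the hand-built sample segment are assumptions, options are skipped rather than parsed, and the checksum is not verified.

```python
import struct

# Flag bits from least significant upward, matching the six flags on the slide.
FLAG_NAMES = ("FIN", "SYN", "RST", "PSH", "ACK", "URG")


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header; options (if any) are skipped."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_and_flags >> 12) * 4   # head len is counted in 32-bit words
    flags = {name for bit, name in enumerate(FLAG_NAMES)
             if offset_and_flags & (1 << bit)}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack, "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
        "data": segment[header_len:],           # application data follows the header
    }


# Hand-built sample: head len 5 words, SYN and ACK set, one byte of data.
sample = struct.pack("!HHIIHHHH", 1234, 80, 42, 79, (5 << 12) | 0x12, 65535, 0, 0) + b"C"
print(sorted(parse_tcp_header(sample)["flags"]))
```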

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; it is up to the implementer

[Figure: simple telnet scenario, time running downward.]
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')

TCP Round Trip Time and Timeout

Q: how to set the TCP timeout value?
  longer than RTT, but RTT varies
  too short: premature timeout, unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; want the estimated RTT "smoother"
    average several recent measurements, not just the current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
  influence of a past sample decreases exponentially fast
  typical value: α = 0.125

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; two series: SampleRTT and EstimatedRTT.]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus a "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set the timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
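The three formulas combine into a small estimator. The sketch below is illustrative: the class name is hypothetical, the initial DevRTT of half the first sample follows the convention of RFC 6298 rather than anything on these slides, and updating DevRTT before EstimatedRTT is likewise an assumption about ordering.

```python
class RttEstimator:
    """EWMA-based RTT and timeout estimator, following the slide formulas."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2   # assumed starting value (RFC 6298 style)

    def update(self, sample_rtt):
        """Fold in one SampleRTT (never taken from a retransmitted segment)."""
        # Assumption: the deviation is measured against the previous estimate.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
print(est.update(200.0))                 # one unusually slow sample
```

A single outlier moves EstimatedRTT only slightly but widens the safety margin noticeably, which is the intended effect of the DevRTT term.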


TCP reliable data transfer

TCP creates an rdt service on top of IP's unreliable service
  Pipelined segments
  Cumulative ACKs
  TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate ACKs

Initially consider a simplified TCP sender:
  ignore duplicate ACKs
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is the byte-stream number of the first data byte in the segment
  start timer if not already running (think of the timer as for the oldest unACKed segment)
  expiration interval: TimeoutInterval

timeout:
  retransmit the segment that caused the timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unACKed segments
    update what is known to be ACKed
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
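The simplified sender translates almost line for line into Python. In this sketch the `send` callback is a hypothetical stand-in for "pass segment to IP", the timer is reduced to a boolean flag, and the `unacked` dictionary is an assumed bookkeeping structure; duplicate ACKs, flow control and congestion control are ignored, as in the pseudocode.

```python
class SimplifiedTcpSender:
    """Simplified TCP sender: cumulative ACKs and a single retransmission timer."""

    def __init__(self, initial_seq, send):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.send = send            # hypothetical stand-in for "pass segment to IP"
        self.unacked = {}           # seq # of first byte -> segment data
        self.timer_running = False

    def on_app_data(self, data):
        self.unacked[self.next_seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.send(self.next_seq, data)
        self.next_seq += len(data)  # seq #'s count bytes, not segments

    def on_timeout(self):
        if not self.unacked:
            return
        oldest = min(self.unacked)  # not-yet-ACKed segment with smallest seq #
        self.send(oldest, self.unacked[oldest])
        self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte lies below y.
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in done:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
```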

TCP: retransmission scenarios

[Figure: two Host A / Host B timelines, time running downward.]

Lost ACK scenario:
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100 (lost, X)
  A: timeout for Seq=92
  A -> B: Seq=92, 8 bytes data (retransmission)
  B -> A: ACK=100
  A: SendBase = 100

Premature timeout:
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100
  B -> A: ACK=120
  A: timeout for Seq=92 expires before the ACKs arrive
  A -> B: Seq=92, 8 bytes data (retransmission)
  A: SendBase = 100, then SendBase = 120 as the two ACKs arrive
  B -> A: ACK=120
  A: SendBase = 120

TCP retransmission scenarios (more)

Cumulative ACK scenario:
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100 (lost, X)
  B -> A: ACK=120 (arrives before the timeout)
  A: SendBase = 120; no retransmission is needed

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   -> Delayed ACK. Wait up to 500 ms for the next segment. If no next segment, send ACK.

2. Arrival of in-order segment with expected seq #. One other segment has ACK pending.
   -> Immediately send a single cumulative ACK, ACKing both in-order segments.

3. Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   -> Immediately send a duplicate ACK, indicating the seq # of the next expected byte.

4. Arrival of segment that partially or completely fills the gap.
   -> Immediately send ACK, provided that the segment starts at the lower end of the gap.

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
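The doubling rule is easy to tabulate. The helper below is a hypothetical sketch that lists the interval used for the original transmission and for each consecutive timeout, starting from the value given by EstimatedRTT + 4 * DevRTT; real implementations also cap the interval, which is not modelled here.

```python
def timeout_sequence(estimated_rtt, dev_rtt, consecutive_timeouts):
    """Timeout intervals for the first send and each consecutive retransmission."""
    base = estimated_rtt + 4 * dev_rtt
    # Each consecutive timeout doubles the previous interval.
    return [base * 2 ** k for k in range(consecutive_timeouts + 1)]


# Assumed example values in milliseconds.
print(timeout_sequence(estimated_rtt=100, dev_rtt=25, consecutive_timeouts=3))
```

Once new data arrives from above or an ACK is received, the sender goes back to the first entry, recomputed from the current estimates.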

Fast Retransmit

Time-out period is often relatively long:
  long delay before resending a lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs

If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend the segment before the timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {    /* a duplicate ACK for an already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      resend segment with sequence number y    /* fast retransmit */
    }
  }
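The ACK handler with fast retransmit can be sketched as below. The `resend` callback and class name are hypothetical, timers are left out for brevity, and only duplicates of the current SendBase are counted; ignoring older ACK values is a simplifying assumption.

```python
class FastRetransmitSender:
    """ACK handling with fast retransmit; timers are omitted in this sketch."""

    def __init__(self, send_base, resend):
        self.send_base = send_base
        self.resend = resend   # hypothetical callback: resend(seq)
        self.dup_acks = 0      # duplicate ACKs seen for the current send_base

    def on_ack(self, y):
        if y > self.send_base:
            # New data is ACKed: slide the base and start counting afresh.
            self.send_base = y
            self.dup_acks = 0
        elif y == self.send_base:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                self.resend(y)   # fast retransmit, before the timer expires
```

Triggering on exactly the third duplicate means further duplicates for the same y do not cause repeated retransmissions.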

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN.

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range.
  This looks a lot like Selective Repeat.

TCP is a hybrid.


TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data
If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow the receiver's buffer by transmitting too much, too fast

receive side of a TCP connection has a receive buffer
  app process may be slow at reading from the buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose the TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including the value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees the receive buffer doesn't overflow
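Both sides of the rule fit in a few lines. In this sketch `rcv_window` mirrors the formula above, while `sender_may_send` and its parameters `last_byte_sent` and `last_byte_acked` are assumed names for the sender's bookkeeping that do not appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, n_bytes, window):
    """True if sending n_bytes more keeps unACKed data within the advertised window."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + n_bytes <= window


# Assumed example: 4096-byte buffer, 3000 bytes received, 1000 read by the app.
window = rcv_window(4096, 3000, 1000)
print(window, sender_may_send(5000, 4000, 1000, window))
```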

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
  initialize TCP variables:
    seq #'s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and server_isn+1 in the ACK field
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals that the client is synchronized

Message sequence (from the figure):
• client to server: Connection request (SYN=1, seq=client_isn)
• server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
• client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
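The three segments of the handshake can be traced with a small sketch. This is only an illustration of the sequence and acknowledgement fields described above: the `Segment` class and `three_way_handshake` function are hypothetical names, and wraparound of 32-bit sequence numbers is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    syn: int             # SYN flag (1 during setup, 0 afterwards)
    seq: int             # sequence number field
    ack: Optional[int]   # acknowledgement field, None when unused


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged to open a connection."""
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = Segment(syn=1, seq=client_isn, ack=None)
    # Step 2: server answers with SYNACK, acknowledging client_isn + 1.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client acknowledges server_isn + 1 with SYN cleared.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

The initial sequence numbers 100 and 300 are arbitrary example values.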

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait before reaching closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
• enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
• closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline. Client (closing) sends FIN; server replies with ACK, then (closing) sends FIN; client replies with ACK and enters timed wait; server reaches closed on receiving the ACK; client reaches closed after the timed wait.]
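The client side of the four closing steps can be written as a small transition table. This is a sketch only: the state names follow the usual TCP state names (ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED), while the event strings and the `run_client_close` helper are made up for illustration.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),      # Step 1
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),     # Step 2 (server's ACK)
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "ACK"),     # Step 3
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "ACK"),      # FIN resent because ACK was lost
    ("TIME_WAIT", "timer expires"): ("CLOSED", None),     # end of timed wait
}


def run_client_close(events, state="ESTABLISHED"):
    """Apply events in order; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, segment = CLIENT_CLOSE[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent


print(run_client_close(["close", "recv ACK", "recv FIN", "timer expires"]))
```

Feeding in a second "recv FIN" while in TIME_WAIT shows why the timed wait exists: the client can still answer a retransmitted FIN with another ACK.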

TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

A few special cases

• We have not discussed what happens if both client and server decide to close down the connection at the same time.
• It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• manifestations:
  • lost packets (buffer overflow at routers)
  • long delays (queuing in router buffers)
• a top-10 problem!

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2

• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

Causes/costs of congestion: scenario 2 (cont.)

• (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
• (a) "magic" transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
• (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3 (cont.)

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next page

Case study: ATM ABR congestion control (cont.)

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion
• w segments, each with MSS bytes, sent in one RTT:

  throughput = (w × MSS) / RTT bytes/sec
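As a quick numeric check of the relation throughput = w × MSS / RTT, here is a minimal sketch. The function name and the example values (1 segment of 500 bytes, RTT of 200 msec) are illustrative choices.

```python
def tcp_throughput_bytes_per_sec(w: float, mss_bytes: float, rtt_sec: float) -> float:
    """Throughput when w segments of mss_bytes each are sent once per RTT."""
    return w * mss_bytes / rtt_sec


# One 500-byte segment per 200 msec round trip.
rate = tcp_throughput_bytes_per_sec(w=1, mss_bytes=500, rtt_sec=0.2)
print(f"{rate:.0f} bytes/sec = {rate * 8 / 1000:.0f} kbps")
```

Doubling w doubles the throughput as long as RTT stays the same, which is why the window size is the knob TCP adjusts.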

TCP Congestion Control (cont.)

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events
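The sending limit LastByteSent - LastByteAcked ≤ CongWin translates directly into a check of how many more bytes may be put on the wire. A minimal sketch, assuming (as the slide does) that the receive buffer never limits the sender; the function name is hypothetical.

```python
def bytes_allowed(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """Bytes the sender may still transmit without exceeding CongWin in flight."""
    in_flight = last_byte_sent - last_byte_acked   # sent but not yet acknowledged
    return max(0, cong_win - in_flight)


print(bytes_allowed(last_byte_sent=5000, last_byte_acked=3000, cong_win=4000))
print(bytes_allowed(last_byte_sent=5000, last_byte_acked=1000, cong_win=4000))
```

In the second call the window is already full, so the sender has to wait for an ACK before sending anything more.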

TCP AIMD

Multiplicative decrease: cut CongWin in half after loss event.

Additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing). Also known as congestion avoidance.

[Figure: congestion window (y-axis, ticks at 8, 16 and 24 Kbytes) vs. time for a long-lived TCP connection]
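The AIMD rule can be simulated in a few lines. This sketch measures the window in units of MSS and takes the rounds in which a loss is detected as an input; the function name and the example loss pattern are assumptions for illustration.

```python
def aimd(cong_win: float, rounds: int, loss_rounds: set) -> list:
    """Return the window (in MSS) at the start of each RTT round."""
    trace = []
    for r in range(rounds):
        trace.append(cong_win)
        if r in loss_rounds:
            cong_win = max(1, cong_win / 2)   # multiplicative decrease
        else:
            cong_win = cong_win + 1           # additive increase: 1 MSS per RTT
    return trace


print(aimd(cong_win=8, rounds=6, loss_rounds={3}))
```

Plotting the returned trace over many rounds with periodic losses gives the familiar sawtooth of a long-lived connection.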

TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes & RTT = 200 msec
• initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event.

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast.

[Figure: Host A / Host B timeline. Host A sends one segment, then two segments one RTT later, then four segments.]
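The link between "increment CongWin for every ACK" and "double CongWin every RTT" can be seen in a short sketch. It assumes one ACK per segment, no losses and no threshold; the function name is illustrative.

```python
def slow_start(rtts: int, mss: int = 1) -> list:
    """Return CongWin (in units of mss) at the start of each RTT round."""
    cong_win = mss
    history = [cong_win]
    for _ in range(rtts):
        acks = cong_win // mss        # one ACK for every segment sent this round
        for _ in range(acks):
            cong_win += mss           # +1 MSS per ACK received
        history.append(cong_win)
    return history


print(slow_start(3))
```

Each round sends CongWin/MSS segments and gets that many ACKs back, so the window grows by its own size, which is a doubling.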

So far:
• Slow-Start ramps up exponentially
• followed by AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce new variable: threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                        The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
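The event/state table can be turned into a small sender model. This is a simplified sketch of the table's rules only, not a real TCP implementation: the class and method names are hypothetical, the window is measured in MSS, and details such as window inflation during fast recovery are left out.

```python
class RenoSender:
    """Toy model of the congestion-control rules in the table above."""

    def __init__(self, mss: float = 1.0, threshold: float = float("inf")):
        self.mss = mss
        self.cong_win = mss            # connection starts with 1 MSS
        self.threshold = threshold     # initially very large
        self.state = "SS"              # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender()
for _ in range(7):
    sender.on_new_ack()
for _ in range(3):
    sender.on_dup_ack()
print(sender.state, sender.cong_win, sender.threshold)
```

A Tahoe-style variant would call the timeout handling from the triple-duplicate-ACK branch as well.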

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
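The 0.75 factor follows from the window climbing linearly from W/2 back to W between losses, so its average is the midpoint, (W/2 + W)/2 = 0.75 W. A minimal numeric sketch of that averaging, with a hypothetical function name and an arbitrary sample count:

```python
def average_window(w_max: float, steps: int = 1000) -> float:
    """Average of a window that grows linearly from w_max/2 to w_max."""
    samples = [w_max / 2 + (w_max / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


w = 100
print(average_window(w) / w)   # fraction of W, close to 0.75
```

Dividing the average window by RTT then gives the average throughput of 0.75 W/RTT.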

TCP Futures

• Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

  $$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• ➜ $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed!
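Both numbers in this example can be reproduced by solving throughput = W × MSS / RTT for W and throughput = 1.22 × MSS / (RTT × √L) for L. A small sketch; the function names are illustrative.

```python
def required_window(target_bps: float, mss_bytes: float, rtt_sec: float) -> float:
    """In-flight segments needed to sustain target_bps."""
    return target_bps * rtt_sec / (mss_bytes * 8)


def tolerable_loss_rate(target_bps: float, mss_bytes: float, rtt_sec: float) -> float:
    """Loss rate L that still allows target_bps, from 1.22*MSS/(RTT*sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_sec * target_bps)) ** 2


print(round(required_window(10e9, 1500, 0.1)))
print(f"{tolerable_loss_rate(10e9, 1500, 0.1):.1e}")
```

The loss rate comes out at roughly 2 × 10^-10, i.e. about one lost segment in five billion.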

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 1 throughput (x-axis, up to R) vs. Connection 2 throughput (y-axis, up to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
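The convergence argument can be checked with a toy simulation of two flows. The sketch assumes both flows have the same RTT, add one unit per round, and both halve in the same round whenever their combined rate exceeds the capacity; these synchronisation assumptions and the function name are simplifications.

```python
def share_bottleneck(x1: float, x2: float, capacity: float, rounds: int):
    """Evolve two AIMD flows sharing one link; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2      # loss: both windows cut by factor of 2
        else:
            x1, x2 = x1 + 1, x2 + 1      # congestion avoidance: additive increase
    return x1, x2


a, b = share_bottleneck(x1=10, x2=80, capacity=100, rounds=200)
print(a, b, abs(a - b))
```

Additive increase leaves the gap between the flows unchanged while every halving cuts it in half, so the rates drift toward the equal-share line.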

Fairness (more)

Fairness and UDP:
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections:
• nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
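The parallel-connection example is simple arithmetic once each TCP connection is assumed to get an equal share of the link. A sketch under that assumption, with a hypothetical function name:

```python
from fractions import Fraction


def app_share(existing_connections: int, new_connections: int) -> Fraction:
    """Fraction of the link rate R obtained by the app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # share of R with one new connection
print(app_share(9, 11))   # share of R with eleven new connections
```

With 11 connections out of 20, the new app ends up with slightly more than half of the link.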

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
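The two fixed-window cases can be combined into one function. This is a sketch under the slide's assumptions (no loss, single link of rate R); it additionally assumes K is the number of windows that cover the object, computed here as the ceiling of O/(W·S), and the function name and example numbers are illustrative.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for object of O bits, MSS of S bits, link rate R bps, window W segments."""
    K = math.ceil(O / (W * S))              # windows needed to cover the object (assumed)
    stall = S / R + RTT - W * S / R         # idle time after each window, if positive
    if stall <= 0:
        return 2 * RTT + O / R              # case 1: server never waits for ACKs
    return 2 * RTT + O / R + (K - 1) * stall   # case 2: server idles K-1 times


# 16 segments of 1000 bits over a 1 Mbps link with RTT = 100 msec.
print(fixed_window_latency(O=16000, S=1000, R=1e6, RTT=0.1, W=4))
print(fixed_window_latency(O=16000, S=1000, R=1e6, RTT=0.1, W=200))
```

With the small window the stalls dominate the latency; with the large window only the 2 RTT setup and the transmission time remain.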

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server. Client initiates TCP connection (one RTT), requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and the object is delivered.]

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: same client/server timeline as on the previous slide: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.
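Putting K, Q and P together gives the slow-start latency. This is a sketch under the model's assumptions (no threshold, no loss); Q is obtained here by directly counting the windows after which the idle time S/R + RTT - 2^(k-1) S/R is still positive, which is one way to read "calculation of Q is similar", and the function name and example numbers are illustrative.

```python
import math


def slow_start_latency(O: float, S: float, R: float, RTT: float):
    """Return (latency, K, Q, P) for object of O bits, MSS of S bits, rate R bps."""
    K = math.ceil(math.log2(O / S + 1))           # windows that cover the object
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:       # idle time after window Q+1 (assumed test)
        Q += 1
    P = min(Q, K - 1)                             # times the server actually idles
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# O/S = 15 segments, S/R = 1 sec, RTT = 2 sec.
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0))
```

These inputs give K = 4, Q = 2 and P = 2: the server idles for 2 seconds after the first window and 1 second after the second, and those 3 seconds are added to the 2 RTT + O/R baseline.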

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
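The three response-time formulas can be compared side by side. This sketch leaves out the "sum of idle times" term, which depends on slow start and differs per scheme, so the numbers are lower bounds; the function name and the chosen link rate are assumptions.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int) -> dict:
    """Response times in seconds, without the slow-start idle-time term."""
    transmission = (M + 1) * O / R          # base page plus M images
    return {
        "non_persistent": transmission + (M + 1) * 2 * RTT,
        "persistent": transmission + 3 * RTT,
        "parallel_non_persistent": transmission + (M / X + 1) * 2 * RTT,
    }


# O = 5 Kbytes (40000 bits), R = 1 Mbps, RTT = 100 msec, M = 10, X = 5.
for scheme, seconds in http_response_times(40000, 1e6, 0.1, 10, 5).items():
    print(f"{scheme}: {seconds:.2f} s")
```

All three schemes pay the same transmission time; they differ only in how many round trips are spent on connection setup and requests.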

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (y-axis, 0 to 20 seconds) for link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps; series: non-persistent, persistent, parallel non-persistent]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (y-axis, 0 to 70 seconds) for link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps; series: non-persistent, persistent, parallel non-persistent]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3: Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                        • Chapter 3 Transport Layer last revised 160305
                                        • Chapter 3 outline
                                        • Transport services and protocols
                                        • Transport vs network layer
                                        • Transport-layer protocols
                                        • Chapter 3 outline
                                        • Multiplexing/demultiplexing
                                        • Multiplexing/demultiplexing
                                        • How demultiplexing works
                                        • Connectionless demultiplexing
                                        • Connectionless demux (cont)
                                        • Connection-oriented demux
                                        • Connection-oriented demux (cont)
                                        • Connection-oriented demux Threaded Web Server
                                        • Chapter 3 outline
                                        • UDP: User Datagram Protocol [RFC 768]
                                        • UDP: more
                                        • UDP checksum
                                        • Chapter 3 outline
                                        • Principles of Reliable data transfer
                                        • Reliable data transfer: getting started
                                        • Reliable data transfer: getting started
                                        • Incremental Improvements
                                        • rdt1.0: reliable transfer over a reliable channel
                                        • rdt2.0: channel with bit errors
                                        • rdt2.0: FSM specification
                                        • rdt2.0: operation with no errors
                                        • rdt2.0: error scenario
                                        • rdt2.0 has a fatal flaw
                                        • rdt2.1: sender, handles garbled ACK/NAKs
                                        • rdt2.1: receiver, handles garbled ACK/NAKs
                                        • rdt2.1: discussion
                                        • rdt2.2: a NAK-free protocol
                                        • rdt2.2: sender, receiver fragments
                                        • rdt3.0: channels with errors and loss
                                        • rdt3.0 sender
                                        • rdt3.0 in action
                                        • rdt3.0 in action
                                        • Performance of rdt3.0
                                        • rdt3.0: stop-and-wait operation
                                        • Pipelined protocols
                                        • Pipelined protocols
                                        • Pipelining: increased utilization
                                        • Go-Back-N
                                        • GBN Sender
                                        • GBN: sender extended FSM
                                        • GBN: receiver extended FSM
                                        • More on receiver
                                        • GBN in action
                                        • Selective Repeat
                                        • Selective repeat: sender, receiver windows
                                        • Selective repeat
                                        • Selective repeat in action
                                        • Selective repeat: dilemma
                                        • Chapter 3 outline
                                        • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                        • More TCP Details
                                        • Even More TCP Details
                                        • TCP segment structure
                                        • TCP seq. #'s and ACKs
                                        • TCP Round Trip Time and Timeout
                                        • TCP Round Trip Time and Timeout
                                        • Example RTT estimation
                                        • TCP Round Trip Time and Timeout
                                        • Chapter 3 outline
                                        • TCP reliable data transfer
                                        • TCP sender events
                                        • TCP sender (simplified)
                                        • TCP: retransmission scenarios
                                        • TCP retransmission scenarios (more)
                                        • TCP ACK generation [RFC 1122, RFC 2581]
                                        • More on Sender Policies
                                        • Fast Retransmit
                                        • Fast retransmit algorithm
                                        • TCP: GBN or Selective Repeat?
                                        • Chapter 3 outline
                                        • TCP Flow Control
                                        • TCP Flow Control
                                        • TCP segment structure
                                        • TCP Flow control: how it works
                                        • Technical Issue
                                        • Chapter 3 outline
                                        • TCP Connection Management
                                        • TCP Connection Management (cont)
                                        • TCP Connection Management (cont)
                                        • TCP Connection Management (cont)
                                        • TCP Connection Management (cont)
                                        • A few special cases
                                        • Chapter 3 outline
                                        • Principles of Congestion Control
                                        • Causes/costs of congestion: scenario 1
                                        • Causes/costs of congestion: scenario 2
                                        • Causes/costs of congestion: scenario 3
                                        • Causes/costs of congestion: scenario 3
                                        • Approaches towards congestion control
                                        • Case study: ATM ABR congestion control
                                        • Case study: ATM ABR congestion control
                                        • Chapter 3 outline
                                        • TCP Congestion Control
                                        • TCP AIMD
                                        • TCP Slow Start
                                        • TCP Slow Start (more)
                                        • Summary: TCP Congestion Control
                                        • The Big Picture
                                        • TCP sender congestion control
                                        • TCP throughput
                                        • TCP Futures
                                        • TCP Fairness
                                        • Why is TCP fair?
                                        • Fairness (more)
                                        • TCP Latency Modeling
                                        • Fixed Congestion Window (W)
                                        • Fixed congestion window (1)
                                        • Fixed congestion window (2)
                                        • TCP Latency Modeling: Slow Start (1)
                                        • TCP Latency Modeling: Slow Start (2)
                                        • TCP Latency Modeling (3)
                                        • TCP Latency Modeling (4)
                                        • HTTP Modeling
                                        • Chapter 3 Summary

                                          3 Transport Layer 21Comp 361 Spring 2005

                                          Reliable data transfer: getting started

                                          send side:
                                              rdt_send(): called from above (e.g., by app). Passed data to deliver to receiver upper layer
                                              udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

                                          receive side:
                                              rdt_rcv(): called when packet arrives on rcv-side of channel
                                              deliver_data(): called by rdt to deliver data to upper layer

                                          3 Transport Layer 22Comp 361 Spring 2005

                                          Reliable data transfer: getting started

                                          We'll:
                                              incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
                                              consider only unidirectional data transfer
                                                  but control info will flow in both directions!
                                              use finite state machines (FSM) to specify sender, receiver

                                          [Figure: FSM notation. An arrow from state 1 to state 2 is labelled "event / actions": the event causing the state transition, and the actions taken on the state transition]

                                          state: when in this "state", next state uniquely determined by next event

                                          3 Transport Layer 23Comp 361 Spring 2005

                                          Incremental Improvements

                                          rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission

                                          rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK

                                          rdt2.1: deals with corrupted ACKs/NAKs

                                          rdt2.2: like rdt2.1 but does not need NAKs

                                          rdt3.0: allows packets to be lost

                                          rdt1.0: reliable transfer over a reliable channel

                                          underlying channel perfectly reliable
                                              no bit errors
                                              no loss of packets

                                          separate FSMs for sender, receiver:
                                              sender sends data into underlying channel
                                              receiver reads data from underlying channel

                                          sender
                                              state: Wait for call from above
                                                  rdt_send(data)
                                                      packet = make_pkt(data)
                                                      udt_send(packet)

                                          receiver
                                              state: Wait for call from below
                                                  rdt_rcv(packet)
                                                      extract(packet,data)
                                                      deliver_data(data)

                                          3 Transport Layer 24Comp 361 Spring 2005

                                          3 Transport Layer 25Comp 361 Spring 2005

                                          rdt2.0: channel with bit errors

                                          underlying channel may flip bits in packet
                                              recall: UDP checksum to detect bit errors

                                          the question: how to recover from errors?
                                              acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
                                              negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
                                              sender retransmits pkt on receipt of NAK
                                              human scenarios using ACKs, NAKs?

                                          new mechanisms in rdt2.0 (beyond rdt1.0):
                                              error detection
                                              receiver feedback: control msgs (ACK, NAK) rcvr->sender
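The error-detection mechanism can be illustrated with the 16-bit one's-complement checksum that UDP uses. This is only a minimal sketch: the function names `internet_checksum` and `is_corrupt` are chosen for illustration and do not appear in the slides.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, then complemented."""
    if len(data) % 2:                 # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # wrap the carry around
    return ~total & 0xFFFF


def is_corrupt(data: bytes, checksum: int) -> bool:
    """Receiver-side check (hypothetical helper): recompute and compare."""
    return internet_checksum(data) != checksum


payload = b"hello rdt"
chk = internet_checksum(payload)
flipped = bytes([payload[0] ^ 0x01]) + payload[1:]   # flip a single bit

print(is_corrupt(payload, chk))   # False
print(is_corrupt(flipped, chk))   # True
```

A receiver that finds `is_corrupt(...)` true would answer with a NAK; otherwise it would deliver the data and answer with an ACK.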

                                          3 Transport Layer 26Comp 361 Spring 2005

                                          rdt2.0: FSM specification

                                          sender
                                              state: Wait for call from above
                                                  rdt_send(data)
                                                      sndpkt = make_pkt(data, checksum)
                                                      udt_send(sndpkt)
                                                      -> Wait for ACK or NAK
                                              state: Wait for ACK or NAK
                                                  rdt_rcv(rcvpkt) && isNAK(rcvpkt)
                                                      udt_send(sndpkt)
                                                      -> Wait for ACK or NAK
                                                  rdt_rcv(rcvpkt) && isACK(rcvpkt)
                                                      Λ
                                                      -> Wait for call from above

                                          receiver
                                              state: Wait for call from below
                                                  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                                                      udt_send(NAK)
                                                  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
                                                      extract(rcvpkt,data)
                                                      deliver_data(data)
                                                      udt_send(ACK)

                                          3 Transport Layer 27Comp 361 Spring 2005

                                          rdt2.0: operation with no errors

                                          [Figure: the rdt2.0 FSM from the specification slide, tracing the error-free path]
                                              sender: rdt_send(data); sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
                                              receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt); extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
                                              sender: rdt_rcv(rcvpkt) && isACK(rcvpkt); Λ; back to Wait for call from above

                                          3 Transport Layer 28Comp 361 Spring 2005

                                          rdt2.0: error scenario

                                          [Figure: the rdt2.0 FSM from the specification slide, tracing the path taken when the packet is corrupted]
                                              sender: rdt_send(data); sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
                                              receiver: rdt_rcv(rcvpkt) && corrupt(rcvpkt); udt_send(NAK)
                                              sender: rdt_rcv(rcvpkt) && isNAK(rcvpkt); udt_send(sndpkt); stays in Wait for ACK or NAK

                                          3 Transport Layer 29Comp 361 Spring 2005

                                          rdt2.0 has a fatal flaw!

                                          What happens if ACK/NAK corrupted?
                                              sender doesn't know what happened at receiver!
                                              can't just retransmit: possible duplicate. But receiver waiting!

                                          What to do?
                                              sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
                                              retransmit, but this might cause retransmission of correctly received pkt! Receiver won't know about duplication!

                                          Handling duplicates:
                                              sender adds sequence number (0,1) to each pkt
                                              sender retransmits current pkt if ACK/NAK garbled
                                              receiver discards (doesn't deliver up) duplicate pkt
                                              Duplicate packet is one with same sequence # as previous packet

                                          stop and wait: Sender sends one packet, then waits for receiver response

                                          3 Transport Layer 30Comp 361 Spring 2005

                                          Sender: whenever sender receives a control message, it sends a packet to receiver
                                              A valid ACK: sends next packet (if one exists) with new sequence #
                                              A NAK or corrupt response: resends old packet

                                          Receiver: sends ACK/NAK to sender
                                              If received packet is corrupt: send NAK
                                              If received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
                                              If received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmission (duplicate): send ACK

                                          Note: ACK/NAK do not contain sequence #
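The receiver rules above can be sketched as a small alternating-bit receiver. This is an illustrative sketch, not the slides' implementation: the class name `Rdt21Receiver`, the `corrupt` flag, and the `delivered` list (standing in for deliver_data()) are assumptions made for the example.

```python
class Rdt21Receiver:
    """Alternating-bit receiver: delivers new data, re-ACKs duplicates."""

    def __init__(self):
        self.expected_seq = 0
        self.delivered = []            # stands in for deliver_data()

    def rdt_rcv(self, seq, data, corrupt=False):
        """Return the control message sent back to the sender."""
        if corrupt:
            return "NAK"               # corrupt packet: ask for a resend
        if seq == self.expected_seq:
            self.delivered.append(data)
            self.expected_seq ^= 1     # flip 0 <-> 1
        # valid packet, new or duplicate: always ACK (no sequence # in it)
        return "ACK"


rcv = Rdt21Receiver()
replies = [
    rcv.rdt_rcv(0, "a"),
    rcv.rdt_rcv(0, "a"),                 # duplicate: sender saw a garbled ACK
    rcv.rdt_rcv(1, "b", corrupt=True),   # corrupted in transit
    rcv.rdt_rcv(1, "b"),
]
print(replies)          # ['ACK', 'ACK', 'NAK', 'ACK']
print(rcv.delivered)    # ['a', 'b']
```

The duplicate of packet 0 is acknowledged but not delivered a second time, which is exactly what the one-bit sequence number buys.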

                                          3 Transport Layer 31Comp 361 Spring 2005

                                          rdt2.1: sender, handles garbled ACK/NAKs

                                          state: Wait for call 0 from above
                                              rdt_send(data)
                                                  sndpkt = make_pkt(0, data, checksum)
                                                  udt_send(sndpkt)
                                                  -> Wait for ACK or NAK 0
                                          state: Wait for ACK or NAK 0
                                              rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
                                                  udt_send(sndpkt)
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
                                                  Λ
                                                  -> Wait for call 1 from above
                                          state: Wait for call 1 from above
                                              rdt_send(data)
                                                  sndpkt = make_pkt(1, data, checksum)
                                                  udt_send(sndpkt)
                                                  -> Wait for ACK or NAK 1
                                          state: Wait for ACK or NAK 1
                                              rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
                                                  udt_send(sndpkt)
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
                                                  Λ
                                                  -> Wait for call 0 from above

                                          3 Transport Layer 32Comp 361 Spring 2005

                                          rdt2.1: receiver, handles garbled ACK/NAKs

                                          state: Wait for 0 from below
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
                                                  extract(rcvpkt,data)
                                                  deliver_data(data)
                                                  sndpkt = make_pkt(ACK, chksum)
                                                  udt_send(sndpkt)
                                                  -> Wait for 1 from below
                                              rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                                                  sndpkt = make_pkt(NAK, chksum)
                                                  udt_send(sndpkt)
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                                                  sndpkt = make_pkt(ACK, chksum)
                                                  udt_send(sndpkt)
                                          state: Wait for 1 from below
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                                                  extract(rcvpkt,data)
                                                  deliver_data(data)
                                                  sndpkt = make_pkt(ACK, chksum)
                                                  udt_send(sndpkt)
                                                  -> Wait for 0 from below
                                              rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                                                  sndpkt = make_pkt(NAK, chksum)
                                                  udt_send(sndpkt)
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
                                                  sndpkt = make_pkt(ACK, chksum)
                                                  udt_send(sndpkt)

                                          3 Transport Layer 33Comp 361 Spring 2005

                                          rdt2.1: discussion

                                          Sender:
                                              seq # added to pkt
                                              two seq. #'s (0,1) will suffice. Why?
                                              must check if received ACK/NAK corrupted
                                              twice as many states
                                                  state must "remember" whether "current" pkt has 0 or 1 seq. #

                                          Receiver:
                                              must check if received packet is duplicate
                                                  state indicates whether 0 or 1 is expected pkt seq #
                                              note: receiver can not know if its last ACK/NAK received OK at sender

                                          3 Transport Layer 34Comp 361 Spring 2005

                                          rdt2.2: a NAK-free protocol

                                          same functionality as rdt2.1, using ACKs only
                                          instead of NAK, receiver sends ACK for last pkt received OK
                                              receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s included in data packets but not in ACKs/NAKs)
                                          duplicate ACK at sender results in same action as NAK: retransmit current pkt

                                          3 Transport Layer 35Comp 361 Spring 2005

                                          rdt2.2: sender, receiver fragments

                                          sender FSM fragment
                                              state: Wait for call 0 from above
                                                  rdt_send(data)
                                                      sndpkt = make_pkt(0, data, checksum)
                                                      udt_send(sndpkt)
                                                      -> Wait for ACK 0
                                              state: Wait for ACK 0
                                                  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
                                                      udt_send(sndpkt)
                                                  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
                                                      Λ

                                          receiver FSM fragment
                                              transition into state Wait for 0 from below:
                                                  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
                                                      extract(rcvpkt,data)
                                                      deliver_data(data)
                                                      sndpkt = make_pkt(ACK1, chksum)
                                                      udt_send(sndpkt)
                                              state: Wait for 0 from below
                                                  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
                                                      udt_send(sndpkt)

                                          3 Transport Layer 36Comp 361 Spring 2005

                                          rdt3.0: channels with errors and loss

                                          New assumption: underlying channel can also lose packets (data or ACKs)
                                              checksum, seq. #, ACKs, retransmissions will be of help, but not enough

                                          Q: how to deal with loss?
                                              sender waits until certain data or ACK lost, then retransmits
                                              yuck: drawbacks?

                                          Approach: sender waits "reasonable" amount of time for ACK
                                              retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
                                              if pkt (or ACK) just delayed (not lost):
                                                  retransmission will be duplicate, but use of seq. #'s already handles this
                                                  receiver must specify seq # of pkt being ACKed
                                              requires countdown timer
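The timeout-and-retransmit idea can be sketched with a tiny deterministic simulation. Everything here is an assumption made for illustration: the function name `rdt30_transfer`, and the `losses` table that says which transmission (counted from 1) loses its data packet or its returning ACK. A real sender would use an actual countdown timer rather than a loop.

```python
def rdt30_transfer(messages, losses):
    """Simulate stop-and-wait with timeouts over a lossy channel.

    losses maps a transmission number to "data" or "ack", naming what the
    (hypothetical) channel drops on that transmission.
    Returns (data delivered at the receiver, number of transmissions).
    """
    delivered = []
    expected_seq = 0        # receiver state
    seq = 0                 # sender state
    sent = 0
    for msg in messages:
        acked = False
        while not acked:
            sent += 1                       # udt_send(sndpkt); start_timer
            lost = losses.get(sent)
            if lost == "data":
                continue                    # nothing arrives: timeout, resend
            if seq == expected_seq:         # new packet at the receiver
                delivered.append(msg)
                expected_seq ^= 1
            # receiver ACKs this seq #, whether new or duplicate
            if lost == "ack":
                continue                    # ACK lost: timeout, resend
            acked = True                    # stop_timer
        seq ^= 1                            # alternate 0 <-> 1
    return delivered, sent


result = rdt30_transfer(["a", "b", "c"], {1: "data", 3: "ack"})
print(result)    # (['a', 'b', 'c'], 5)
```

The lost ACK on transmission 3 forces a duplicate of "b" to be sent, but the sequence number lets the receiver discard it, so each message is delivered exactly once.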

                                          3 Transport Layer 37Comp 361 Spring 2005

                                          rdt3.0 sender

                                          state: Wait for call 0 from above
                                              rdt_send(data)
                                                  sndpkt = make_pkt(0, data, checksum)
                                                  udt_send(sndpkt)
                                                  start_timer
                                                  -> Wait for ACK0
                                              rdt_rcv(rcvpkt)
                                                  Λ
                                          state: Wait for ACK0
                                              rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
                                                  Λ
                                              timeout
                                                  udt_send(sndpkt)
                                                  start_timer
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
                                                  stop_timer
                                                  -> Wait for call 1 from above
                                          state: Wait for call 1 from above
                                              rdt_send(data)
                                                  sndpkt = make_pkt(1, data, checksum)
                                                  udt_send(sndpkt)
                                                  start_timer
                                                  -> Wait for ACK1
                                              rdt_rcv(rcvpkt)
                                                  Λ
                                          state: Wait for ACK1
                                              rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
                                                  Λ
                                              timeout
                                                  udt_send(sndpkt)
                                                  start_timer
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
                                                  stop_timer
                                                  -> Wait for call 0 from above

                                          3 Transport Layer 38Comp 361 Spring 2005

                                          rdt3.0 in action

                                          3 Transport Layer 39Comp 361 Spring 2005

                                          rdt3.0 in action

                                          3 Transport Layer 40Comp 361 Spring 2005

                                          Performance of rdt3.0

                                          rdt3.0 works, but performance stinks
                                          example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

                                              $T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}$

                                              $U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$

                                          U_sender: utilization, the fraction of time sender busy sending
                                          1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
                                          network protocol limits use of physical resources!
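The arithmetic on this slide can be checked with a few lines of Python. This is a sketch under the slide's own assumptions: the 1KB packet is taken as 8000 bits and the RTT as twice the 15 ms propagation delay. The function name `stop_and_wait_utilization` is illustrative.

```python
def stop_and_wait_utilization(L, R, RTT):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = L / R
    return t_transmit / (RTT + t_transmit)


L = 8_000        # packet length in bits (1KB taken as 8000 bits)
R = 1e9          # link rate in bits/sec (1 Gbps)
RTT = 30e-3      # round-trip time in sec (2 x 15 ms)

t_transmit = L / R
u_sender = stop_and_wait_utilization(L, R, RTT)
throughput = (L / 8) / (RTT + t_transmit)      # bytes per second

print(f"{t_transmit * 1e6:.0f} microsec")      # 8 microsec
print(f"{u_sender:.5f}")                       # 0.00027
print(f"{throughput / 1e3:.1f} kB/sec")        # 33.3 kB/sec
```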

                                          rdt3.0: stop-and-wait operation

                                          [Timing diagram, sender and receiver:]
                                              first packet bit transmitted, t = 0
                                              last packet bit transmitted, t = L / R
                                              first packet bit arrives
                                              last packet bit arrives, send ACK
                                              ACK arrives, send next packet, t = RTT + L / R

                                          $U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$

                                          3 Transport Layer 41Comp 361 Spring 2005

                                          3 Transport Layer 42Comp 361 Spring 2005

                                          Pipelined protocols

                                          Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
                                              range of sequence numbers must be increased
                                              buffering at sender and/or receiver

                                          3 Transport Layer 43Comp 361 Spring 2005

                                          Pipelined protocols

                                          Advantage: much better bandwidth utilization than stop-and-wait

                                          Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

                                          Two generic approaches to solving this:
                                              • go-Back-N protocols
                                              • selective repeat protocols

                                          Note: TCP is not exactly either

                                          Pipelining: increased utilization

                                          [Timing diagram, sender and receiver, three packets in flight:]
                                              first packet bit transmitted, t = 0
                                              last bit transmitted, t = L / R
                                              first packet bit arrives
                                              last packet bit arrives, send ACK
                                              last bit of 2nd packet arrives, send ACK
                                              last bit of 3rd packet arrives, send ACK
                                              ACK arrives, send next packet, t = RTT + L / R

                                          $U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$

                                          Increase utilization by a factor of 3!
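The same calculation generalizes to a window of n packets in flight. The sketch below assumes the example link from the performance slide (8000-bit packets, 1 Gbps, 30 ms RTT); the function name `pipelined_utilization` and the cap at 1.0 (a sender cannot be busy more than all of the time) are illustrative additions.

```python
def pipelined_utilization(n, L, R, RTT):
    """Sender utilization with n packets in flight per round trip."""
    t_transmit = L / R
    return min(1.0, n * t_transmit / (RTT + t_transmit))


L, R, RTT = 8_000, 1e9, 30e-3      # bits, bits/sec, sec (assumed example link)

u1 = pipelined_utilization(1, L, R, RTT)    # stop-and-wait
u3 = pipelined_utilization(3, L, R, RTT)    # three packets pipelined

print(f"{u3:.4f}")          # 0.0008
print(f"{u3 / u1:.1f}")     # 3.0
```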

                                          3 Transport Layer 44Comp 361 Spring 2005

                                          3 Transport Layer 45Comp 361 Spring 2005

                                          Go-Back-N

                                          Sender:
                                              k-bit seq # in pkt header
                                              "window" of up to N consecutive unack'ed pkts allowed
                                              ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
                                                  may receive duplicate ACKs (see receiver)
                                              Only one timer: for oldest unacknowledged pkt
                                              timeout(n): retransmit pkt n and all higher seq # pkts in window
                                              Called a sliding-window protocol

                                          3 Transport Layer 46Comp 361 Spring 2005

                                          GBN Sender

                                          rdt_send() called: checks to see if window is full
                                              No: send out packet
                                              Yes: return data to application level

                                          Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

                                          Timeout: resends ALL packets that have been sent but not yet acknowledged
                                              This is the only event that triggers a resend
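The three sender events can be sketched as a small class. This is a hedged illustration rather than the slides' code: `GBNSender`, the `udt_send` callback, the `timer_running` flag (standing in for a real timer), and the `max(...)` guard against stale ACKs are all assumptions made for the example.

```python
class GBNSender:
    """Go-Back-N sender: window of N, cumulative ACKs, a single timer."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.base = 1                  # oldest unacknowledged seq #
        self.nextseqnum = 1            # next seq # to use
        self.sndpkt = {}               # seq # -> buffered packet
        self.udt_send = udt_send       # hypothetical channel callback
        self.timer_running = False     # stands in for start/stop_timer

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False               # window full: refuse data
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True  # start timer for oldest packet
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: all packets up to and including acknum arrived."""
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)   # assumption: ignore stale ACKs
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])


sent = []
sender = GBNSender(3, sent.append)
accepted = [sender.rdt_send(d) for d in "abcd"]
sender.on_ack(1)
sender.on_timeout()
print(accepted)   # [True, True, True, False]
print(sent)       # [(1, 'a'), (2, 'b'), (3, 'c'), (2, 'b'), (3, 'c')]
```

With a window of 3, the fourth message is refused; after ACK 1 arrives and the timer expires, packets 2 and 3 are both resent.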

                                          3 Transport Layer 47Comp 361 Spring 2005

                                          GBN: sender extended FSM

                                          initially:
                                              base=1
                                              nextseqnum=1

                                          state: Wait
                                              rdt_send(data)
                                                  if (nextseqnum < base+N) {
                                                      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
                                                      udt_send(sndpkt[nextseqnum])
                                                      if (base == nextseqnum)
                                                          start_timer
                                                      nextseqnum++
                                                  }
                                                  else
                                                      refuse_data(data)
                                              timeout
                                                  start_timer
                                                  udt_send(sndpkt[base])
                                                  udt_send(sndpkt[base+1])
                                                  …
                                                  udt_send(sndpkt[nextseqnum-1])
                                              rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
                                                  base = getacknum(rcvpkt)+1
                                                  if (base == nextseqnum)
                                                      stop_timer
                                                  else
                                                      start_timer
                                              rdt_rcv(rcvpkt) && corrupt(rcvpkt)
                                                  Λ


GBN receiver: extended FSM

Initial state: expectedseqnum=1, sndpkt = make_pkt(0,ACK,chksum) (the only state is Wait)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum):
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++

default:
    udt_send(sndpkt)

If the expected packet is received:
send ACK and deliver the packet to the upper layer

If an out-of-order packet is received:
discard it (don't buffer) -> no receiver buffering
re-ACK the pkt with the highest in-order seq #
may generate duplicate ACKs


                                          More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
Receiver only sends ACKs (no NAKs)
Can generate duplicate ACKs
Need only remember expectedseqnum
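The sender and receiver rules above can be tried out in a small simulation. This is only a sketch under simplifying assumptions that are not in the slides: ACKs come back instantly and are never lost, a timeout fires as soon as the sender has nothing new it is allowed to send, and the names `run_gbn` and `drop_once` are made up for illustration.

```python
def run_gbn(num_pkts, window, drop_once):
    """Simulate a Go-Back-N transfer of packets 1..num_pkts.

    drop_once: sequence numbers lost on their first transmission only.
    Returns (wire, delivered): every transmission in order, and the
    packets handed to the receiving application.
    """
    base = nextseq = expected = 1
    dropped, wire, delivered = set(), [], []

    def transmit(seq):
        nonlocal base, expected
        wire.append(seq)
        if seq in drop_once and seq not in dropped:
            dropped.add(seq)           # lost in the channel: no ACK comes back
            return
        if seq == expected:            # in-order packet: deliver it
            delivered.append(seq)
            expected += 1
        ack = expected - 1             # out-of-order pkts are discarded and re-ACKed
        base = max(base, ack + 1)      # cumulative ACK slides the window

    while base <= num_pkts:
        if nextseq < base + window and nextseq <= num_pkts:
            transmit(nextseq)          # window not full: send a new packet
            nextseq += 1
        else:
            for seq in range(base, nextseq):
                transmit(seq)          # timeout: resend ALL unACKed packets
    return wire, delivered


wire, delivered = run_gbn(num_pkts=6, window=4, drop_once={2})
print(wire)        # [1, 2, 3, 4, 5, 2, 3, 4, 5, 6]
print(delivered)   # [1, 2, 3, 4, 5, 6]
```

Losing packet 2 once forces packets 3, 4 and 5 to be sent a second time even though they arrived intact, because the receiver does not buffer them.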


GBN in action

                                          GBN is easy to code but might have performance problems

                                          In particular if many packets are in pipeline at one time (bandwidth-delay product large) then one error can force retransmission of huge amounts of data

                                          Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets


                                          Selective Repeat

                                          receiver individually acknowledges all correctly received pkts

                                          buffers pkts as needed for eventual in-order delivery to upper layer

                                          sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
Compare to GBN, which only had a timer for the base packet

sender window
N consecutive seq #'s
again limits seq #s of sent, unACKed pkts
Important: Window size < seq # range


Selective repeat: sender, receiver windows [figure]


Selective repeat

sender

data from above:
if next available seq # is in window, send pkt

timeout(n):
resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
mark pkt n as received
if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
send ACK(n)
out-of-order: buffer
in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
ACK(n) (note: this is a reACK)

otherwise:
ignore
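The three receiver cases on this slide can be sketched as one function. This is an illustrative sketch, not a full protocol: it assumes sequence numbers that never wrap around, and the name `sr_receive` and its return convention are invented here.

```python
def sr_receive(rcvbase, window, buffered, n):
    """Handle arrival of packet n at a Selective Repeat receiver.

    buffered: set of out-of-order sequence numbers currently held.
    Returns (new_rcvbase, ack_or_None, delivered_list).
    """
    if rcvbase <= n <= rcvbase + window - 1:
        buffered.add(n)                    # buffer (harmless if already held)
        delivered = []
        while rcvbase in buffered:         # deliver any in-order run
            buffered.remove(rcvbase)
            delivered.append(rcvbase)
            rcvbase += 1                   # advance window
        return rcvbase, n, delivered
    if rcvbase - window <= n <= rcvbase - 1:
        return rcvbase, n, []              # already delivered: re-ACK only
    return rcvbase, None, []               # otherwise: ignore


buf = set()
base = 0
for pkt in [0, 2, 3, 1, 1]:
    base, ack, delivered = sr_receive(base, 4, buf, pkt)
    print(pkt, ack, delivered, base)
```

Packets 2 and 3 are ACKed individually and held until 1 arrives, at which point 1, 2 and 3 are delivered together. The final duplicate of packet 1 only triggers a re-ACK.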


                                          Selective repeat in action


Selective repeat: dilemma

Example:
seq #'s: 0, 1, 2, 3
window size=3

receiver sees no difference in the two scenarios
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
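The usual answer is that the window may be at most half the sequence-number space. The sketch below checks the worst case directly: the receiver has accepted a full window and moved on, but every ACK was lost, so the sender may still retransmit the old window. The function name `sr_ambiguous` is made up for illustration.

```python
def sr_ambiguous(seq_space, window):
    """True if a retransmitted old packet can be mistaken for new data."""
    # Sender never saw an ACK: it may still resend seq #s 0..window-1.
    sender_old = {s % seq_space for s in range(0, window)}
    # Receiver got them all: it now accepts the next `window` seq #s.
    receiver_new = {s % seq_space for s in range(window, 2 * window)}
    return bool(sender_old & receiver_new)


print(sr_ambiguous(seq_space=4, window=3))   # True: the slide's example
print(sr_ambiguous(seq_space=4, window=2))   # False
```

With 4 sequence numbers and a window of 3, the receiver's new window is {3, 0, 1}, which overlaps the sender's old window {0, 1, 2}. A window of 2 leaves the two sets disjoint.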



TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
one sender, one receiver

reliable, in-order byte stream:
no "message boundaries"

pipelined:
TCP congestion and flow control set window size

send & receive buffers

full duplex data:
bi-directional data flow in same connection
MSS: maximum segment size

connection-oriented:
handshaking (exchange of control msgs) init's sender, receiver state before data exchange

flow controlled:
sender will not overwhelm receiver

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its socket door]


More TCP Details

Maximum Segment Size (MSS)
Depends upon implementation (can often be set)
The maximum amount of application-layer data in a segment
Application Data + TCP Header = TCP Segment

Three-way handshake
Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment
Server responds with a second special TCP segment (again no payload)
Client responds with a third special segment. This can contain payload


                                          Even More TCP Details

A TCP connection between client and server creates, in both client and server,
(i) buffers,
(ii) variables, and
(iii) a socket connection to the process

TCP only exists in the two end machines
No buffers or variables are allocated to the connection in any of the network elements between the client and server


TCP segment structure

Header layout (each row is 32 bits wide):
source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | Receive window
checksum | Urg data pnter
Options (variable length)
application data (variable length)

Field notes:
sequence number, acknowledgement number: counting by bytes of data (not segments)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
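To make the layout concrete, here is a minimal sketch that unpacks the 20-byte fixed header with Python's standard `struct` module. The function name `parse_tcp_segment` and the sample segment are hypothetical, and the checksum is only extracted, not verified.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / flags" word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_segment(segment: bytes) -> dict:
    """Split a raw TCP segment into its fixed header fields and payload."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4      # head len counts 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if off_flags & bit}
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "rcv_window": window,
        "checksum": checksum, "urg_ptr": urg_ptr,
        "options": segment[20:header_len], "data": segment[header_len:],
    }


# Hypothetical segment: 20-byte header (head len = 5 words), PSH+ACK, one data byte.
raw = struct.pack("!HHIIHHHH", 1234, 23, 42, 79, (5 << 12) | 0x18, 4096, 0, 0) + b"C"
fields = parse_tcp_segment(raw)
print(fields["seq"], fields["ack"], sorted(fields["flags"]), fields["data"])
```

The `!` in the format string selects network (big-endian) byte order, which is how the header fields travel on the wire.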


TCP seq. #'s and ACKs

Seq. #'s:
byte stream "number" of first byte in segment's data

ACKs:
seq # of next byte expected from other side
cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say - up to implementer

Simple telnet scenario (time runs downward):
Host A -> Host B: Seq=42, ACK=79, data = 'C' (user types 'C')
Host B -> Host A: Seq=79, ACK=43, data = 'C' (host ACKs receipt of 'C', echoes back 'C')
Host A -> Host B: Seq=43, ACK=80 (host ACKs receipt of echoed 'C')
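The numbers in the telnet scenario follow mechanically from the two rules above. The sketch below reproduces them. The `Endpoint` class is a made-up illustration that only tracks the two counters and ignores loss, reordering and everything else TCP does.

```python
class Endpoint:
    """Tracks the two counters that fill the seq # and ACK # fields."""

    def __init__(self, next_seq, expected):
        self.next_seq = next_seq    # seq # of the next byte we will send
        self.expected = expected    # seq # of the next byte we expect from the peer

    def send(self, data=b""):
        segment = {"seq": self.next_seq, "ack": self.expected, "data": data}
        self.next_seq += len(data)  # seq #s count bytes, not segments
        return segment

    def receive(self, segment):
        if segment["seq"] == self.expected:      # in-order data only
            self.expected += len(segment["data"])


host_a = Endpoint(next_seq=42, expected=79)
host_b = Endpoint(next_seq=79, expected=42)

s1 = host_a.send(b"C")      # user types 'C'
host_b.receive(s1)
s2 = host_b.send(b"C")      # B ACKs 'C' and echoes it back
host_a.receive(s2)
s3 = host_a.send()          # A ACKs the echoed 'C'
for s in (s1, s2, s3):
    print(s["seq"], s["ack"])
```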


TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
longer than RTT, but RTT varies
too short: premature timeout, unnecessary retransmissions
too long: slow reaction to segment loss

Q: how to estimate RTT?
SampleRTT: measured time from segment transmission until ACK receipt
ignore retransmissions
SampleRTT will vary, want estimated RTT "smoother"
average several recent measurements, not just current SampleRTT


                                          TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
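A short sketch of the moving average, assuming the estimate is seeded with the first sample (the slide does not say how it is initialised). The function name `estimated_rtt` is invented for illustration.

```python
def estimated_rtt(samples, alpha=0.125):
    """Fold SampleRTT values (oldest first) into a single EstimatedRTT."""
    estimate = samples[0]              # assumption: seed with the first sample
    for sample in samples[1:]:
        estimate = (1 - alpha) * estimate + alpha * sample
    return estimate


print(estimated_rtt([100, 180]))       # 110.0
```

A single 180 ms sample moves a 100 ms estimate only to 110 ms, which is the smoothing the slide asks for. Each older sample's weight shrinks by a factor of (1 - α) per update.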


Example RTT estimation

[Figure: SampleRTT and EstimatedRTT in milliseconds (axis from 100 to 350) versus time in seconds (axis from 1 to 106), for the path gaia.cs.umass.edu to fantasia.eurecom.fr]


                                          TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
large variation in EstimatedRTT -> larger safety margin
first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 * DevRTT
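The two updates and the timeout formula fit in one small function. One detail is an assumption here, since the slide does not fix the order: DevRTT is updated first, using the previous EstimatedRTT, which is the order RFC 6298 uses. The name `rtt_update` is made up.

```python
def rtt_update(estimated, dev, sample, alpha=0.125, beta=0.25):
    """Return (EstimatedRTT, DevRTT, TimeoutInterval) after one new SampleRTT."""
    # Assumption: deviation is measured against the previous estimate.
    dev = (1 - beta) * dev + beta * abs(sample - estimated)
    estimated = (1 - alpha) * estimated + alpha * sample
    return estimated, dev, estimated + 4 * dev


print(rtt_update(estimated=100, dev=10, sample=140))   # (105.0, 17.5, 175.0)
```

A sample 40 ms above the estimate nudges EstimatedRTT by 5 ms but widens the safety margin from 40 ms to 70 ms, so the timeout grows mostly because of the variation.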



                                          TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
timeout events
duplicate acks

Initially consider simplified TCP sender:
ignore duplicate acks
ignore flow control, congestion control


TCP sender events:

data rcvd from app:
Create segment with seq #
seq # is byte-stream number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval: TimeOutInterval

timeout:
retransmit segment that caused timeout
restart timer

Ack rcvd:
If it acknowledges previously unacked segments
update what is known to be acked
start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever)
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase)
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer

end of loop forever

Comment:
SendBase-1: last cumulatively ack'ed byte
Example:
SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
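The pseudocode above translates almost line for line into an event-driven class. This is a sketch, not a TCP implementation: the timer is modelled as a boolean, segments "passed to IP" are just recorded in a list, ACKs are assumed to fall on segment boundaries, and the class and attribute names are invented.

```python
class SimpleTcpSender:
    """Simplified TCP sender: one timer, cumulative ACKs, no dup-ACK handling."""

    def __init__(self, initial_seq_num):
        self.send_base = initial_seq_num     # oldest unACKed byte
        self.next_seq_num = initial_seq_num  # next byte to be sent
        self.unacked = {}                    # first-byte seq # -> segment data
        self.timer_running = False
        self.to_ip = []                      # (seq, data) passed to IP, in order

    def on_app_data(self, data):
        self.unacked[self.next_seq_num] = data
        if not self.timer_running:
            self.timer_running = True        # start timer
        self.to_ip.append((self.next_seq_num, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)          # smallest unACKed sequence number
            self.to_ip.append((seq, self.unacked[seq]))
            self.timer_running = True        # restart timer

    def on_ack(self, y):
        if y > self.send_base:               # ACK covers new data
            self.send_base = y
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


# Two segments sent, then a single cumulative ACK=120 covers both.
sender = SimpleTcpSender(92)
sender.on_app_data(b"A" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"B" * 20)    # Seq=100, 20 bytes
sender.on_ack(120)
print(sender.send_base, len(sender.to_ip))   # 120 2
```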


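TCP retransmission scenarios

Lost ACK scenario:
Host A sends Seq=92, 8 bytes data
Host B replies ACK=100, but the ACK is lost (X)
The timer for Seq=92 times out; Host A resends Seq=92, 8 bytes data
Host B replies ACK=100 again; SendBase = 100

Premature timeout scenario:
Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
Host B replies ACK=100, then ACK=120
The timer for Seq=92 times out before the ACKs arrive; Host A resends Seq=92, 8 bytes data
The ACKs arrive: SendBase = 100, then SendBase = 120
Host B answers the duplicate segment with ACK=120; SendBase = 120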


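TCP retransmission scenarios (more)

Cumulative ACK scenario:
Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
Host B replies ACK=100, which is lost (X), then ACK=120
ACK=120 reaches Host A before the timeout, so nothing is retransmitted; SendBase = 120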


TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP Receiver action: Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments

Event at Receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
TCP Receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte

Event at Receiver: Arrival of segment that partially or completely fills gap
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap


                                          More on Sender Policies

Doubling the Timeout Interval
Used by most TCP implementations
If a timeout occurs, then after retransmission the Timeout Interval is doubled
Intervals grow exponentially with each consecutive timeout
When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
Limited form of Congestion Control
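A minimal sketch of the doubling rule, with a made-up `RetransmitTimer` class that holds only the interval. Real implementations also cap the interval, which the slide does not discuss and this sketch omits.

```python
class RetransmitTimer:
    """Holds the current timeout interval (any consistent time unit)."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.computed_interval()

    def computed_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        self.interval *= 2                          # back off after each timeout

    def on_restart(self):
        # New data from above, or an ACK arrived: go back to the formula.
        self.interval = self.computed_interval()


timer = RetransmitTimer(estimated_rtt=100, dev_rtt=25)
for _ in range(3):
    timer.on_timeout()
print(timer.interval)      # 1600
timer.on_restart()
print(timer.interval)      # 200
```

Three consecutive timeouts stretch a 200 ms interval to 1600 ms, so a sender facing a congested path retransmits less and less often until an ACK gets through.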


                                          Fast Retransmit

Time-out period often relatively long:
long delay before resending lost packet

Detect lost segments via duplicate ACKs
Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
fast retransmit: resend segment before timer expires


Fast retransmit algorithm:

event: ACK received, with ACK field value of y
    if (y > SendBase)
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    else    (a duplicate ACK for already ACKed segment)
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3)
            resend segment with sequence number y    (fast retransmit)
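The ACK handler above can be sketched as follows. The class name and the `retransmitted` list are illustrative only, the timer is left out, and resetting the duplicate counter when new data is ACKed is an assumption the pseudocode leaves implicit.

```python
class FastRetransmitSender:
    """ACK handling only: counts duplicate ACKs and records fast retransmits."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0
        self.retransmitted = []       # seq #s resent before any timeout

    def on_ack(self, y):
        if y > self.send_base:        # new data ACKed
            self.send_base = y
            self.dup_acks = 0         # assumption: counter restarts for the new y
        else:                         # duplicate ACK for already ACKed data
            self.dup_acks += 1
            if self.dup_acks == 3:
                self.retransmitted.append(y)   # fast retransmit of segment y


sender = FastRetransmitSender(send_base=92)
for ack in [100, 100, 100, 100, 100]:
    sender.on_ack(ack)
print(sender.send_base, sender.retransmitted)   # 100 [100]
```

The first ACK=100 acknowledges new data. The next three are duplicates and trigger exactly one retransmission of the segment starting at byte 100. The fourth duplicate does not trigger another.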


TCP: GBN or Selective Repeat?

                                          Basic TCP looks a lot like GBN

                                          Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                          This looks a lot like Selective Repeat

                                          TCP is a hybrid



                                          TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate the receiver's rate
Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer

app process may be slow at reading from buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
= RcvWindow
= RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
guarantees receive buffer doesn't overflow
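Both sides of the rule are simple arithmetic, sketched below. `LastByteSent` and `LastByteAcked` do not appear on the slide. They are assumed sender-side counters, so that the unACKed data is their difference, and the function names are invented.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises in each segment."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_allowance(last_byte_sent, last_byte_acked, advertised_window):
    """Bytes the sender may still put in flight without overflowing the receiver."""
    in_flight = last_byte_sent - last_byte_acked     # unACKed data
    return max(0, advertised_window - in_flight)


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                    # 2096
print(sender_allowance(5000, 4000, window))      # 1096
```

With 2000 bytes sitting unread in a 4096-byte buffer, the receiver advertises 2096 bytes. A sender with 1000 bytes already in flight may send 1096 more.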


                                          Technical Issue

Suppose RcvWindow=0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow>0
Receiver never sends RcvWindow>0 since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow>0 will be transmitted back to the sender


                                          Note on UDP

                                          UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost



                                          TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
seq. #s
buffers, flow control info (e.g. RcvWindow)
client: connection initiator
Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
specifies client_isn, the initial seq #
No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
ACKs received SYN
allocates buffers
Replies with client_isn+1 in ACK field to signal synchronization
Specifies server_isn
No application data


                                          TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
Allocates buffers
Can include application data
SYN=0 signals that the connection is established
server_isn+1 signals that it is synchronized

Message sequence:
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
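The field values in the three handshake segments can be derived from the two initial sequence numbers alone. The sketch below uses plain dictionaries for segments. The function name and the sample ISNs are made up for illustration.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    # Server ACKs the SYN with client_isn+1 and announces its own ISN.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Client ACKs the server's SYN; SYN=0 marks the connection as established.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```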


                                          TCP Connection Management (cont)

                                          Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN

[Figure: client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK, stays in timed wait, then is closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
• Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
• Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: teardown timeline between client and server. Both sides pass through "closing"; the client then enters "timed wait" before "closed"; the server becomes "closed" on receiving the final ACK.]
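The client side of the four teardown steps can be sketched as a tiny event-driven state machine. This is a simplified illustration, not a real TCP stack: the class `ClosingClient` and its methods are hypothetical, segments are represented as plain strings, and the state names FIN_WAIT_1 and FIN_WAIT_2 follow the TCP specification rather than the slide's figure.

```python
class ClosingClient:
    """Client-side view of connection teardown (simplified)."""

    def __init__(self):
        self.state = "ESTABLISHED"
        self.sent = []  # segments the client has sent, in order

    def close(self):
        # Step 1: the client sends a FIN control segment
        self.sent.append("FIN")
        self.state = "FIN_WAIT_1"

    def receive(self, segment):
        if self.state == "FIN_WAIT_1" and segment == "ACK":
            # Step 2 (first half): the server acknowledged our FIN
            self.state = "FIN_WAIT_2"
        elif self.state in ("FIN_WAIT_2", "TIMED_WAIT") and segment == "FIN":
            # Step 3: ACK the server's FIN; a repeated FIN (our ACK was
            # lost) is ACKed again while in timed wait
            self.sent.append("ACK")
            self.state = "TIMED_WAIT"

    def timed_wait_expired(self):
        # The client closes down after the timed wait
        if self.state == "TIMED_WAIT":
            self.state = "CLOSED"


if __name__ == "__main__":
    c = ClosingClient()
    c.close()
    c.receive("ACK")
    c.receive("FIN")
    c.receive("FIN")  # server retransmitted FIN because our ACK was lost
    c.timed_wait_expired()
    print(c.state, c.sent)
```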

TCP Connection Management (cont)

[Figures: example TCP server lifecycle; example TCP client lifecycle]

A few special cases

• We have not discussed what happens if both client and server decide to close down the connection at the same time.
• It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control
• manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
• a top-10 problem

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

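Here $\lambda_{in}$ is the rate of original data sent, $\lambda'_{in}$ is the offered load (original data plus retransmissions), and $\lambda_{out}$ is the rate delivered to the receiver.

• (a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput)
• (a) "magic" transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
• (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels, (a), (b) and (c)]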

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted.

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

• w segments, each with MSS bytes, sent in one RTT:

\[ \text{throughput} = \frac{w \cdot MSS}{RTT} \ \text{bytes/sec} \]
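As a quick numeric check of the throughput relation above, the following sketch evaluates it for a given window. The function name `window_throughput` and the example values are illustrative assumptions.

```python
def window_throughput(w: int, mss_bytes: float, rtt_s: float) -> float:
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w * mss_bytes / rtt_s


if __name__ == "__main__":
    # Example values (assumed): 10 segments of 1460 bytes, RTT of 100 ms
    rate = window_throughput(w=10, mss_bytes=1460, rtt_s=0.1)
    print(f"{rate:.0f} bytes/sec = {rate * 8 / 1e6:.3f} Mbps")
```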

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing (also known as congestion avoidance)

[Figure: congestion window of a long-lived TCP connection versus time, showing a sawtooth; vertical axis marked at 8, 16 and 24 Kbytes]

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  - Example: MSS = 500 bytes & RTT = 200 msec
  - initial rate = MSS/RTT = 4000 bits / 0.2 sec = 20 kbps
• available bandwidth may be >> MSS/RTT
  - desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  - double CongWin every RTT
  - done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. Host A sends one segment, then two segments one RTT later, then four segments.]
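The claim that incrementing CongWin once per ACK doubles it every RTT can be checked with a few lines. This is a minimal sketch under stated assumptions: the window is counted in segments, every segment sent in an RTT is ACKed individually (no delayed ACKs), and there is no loss or threshold. The function name `slow_start_rounds` is illustrative.

```python
def slow_start_rounds(num_rtts: int, initial_cwnd: int = 1) -> list:
    """Return the congestion window (in segments) at the start of each RTT."""
    cwnd = initial_cwnd
    history = []
    for _rtt in range(num_rtts):
        history.append(cwnd)
        acks_this_rtt = cwnd  # one ACK comes back per segment sent
        for _ack in range(acks_this_rtt):
            cwnd += 1  # increment CongWin by one segment for every ACK
    return history


if __name__ == "__main__":
    print(slow_start_rounds(5))
```

The first three entries correspond to the one, two and four segments shown in the slide's timeline.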

So far:
• Slow-Start ramps up exponentially
• Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

• The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".
• Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.
• TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                          The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
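The rows of the table translate fairly directly into an event-driven state machine. The sketch below is a simplified model, not a real TCP implementation: the class `RenoSender`, its method names, and the choice to measure CongWin in units of MSS are assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: windows are measured in units of one MSS


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = float("inf")  # "initially very large"
    state: str = "SS"                # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT overall
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_dup_ack()

    def on_triple_dup_ack(self):
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


if __name__ == "__main__":
    s = RenoSender()
    for _ in range(7):
        s.on_new_ack()
    s.on_timeout()
    print(s)
```

Calling `on_triple_dup_ack` from `on_timeout` instead would approximate the slide's description of Reno versus Tahoe in reverse: Tahoe applies the timeout action to both kinds of loss event.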

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT, because the window ramps linearly from W/2 to W between losses
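The average-throughput estimate above can be checked by averaging one sawtooth cycle. This is a minimal sketch, assuming the window is counted in segments, grows by one segment per RTT from W/2 up to W, and that W is even; the function name is illustrative.

```python
def average_sawtooth_window(W: int) -> float:
    """Average window over one AIMD cycle that ramps from W/2 up to W."""
    windows = list(range(W // 2, W + 1))  # one value per RTT
    return sum(windows) / len(windows)


if __name__ == "__main__":
    W = 20
    avg = average_sawtooth_window(W)
    print(avg, avg / W)  # average window and its ratio to W
```

Dividing the average window by the RTT gives the average throughput in segments per second.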

TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

• L = 2·10^{-10}. Wow!
• New versions of TCP for high-speed needed
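The two numbers in this example follow from the window-throughput relation and from solving the loss-rate formula for L. The sketch below reproduces them; the function names are illustrative, and the 1.22 constant is taken from the throughput formula on the slide.

```python
def required_window(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """In-flight segments needed: W = throughput * RTT / MSS."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


if __name__ == "__main__":
    w = required_window(10e9, 0.1, 1500)
    loss = required_loss_rate(10e9, 0.1, 1500)
    print(f"W = {w:.0f} segments, L = {loss:.2e}")
```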

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
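The convergence argument can be reproduced numerically. This is a toy model under stated assumptions: both sessions see the same RTT, both detect loss in the same round whenever their combined rate exceeds the link capacity, and rates are in arbitrary units. The function name `aimd_two_flows` is illustrative.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int):
    """Simulate two AIMD flows sharing one bottleneck; return final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both add one unit, which leaves the gap unchanged
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


if __name__ == "__main__":
    print(aimd_two_flows(1.0, 80.0, capacity=100.0, rounds=500))
```

Because each loss halves the difference between the two rates while additive increase preserves it, the flows drift toward the equal-share line regardless of where they start.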

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
• Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
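The arithmetic behind the parallel-connection example is a one-liner. The sketch assumes every TCP connection on the link receives an equal share, as in the fairness goal; the function name `new_app_share` is illustrative.

```python
def new_app_share(existing_connections: int, opened_by_app: int) -> float:
    """Fraction of link rate R obtained by an app opening several connections."""
    return opened_by_app / (existing_connections + opened_by_app)


if __name__ == "__main__":
    print(new_app_share(9, 1))   # one connection among ten
    print(new_app_share(9, 11))  # eleven connections among twenty
```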

TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

   where K is the number of windows of size W that cover the object

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
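Both fixed-window cases can be evaluated with one function by clamping the per-window stall time at zero. This is a sketch under an assumption the slides do not spell out here: K is taken to be the number of windows of W segments needed to cover the object, K = ceil(O / (W·S)). The function name and the example values are illustrative.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, W: int, rtt: float) -> float:
    """Latency for an O-bit object with a fixed window of W segments of S bits."""
    K = math.ceil(O / (W * S))           # assumed definition of K
    stall = S / R + rtt - W * S / R      # idle time after each full window
    # Case 1 (stall <= 0): ACKs return in time and the server never idles.
    # Case 2 (stall > 0): the server idles after each of the first K-1 windows.
    return 2 * rtt + O / R + (K - 1) * max(stall, 0.0)


if __name__ == "__main__":
    # Example values (assumed): 100 kbit object, 1000-bit segments, 1 Mbps, RTT 100 ms
    print(fixed_window_latency(O=100_000, S=1000, R=1e6, W=4, rtt=0.1))
    print(fixed_window_latency(O=1_000_000, S=1000, R=1e6, W=150, rtt=0.1))
```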

TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R} \]

where P is the number of times TCP idles at the server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline at client and server. Initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]
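The quantities K, Q and P and the resulting latency can be computed directly from their definitions. This is a sketch: the function name is illustrative, K and Q are found by simple loops rather than closed-form logarithms to avoid rounding issues, and the example values of S, R and RTT are assumptions chosen so that they reproduce the slide's K = 4, Q = 2, P = 2.

```python
def slow_start_latency(O: float, S: float, R: float, rtt: float):
    """Return (latency, K, Q, P) for an O-bit object sent under slow start."""
    # K: smallest k with S + 2S + ... + 2^(k-1) S >= O, i.e. (2^k - 1) S >= O
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows k (k = Q + 1 inside the loop) after which the
    # server would still idle, i.e. S/R + RTT - 2^(k-1) S/R > 0
    Q = 0
    while S / R + rtt - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


if __name__ == "__main__":
    # O/S = 15 segments; S/R = 0.1 s and RTT = 0.2 s are assumed values
    print(slow_start_latency(O=15_000, S=1000, R=10_000, rtt=0.2))
```

With these values the server idles 0.2 s after the first window and 0.1 s after the second, which matches the 0.3 s of idle time implied by the closed-form expression.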

TCP Latency Modeling (3)

Time from when server starts to send segment until server receives acknowledgement:

\[ \frac{S}{R} + RTT \]

Time to transmit the kth window:

\[ 2^{k-1}\frac{S}{R} \]

Idle time after the kth window:

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} \]

\begin{align*}
\text{delay} &= 2RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{align*}

[Figure: timeline at client and server. Initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\begin{align*}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{align*}

Calculation of Q, number of idles for infinite-size object, is similar.

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
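The three response-time expressions can be compared numerically. The sketch below deliberately leaves out the "sum of idle times" term, so it gives a lower bound rather than the full model; the function name, the dictionary keys, and the reading of 5 Kbytes as 40,000 bits are assumptions made for illustration.

```python
def http_response_times(O_bits: float, R_bps: float, rtt_s: float, M: int, X: int) -> dict:
    """Response times for the three HTTP modes, ignoring slow-start idle times."""
    transmit = (M + 1) * O_bits / R_bps  # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * rtt_s,
        "persistent": transmit + 3 * rtt_s,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * rtt_s,
    }


if __name__ == "__main__":
    O = 5 * 1000 * 8  # 5 Kbytes, taken here as 5000 bytes
    for rate in (28e3, 100e3, 1e6, 10e6):
        print(rate, http_response_times(O, rate, rtt_s=0.1, M=10, X=5))
```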

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time. Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.

Chapter 3: Summary

• principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
• instantiation and implementation in the Internet
  - UDP
  - TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                          • Chapter 3 Transport Layer last revised 160305
                                          • Chapter 3 outline
                                          • Transport services and protocols
                                          • Transport vs network layer
                                          • Transport-layer protocols
                                          • Chapter 3 outline
                                          • Multiplexingdemultiplexing
                                          • Multiplexingdemultiplexing
                                          • How demultiplexing works
                                          • Connectionless demultiplexing
                                          • Connectionless demux (cont)
                                          • Connection-oriented demux
                                          • Connection-oriented demux (cont)
                                          • Connection-oriented demux Threaded Web Server
                                          • Chapter 3 outline
                                          • UDP: User Datagram Protocol [RFC 768]
                                          • UDP: more
                                          • UDP checksum
                                          • Chapter 3 outline
                                          • Principles of Reliable data transfer
                                          • Reliable data transfer: getting started
                                          • Reliable data transfer: getting started
                                          • Incremental Improvements
                                          • rdt1.0: reliable transfer over a reliable channel
                                          • rdt2.0: channel with bit errors
                                          • rdt2.0: FSM specification
                                          • rdt2.0: operation with no errors
                                          • rdt2.0: error scenario
                                          • rdt2.0 has a fatal flaw!
                                          • rdt2.1: sender, handles garbled ACK/NAKs
                                          • rdt2.1: receiver, handles garbled ACK/NAKs
                                          • rdt2.1: discussion
                                          • rdt2.2: a NAK-free protocol
                                          • rdt2.2: sender, receiver fragments
                                          • rdt3.0: channels with errors and loss
                                          • rdt3.0 sender
                                          • rdt3.0 in action
                                          • rdt3.0 in action
                                          • Performance of rdt3.0
                                          • rdt3.0: stop-and-wait operation
                                          • Pipelined protocols
                                          • Pipelined protocols
                                          • Pipelining: increased utilization
                                          • Go-Back-N
                                          • GBN Sender
                                          • GBN: sender extended FSM
                                          • GBN: receiver extended FSM
                                          • More on receiver
                                          • GBN in action
                                          • Selective Repeat
                                          • Selective repeat: sender, receiver windows
                                          • Selective repeat
                                          • Selective repeat in action
                                          • Selective repeat: dilemma
                                          • Chapter 3 outline
                                          • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                          • More TCP Details
                                          • Even More TCP Details
                                          • TCP segment structure
                                          • TCP seq. #'s and ACKs
                                          • TCP Round Trip Time and Timeout
                                          • TCP Round Trip Time and Timeout
                                          • Example RTT estimation
                                          • TCP Round Trip Time and Timeout
                                          • Chapter 3 outline
                                          • TCP reliable data transfer
                                          • TCP sender events
                                          • TCP sender (simplified)
                                          • TCP retransmission scenarios
                                          • TCP retransmission scenarios (more)
                                          • TCP ACK generation [RFC 1122, RFC 2581]
                                          • More on Sender Policies
                                          • Fast Retransmit
                                          • Fast retransmit algorithm
                                          • TCP: GBN or Selective Repeat?
                                          • Chapter 3 outline
                                          • TCP Flow Control
                                          • TCP Flow Control
                                          • TCP segment structure
                                          • TCP Flow control: how it works
                                          • Technical Issue
                                          • Chapter 3 outline
                                          • TCP Connection Management
                                          • TCP Connection Management (cont)
                                          • TCP Connection Management (cont)
                                          • TCP Connection Management (cont)
                                          • TCP Connection Management (cont)
                                          • A few special cases
                                          • Chapter 3 outline
                                          • Principles of Congestion Control
                                          • Causes/costs of congestion: scenario 1
                                          • Causes/costs of congestion: scenario 2
                                          • Causes/costs of congestion: scenario 3
                                          • Causes/costs of congestion: scenario 3
                                          • Approaches towards congestion control
                                          • Case study: ATM ABR congestion control
                                          • Case study: ATM ABR congestion control
                                          • Chapter 3 outline
                                          • TCP Congestion Control
                                          • TCP AIMD
                                          • TCP Slow Start
                                          • TCP Slow Start (more)
                                          • Summary: TCP Congestion Control
                                          • The Big Picture
                                          • TCP sender congestion control
                                          • TCP throughput
                                          • TCP Futures
                                          • TCP Fairness
                                          • Why is TCP fair?
                                          • Fairness (more)
                                          • TCP Latency Modeling
                                          • Fixed Congestion Window (W)
                                          • Fixed congestion window (1)
                                          • Fixed congestion window (2)
                                          • TCP Latency Modeling: Slow Start (1)
                                          • TCP Latency Modeling: Slow Start (2)
                                          • TCP Latency Modeling (3)
                                          • TCP Latency Modeling (4)
                                          • HTTP Modeling
                                          • Chapter 3 Summary

Reliable data transfer: getting started

We'll:
• incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• consider only unidirectional data transfer
    • but control info will flow in both directions!
• use finite state machines (FSM) to specify sender, receiver

[Figure: FSM notation. An arrow from state 1 to state 2 is labeled with the event causing the state transition (above the line) and the actions taken on the state transition (below the line).]

state: when in this "state", next state is uniquely determined by next event
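The event/actions notation can be mirrored in code as a lookup table keyed by (state, event). The following is a minimal Python sketch, not part of the slides; the state names, event names, and the `step` helper are illustrative assumptions, and the entries anticipate the rdt2.0 sender developed later in this chapter.

```python
# Transition table: (state, event) -> (actions, next_state).
# All state and event names here are hypothetical labels.
TRANSITIONS = {
    ("wait_call", "rdt_send"): (["make_pkt", "udt_send"], "wait_ack_or_nak"),
    ("wait_ack_or_nak", "rcv_nak"): (["udt_send"], "wait_ack_or_nak"),
    # An empty action list plays the role of the "no action" symbol Λ.
    ("wait_ack_or_nak", "rcv_ack"): ([], "wait_call"),
}


def step(state, event):
    """Return (actions, next_state) for the given state and event.

    The next state is uniquely determined by the current state and the event,
    which is why a plain dictionary lookup is enough.
    """
    return TRANSITIONS[(state, event)]
```

Because every key maps to exactly one entry, the table cannot express an ambiguous transition, which matches the "uniquely determined" property stated above.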

Incremental Improvements

• rdt1.0: assumes every packet sent arrives, and no errors introduced in transmission
• rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces concept of ACK and NAK
• rdt2.1: deals with corrupted ACKs/NAKs
• rdt2.2: like rdt2.1 but does not need NAKs
• rdt3.0: allows packets to be lost

rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
    • no bit errors
    • no loss of packets
• separate FSMs for sender, receiver:
    • sender sends data into underlying channel
    • receiver reads data from underlying channel

sender (single state: Wait for call from above)
    rdt_send(data)
        packet = make_pkt(data)
        udt_send(packet)

receiver (single state: Wait for call from below)
    rdt_rcv(packet)
        extract(packet, data)
        deliver_data(data)

rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
    • recall: UDP checksum to detect bit errors
• the question: how to recover from errors:
    • acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
    • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
    • sender retransmits pkt on receipt of NAK
    • human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
    • error detection
    • receiver feedback: control msgs (ACK, NAK) rcvr->sender
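As a reminder of how checksum-based error detection can work, here is a minimal Python sketch of a 16-bit one's-complement checksum in the style of the UDP checksum. The packet layout (checksum first, then data) and the helper names `make_pkt` and `corrupt` are assumptions chosen to echo the FSM pseudocode; they are not a real packet format.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # wrap the carry around
    return ~total & 0xFFFF


def make_pkt(data: bytes) -> bytes:
    # Assumed layout: 2-byte checksum followed by the data.
    return internet_checksum(data).to_bytes(2, "big") + data


def corrupt(pkt: bytes) -> bool:
    # Summing the data words together with their checksum gives 0xFFFF,
    # so the complemented result is 0 for an undamaged packet.
    return internet_checksum(pkt) != 0
```

A single flipped bit always changes the one's-complement sum, so `corrupt` detects it; some multi-bit error patterns can still cancel out and go unnoticed.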

rdt2.0: FSM specification

sender
    state: Wait for call from above
        rdt_send(data)
            sndpkt = make_pkt(data, checksum)
            udt_send(sndpkt)
            -> Wait for ACK or NAK
    state: Wait for ACK or NAK
        rdt_rcv(rcvpkt) && isNAK(rcvpkt)
            udt_send(sndpkt)
        rdt_rcv(rcvpkt) && isACK(rcvpkt)
            Λ
            -> Wait for call from above

receiver
    state: Wait for call from below
        rdt_rcv(rcvpkt) && corrupt(rcvpkt)
            udt_send(NAK)
        rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
            extract(rcvpkt, data)
            deliver_data(data)
            udt_send(ACK)
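The two FSMs can be exercised together in a few lines. This is a hedged Python sketch, not the slides' code: the function name `rdt20_transfer` and the idea of listing which transmission attempts get garbled are assumptions made so the run is deterministic. As in rdt2.0 itself, ACKs and NAKs are assumed to arrive intact.

```python
def rdt20_transfer(messages, corrupt_on_attempts=()):
    """Simulate rdt2.0 over a channel that garbles data packets.

    `corrupt_on_attempts` holds the 0-based indices of the transmissions
    whose packet arrives with bit errors. Returns (delivered, feedback_log).
    """
    delivered, log, attempt = [], [], 0
    for data in messages:
        while True:
            garbled = attempt in corrupt_on_attempts  # channel flips bits
            attempt += 1
            if garbled:
                log.append("NAK")   # receiver: corrupt(rcvpkt) -> udt_send(NAK)
                continue            # sender: isNAK -> udt_send(sndpkt) again
            delivered.append(data)  # receiver: extract, deliver_data
            log.append("ACK")       # receiver: udt_send(ACK)
            break                   # sender: isACK -> wait for call from above
    return delivered, log
```

For example, `rdt20_transfer(["a", "b"], {1})` garbles the first transmission of `"b"`, so the feedback log shows one NAK between the two ACKs and both messages are still delivered once each.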

rdt2.0: operation with no errors

[Figure: the rdt2.0 FSMs with the no-error path highlighted. Sender: rdt_send(data) triggers sndpkt = make_pkt(data, checksum) and udt_send(sndpkt). Receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) triggers extract(rcvpkt, data), deliver_data(data), udt_send(ACK). Sender: rdt_rcv(rcvpkt) && isACK(rcvpkt) returns to "Wait for call from above".]

rdt2.0: error scenario

[Figure: the rdt2.0 FSMs with the error path highlighted. Receiver: rdt_rcv(rcvpkt) && corrupt(rcvpkt) triggers udt_send(NAK). Sender: rdt_rcv(rcvpkt) && isNAK(rcvpkt) triggers udt_send(sndpkt), i.e., a retransmission.]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
• retransmit, but this might cause retransmission of correctly received pkt! Receiver won't know about duplication!

Handling duplicates:
• sender adds sequence number (0, 1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt. Duplicate packet is one with same sequence # as previous packet

stop and wait: sender sends one packet, then waits for receiver response

Sender: whenever sender receives a control message, it sends a packet to receiver
• a valid ACK: sends next packet (if one exists) with new sequence #
• a NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
• if received packet is corrupt: send NAK
• if received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
• if received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #
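The receiver rules above fit in a small class. This is a minimal Python sketch under stated assumptions: the class name `AlternatingBitReceiver`, the `is_corrupt` flag (standing in for a checksum test), and the string replies are all illustrative, not taken from the slides.

```python
class AlternatingBitReceiver:
    """Receiver side of the 1-bit sequence number scheme.

    Corrupt packets are NAKed, valid packets are ACKed, and only packets
    whose sequence bit differs from the previous one are delivered up.
    """

    def __init__(self):
        self.expected = 0    # sequence bit of the next new packet
        self.delivered = []  # data passed to the upper layer

    def receive(self, seq, data, is_corrupt=False):
        if is_corrupt:
            return "NAK"
        if seq == self.expected:
            # New packet: deliver it and flip the expected bit.
            self.delivered.append(data)
            self.expected ^= 1
        # A duplicate (same bit as the previous packet) is ACKed, not delivered.
        return "ACK"
```

Note that the reply carries no sequence number, matching the note above; the sender only learns "OK" or "not OK" for its current packet.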

rdt2.1: sender, handles garbled ACK/NAKs

state: Wait for call 0 from above
    rdt_send(data)
        sndpkt = make_pkt(0, data, checksum)
        udt_send(sndpkt)
        -> Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
        Λ
        -> Wait for call 1 from above
state: Wait for call 1 from above
    rdt_send(data)
        sndpkt = make_pkt(1, data, checksum)
        udt_send(sndpkt)
        -> Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
        Λ
        -> Wait for call 0 from above

rdt2.1: receiver, handles garbled ACK/NAKs

state: Wait for 0 from below
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
        extract(rcvpkt, data)
        deliver_data(data)
        sndpkt = make_pkt(ACK, chksum)
        udt_send(sndpkt)
        -> Wait for 1 from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        sndpkt = make_pkt(NAK, chksum)
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
        sndpkt = make_pkt(ACK, chksum)
        udt_send(sndpkt)
state: Wait for 1 from below
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
        extract(rcvpkt, data)
        deliver_data(data)
        sndpkt = make_pkt(ACK, chksum)
        udt_send(sndpkt)
        -> Wait for 0 from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        sndpkt = make_pkt(NAK, chksum)
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
        sndpkt = make_pkt(ACK, chksum)
        udt_send(sndpkt)

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0, 1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
    • state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is duplicate
    • state indicates whether 0 or 1 is expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
    • receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s are included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment
    state: Wait for call 0 from above
        rdt_send(data)
            sndpkt = make_pkt(0, data, checksum)
            udt_send(sndpkt)
            -> Wait for ACK 0
    state: Wait for ACK 0
        rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
            udt_send(sndpkt)
        rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
            Λ

receiver FSM fragment
    state: Wait for 0 from below
        rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
            udt_send(sndpkt)
    transition into Wait for 0 from below
        rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
            extract(rcvpkt, data)
            deliver_data(data)
            sndpkt = make_pkt(ACK1, chksum)
            udt_send(sndpkt)
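The NAK-free idea can be sketched as follows in Python. The class and function names are hypothetical, the `is_corrupt` and `ack_corrupt` flags stand in for checksum tests, and the initial `last_ack = 1` is an assumption: it mirrors the fragment's behavior of re-sending ACK1 when a bad packet arrives while waiting for seq 0.

```python
class NakFreeReceiver:
    """rdt2.2-style receiver: always answers with an ACK carrying a seq #."""

    def __init__(self):
        self.expected = 0
        self.last_ack = 1    # assumption: nothing received yet, so re-ACK seq 1
        self.delivered = []

    def receive(self, seq, data, is_corrupt=False):
        if not is_corrupt and seq == self.expected:
            self.delivered.append(data)
            self.last_ack = seq
            self.expected ^= 1
        # Corrupt or duplicate packet: repeat the ACK for the last good packet.
        return ("ACK", self.last_ack)


def sender_should_retransmit(ack, current_seq, ack_corrupt=False):
    """A garbled ACK or an ACK for the other seq # is treated like a NAK."""
    return ack_corrupt or ack[1] != current_seq
```

The design point is that the sequence number in the ACK carries the information a NAK used to carry: an ACK for the wrong packet tells the sender its current packet did not get through.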

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until certain data or ACK lost, then retransmits
• yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions are only triggered by timeouts)
• if pkt (or ACK) just delayed (not lost):
    • retransmission will be duplicate, but use of seq. #'s already handles this
    • receiver must specify seq # of pkt being ACKed
• requires countdown timer

rdt3.0 sender

state: Wait for call 0 from above
    rdt_send(data)
        sndpkt = make_pkt(0, data, checksum)
        udt_send(sndpkt)
        start_timer
        -> Wait for ACK0
    rdt_rcv(rcvpkt)
        Λ
state: Wait for ACK0
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
        Λ
    timeout
        udt_send(sndpkt)
        start_timer
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
        stop_timer
        -> Wait for call 1 from above
state: Wait for call 1 from above
    rdt_send(data)
        sndpkt = make_pkt(1, data, checksum)
        udt_send(sndpkt)
        start_timer
        -> Wait for ACK1
    rdt_rcv(rcvpkt)
        Λ
state: Wait for ACK1
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
        Λ
    timeout
        udt_send(sndpkt)
        start_timer
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
        stop_timer
        -> Wait for call 0 from above
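To see how timeouts and sequence numbers work together, here is a minimal Python simulation of the stop-and-wait exchange. It is a sketch under simplifying assumptions, not the slides' code: the function name and the `lost_data` / `lost_acks` sets (0-based indices of data transmissions whose packet or ACK is dropped) are hypothetical, there is no corruption, and a loss is modeled as an immediate timer expiry.

```python
def rdt30_transfer(messages, lost_data=(), lost_acks=()):
    """Return (delivered, transmissions_used) for a lossy stop-and-wait run."""
    delivered = []
    expected = 0  # receiver: sequence bit of the next new packet
    seq = 0       # sender: sequence bit of the current packet
    tx = 0        # number of data transmissions so far
    for data in messages:
        while True:
            attempt = tx
            tx += 1                    # udt_send(sndpkt); start_timer
            if attempt in lost_data:
                continue               # packet lost: timeout, retransmit
            if seq == expected:        # receiver sees a new packet
                delivered.append(data)
                expected ^= 1
            # Otherwise it is a duplicate: the receiver re-ACKs, no delivery.
            if attempt in lost_acks:
                continue               # ACK lost: timeout, retransmit
            seq ^= 1                   # ACK received: stop_timer, next seq
            break
    return delivered, tx
```

With `lost_data={1}` and `lost_acks={3}`, three messages need five transmissions, and the retransmission caused by the lost ACK is recognized as a duplicate by its sequence bit, so each message is delivered exactly once.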

rdt3.0 in action

[Figures: sender/receiver timing diagrams, spread over two slides]

Performance of rdt3.0

• rdt3.0 works, but performance stinks
• example: 1 Gbps link, 15 ms e-e prop. delay (so RTT = 30 ms), 1KB packet:

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

• U_sender: utilization, the fraction of time sender is busy sending
• 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
• network protocol limits use of physical resources!
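The arithmetic above can be checked with a few lines of Python. The function name is an illustrative choice; the inputs are the slide's example values (8000-bit packet, 1 Gbps link, 30 ms RTT).

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return t_transmit / (rtt_s + t_transmit)


L_BITS = 8000   # 1 KB packet, counted as 8000 bits as on the slide
R_BPS = 1e9     # 1 Gbps link
RTT_S = 0.030   # 2 x 15 ms propagation delay

utilization = stop_and_wait_utilization(L_BITS, R_BPS, RTT_S)
# One 1000-byte packet is completed every RTT + L/R seconds.
throughput_bytes_per_s = 1000 / (RTT_S + L_BITS / R_BPS)
```

Here `utilization` comes out at about 0.00027 and `throughput_bytes_per_s` at about 33,000, in line with the figures on the slide.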

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram with the following events]
• first packet bit transmitted, t = 0
• last packet bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Pipelined protocols

• Advantage: much better bandwidth utilization than stop-and-wait
• Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data
    • Two generic approaches to solving this:
        • Go-Back-N protocols
        • selective repeat protocols
• Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with the following events]
• first packet bit transmitted, t = 0
• last bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
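The pipelined utilization figure generalizes to any number of in-flight packets. A small Python sketch, with a hypothetical function name and the same example values as the stop-and-wait case (8000-bit packets, 1 Gbps, 30 ms RTT); the cap at 1.0 is an added assumption, since a sender cannot be busy more than all of the time.

```python
def pipelined_utilization(n_pkts, packet_bits, rate_bps, rtt_s):
    """Sender utilization with n_pkts sent back to back per round trip:
    (n * L/R) / (RTT + L/R), capped at 1.0."""
    t_transmit = packet_bits / rate_bps
    return min(1.0, n_pkts * t_transmit / (rtt_s + t_transmit))


u1 = pipelined_utilization(1, 8000, 1e9, 0.030)  # stop-and-wait
u3 = pipelined_utilization(3, 8000, 1e9, 0.030)  # three packets in flight
```

With these inputs `u3` is about 0.0008, three times `u1`, which is the factor-of-3 gain stated above.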

Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to and including seq # n, a "cumulative ACK"
    • may receive duplicate ACKs (see receiver)
• only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• called a sliding-window protocol

GBN Sender

• rdt_send() called: checks to see if window is full
    • No: send out packet
    • Yes: return data to application level
• Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer (or stops it if no unacknowledged packets remain)
• Timeout: resends ALL packets that have been sent but not yet acknowledged
    • this is the only event that triggers a resend

GBN: sender extended FSM

initial transition
    base = 1
    nextseqnum = 1

state: Wait
    rdt_send(data)
        if (nextseqnum < base+N)
            sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
            udt_send(sndpkt[nextseqnum])
            if (base == nextseqnum)
                start_timer
            nextseqnum++
        else
            refuse_data(data)
    timeout
        start_timer
        udt_send(sndpkt[base])
        udt_send(sndpkt[base+1])
        …
        udt_send(sndpkt[nextseqnum-1])
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
        base = getacknum(rcvpkt)+1
        if (base == nextseqnum)
            stop_timer
        else
            start_timer
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
        Λ
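The extended FSM translates fairly directly into a class. The following Python sketch makes several assumptions that are not in the slides: packets are `(seq, data)` tuples without a checksum, the channel is a `udt_send` callback supplied by the caller, the timer is reduced to a boolean flag, and `on_ack` guards against stale ACKs with `max`.

```python
class GBNSender:
    """Go-Back-N sender with window size N and a single logical timer."""

    def __init__(self, window, udt_send):
        self.N = window
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}          # seq -> packet, kept until ACKed
        self.udt_send = udt_send  # hypothetical unreliable-channel callback
        self.timer_running = False

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False          # window full: refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True  # start_timer for oldest unACKed pkt
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is confirmed.
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)  # guard: an addition to the FSM
        # stop_timer if nothing is outstanding, otherwise (re)start it.
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])  # resend every unACKed packet
```

A usage example: with `sent = []` and `s = GBNSender(3, sent.append)`, a fourth `rdt_send` is refused until an ACK slides the window forward, and `on_timeout` resends every packet from `base` to `nextseqnum - 1`.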

GBN: receiver extended FSM

initial transition
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

state: Wait
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
        extract(rcvpkt, data)
        deliver_data(data)
        sndpkt = make_pkt(expectedseqnum, ACK, chksum)
        udt_send(sndpkt)
        expectedseqnum++
    default
        udt_send(sndpkt)

• If expected packet received: send ACK and deliver packet upstairs
• If out-of-order packet received:
    • discard (don't buffer) -> no receiver buffering!
    • re-ACK pkt with highest in-order seq #
    • may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
    receiver only sends ACKs (no NAKs)
    can generate duplicate ACKs
    need only remember expectedseqnum
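A minimal Python sketch of the receiver behaviour described above. The class name GBNReceiver, the deliver_data callback and the corrupt flag (standing in for a checksum test) are assumptions made for illustration.

```python
class GBNReceiver:
    """Go-Back-N receiver: the only state kept is expectedseqnum and the last ACK."""

    def __init__(self, deliver_data):
        self.deliver_data = deliver_data   # hypothetical upcall to the layer above
        self.expectedseqnum = 1
        self.last_ack = 0                  # matches the initial make_pkt(0, ACK, chksum)

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number sent in reply."""
        if not corrupt and seq == self.expectedseqnum:
            self.deliver_data(data)        # in-order: deliver upstairs
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # Otherwise the packet is discarded (no buffering) and the previous
        # ACK is sent again, which may produce duplicate ACKs.
        return self.last_ack
```

An out-of-order packet leaves expectedseqnum unchanged, so the sender keeps seeing the same ACK number until the missing packet arrives.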

GBN in action

GBN is easy to code but may have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of a huge amount of data.

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission of only the required packets.

Selective Repeat

receiver individually acknowledges all correctly received pkts
    buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which ACK not received
    sender timer for each unACKed pkt
    compare to GBN, which only had a timer for the base packet

sender window
    N consecutive seq #s
    again limits seq #s of sent, unACKed pkts
    Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender

data from above:
    if next available seq # in window, send pkt

timeout(n):
    resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)

otherwise:
    ignore
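The receiver rules for selective repeat can be sketched as follows. This is an illustration under simplifying assumptions: the names SRReceiver and deliver_data are hypothetical, and sequence numbers are unbounded integers, so wraparound is not modelled.

```python
class SRReceiver:
    """Selective-repeat receiver: ACKs each packet individually and buffers out-of-order ones."""

    def __init__(self, window_size, deliver_data):
        self.N = window_size
        self.deliver_data = deliver_data   # hypothetical upcall to the layer above
        self.rcvbase = 0
        self.buffer = {}                   # seq # -> data waiting for in-order delivery

    def receive(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer.setdefault(n, data)
            # Deliver any run of in-order packets and advance the window.
            while self.rcvbase in self.buffer:
                self.deliver_data(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n                       # already delivered: re-ACK
        return None                        # otherwise: ignore
```

The re-ACK branch matters because the sender's window only advances once it has an ACK for its base packet, even if the receiver delivered that packet long ago.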

Selective repeat in action

Selective repeat: dilemma

Example:
    seq #s: 0, 1, 2, 3
    window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
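One way to explore the question is to check when the sender's old window and the receiver's new window can contain the same sequence number. The sketch below assumes the worst case, in which the receiver has received a full window but every ACK was lost; the function name windows_overlap is an illustrative choice.

```python
def windows_overlap(seq_space, window):
    """True if a retransmitted old packet could be mistaken for a new one."""
    # Sender may still retransmit packets 0 .. window-1 (all ACKs lost).
    old = {n % seq_space for n in range(0, window)}
    # Receiver has advanced and now expects packets window .. 2*window-1.
    new = {n % seq_space for n in range(window, 2 * window)}
    return bool(old & new)


# The example from the slide: seq #s 0..3 with window size 3 is ambiguous,
# while window size 2 is safe.
ambiguous = windows_overlap(4, 3)
safe = not windows_overlap(4, 2)
```

Under this worst case the two windows are disjoint exactly when the window size is at most half the size of the sequence-number space.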


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
    one sender, one receiver

reliable, in-order byte stream:
    no "message boundaries"

pipelined:
    TCP congestion and flow control set window size

send & receive buffers

full duplex data:
    bi-directional data flow in same connection
    MSS: maximum segment size

connection-oriented:
    handshaking (exchange of control msgs) initializes sender, receiver state before data exchange

flow controlled:
    sender will not overwhelm receiver

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the application reads data through its socket door]

More TCP Details

Maximum Segment Size (MSS)
    depends upon implementation (can often be set)
    the max amount of application-layer data in a segment
    Application Data + TCP Header = TCP Segment

Three-way handshake
    Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
    Server responds with a second special TCP segment (again no payload).
    Client responds with a third special segment. This can contain payload.

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
    (i) buffers
    (ii) variables and
    (iii) a socket connection to the process

TCP only exists in the two end machines
    no buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

Header layout (each row is 32 bits wide):

    source port #                  | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum                       | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
    sequence number, acknowledgement number: counting by bytes of data (not segments)
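To make the layout concrete, the fixed 20-byte part of the header can be packed and parsed with Python's struct module. This is a sketch only: the function names are illustrative, options are not handled, and the checksum is carried as a plain number and never computed.

```python
import struct

# Fixed 20-byte TCP header in network byte order: ports (2 x 16 bits),
# seq and ack (2 x 32 bits), header length + flags (16 bits),
# receive window, checksum, urgent pointer (3 x 16 bits).
TCP_HEADER = struct.Struct("!HHIIHHHH")
FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def pack_header(src_port, dst_port, seq, ack, flags, window, checksum=0, urg_ptr=0):
    # Header length is stored in 32-bit words in the top 4 bits; 5 words = no options.
    offset_flags = (5 << 12) | sum(FLAGS[f] for f in flags)
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_flags, window, checksum, urg_ptr)


def unpack_header(raw):
    src, dst, seq, ack, offset_flags, window, checksum, urg_ptr = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len_bytes": (offset_flags >> 12) * 4,
        "flags": {name for name, bit in FLAGS.items() if offset_flags & bit},
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
    }
```

For example, unpack_header(pack_header(1234, 80, 42, 79, ["ACK"], 4096)) recovers the same ports, numbers and window, with a header length of 20 bytes.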

TCP seq. #s and ACKs

Seq. #s:
    byte stream "number" of first byte in segment's data

ACKs:
    seq # of next byte expected from other side
    cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (Host A, Host B; time runs downward):
    User at A types 'C'
    A -> B: Seq=42, ACK=79, data = 'C'
    B ACKs receipt of 'C', echoes back 'C'
    B -> A: Seq=79, ACK=43, data = 'C'
    A ACKs receipt of echoed 'C'
    A -> B: Seq=43, ACK=80
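The numbers in the telnet scenario follow from two rules: the reply's seq # is the ACK # just received, and the reply's ACK # is the received seq # plus the number of data bytes. A small sketch (the function name reply_numbers is an illustrative choice):

```python
def reply_numbers(seq, ack, data_len):
    """Given a received segment, return (seq, ack) for the segment sent in reply."""
    return ack, seq + data_len


# Host A sends Seq=42, ACK=79 with one byte of data ('C').
b_reply = reply_numbers(42, 79, 1)        # Host B's echo: Seq=79, ACK=43
# Host B's echo also carries one byte, so A acknowledges it.
a_reply = reply_numbers(*b_reply, 1)      # Host A's ACK: Seq=43, ACK=80
```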

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
    longer than RTT, but RTT varies
    too short: premature timeout, unnecessary retransmissions
    too long: slow reaction to segment loss

Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
    SampleRTT will vary, want estimated RTT "smoother"
        average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

Exponential weighted moving average
    influence of a past sample decreases exponentially fast
    typical value: α = 0.125
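The update rule is a one-liner. The sketch below also shows the decaying influence of an old sample; the function name and the sample values are made up for illustration.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """One step of the exponential weighted moving average."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


# Start at 100 ms, see one spike of 200 ms, then samples back at 100 ms.
estimate = 100.0
history = []
for sample in [200.0, 100.0, 100.0, 100.0]:
    estimate = update_estimated_rtt(estimate, sample)
    history.append(estimate)
```

After the spike the estimate is 112.5 ms, and each later 100 ms sample shrinks the remaining excess by a factor of (1 - α), which is the exponential decay mentioned above.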

Example RTT estimation

[Figure: SampleRTT and EstimatedRTT for gaia.cs.umass.edu to fantasia.eurecom.fr; y-axis: RTT (milliseconds), 100 to 350; x-axis: time (seconds)]

TCP Round Trip Time and Timeout

Setting the timeout
    EstimatedRTT plus "safety margin"
        large variation in EstimatedRTT -> larger safety margin
    first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4*DevRTT
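A sketch combining the formulas. Two details are assumptions, not taken from the slides: DevRTT starts at 0, and DevRTT is updated using the EstimatedRTT value from before the current sample is folded in. The class name RttEstimator is illustrative.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval from them."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = 0.0                     # assumption: no deviation seen yet

    def update(self, sample_rtt):
        # Deviation is measured against the previous estimate (assumed ordering).
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from a 100 ms estimate, a single 200 ms sample gives DevRTT = 25 ms, EstimatedRTT = 112.5 ms and a timeout of 212.5 ms, so one outlier widens the safety margin considerably.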


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
    pipelined segments
    cumulative ACKs
    TCP uses a single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate ACKs

Initially consider a simplified TCP sender:
    ignore duplicate ACKs
    ignore flow control, congestion control

TCP sender events

data rcvd from app:
    create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unACKed segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

ACK rcvd:
    if it acknowledges previously unACKed segments
        update what is known to be ACKed
        start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
    SendBase-1: last cumulatively ACKed byte
Example:
    SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
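The simplified sender can be sketched in Python as an event-driven class. The names SimplifiedTcpSender and send_to_ip are assumptions, the timer is modelled as a boolean flag, and the sketch also stops the timer once nothing is outstanding, which the pseudocode leaves implicit.

```python
class SimplifiedTcpSender:
    """Simplified TCP sender: cumulative ACKs, a single timer, no flow or congestion control."""

    def __init__(self, initial_seq_num, send_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.send_to_ip = send_to_ip   # hypothetical callback taking (seq, data)
        self.unacked = {}              # seq # of first byte -> segment data
        self.timer_running = False

    def data_from_app(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True          # start timer
        self.send_to_ip(seq, data)
        self.next_seq_num += len(data)

    def timeout(self):
        if self.unacked:
            seq = min(self.unacked)            # oldest not-yet-acknowledged segment
            self.send_to_ip(seq, self.unacked[seq])
            self.timer_running = True          # restart timer

    def ack_received(self, y):
        if y > self.send_base:                 # ACK covers new data
            self.send_base = y
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in done:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
```

Feeding it 8 bytes and then 20 bytes starting at sequence number 92 produces segments with Seq=92 and Seq=100; a single ACK=120 then acknowledges both.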


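TCP: retransmission scenarios

Lost ACK scenario (Host A sends to Host B; time runs downward):
    A -> B: Seq=92, 8 bytes data
    B -> A: ACK=100 (lost, X)
    Seq=92 timeout at A
    A -> B: Seq=92, 8 bytes data (retransmission)
    B -> A: ACK=100
    SendBase = 100

Premature timeout scenario:
    A -> B: Seq=92, 8 bytes data
    A -> B: Seq=100, 20 bytes data
    B -> A: ACK=100
    B -> A: ACK=120
    Seq=92 timeout at A, before the ACKs arrive
    A -> B: Seq=92, 8 bytes data (retransmission)
    ACK=100 arrives: SendBase = 100
    ACK=120 arrives: SendBase = 120
    B -> A: ACK=120 (in reply to the retransmission)
    SendBase = 120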

TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A sends to Host B; time runs downward):
    A -> B: Seq=92, 8 bytes data
    A -> B: Seq=100, 20 bytes data
    B -> A: ACK=100 (lost, X)
    B -> A: ACK=120 (arrives before the timeout)
    SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK; wait up to 500 ms for next segment; if no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
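The four rules can be sketched as a small decision procedure. This is an illustration under assumptions: the receiver buffers out-of-order segments (the spec leaves this to the implementer), the 500 ms wait is represented by an explicit on_delay_timer call, and the names AckPolicy, on_segment and on_delay_timer are hypothetical.

```python
class AckPolicy:
    """Decides, per arriving segment, whether to delay or immediately send an ACK."""

    def __init__(self, expected_seq):
        self.expected = expected_seq   # seq # of the next in-order byte
        self.ack_pending = False       # a delayed ACK is waiting to be sent
        self.out_of_order = {}         # seq # -> length of segments buffered above a gap

    def on_segment(self, seq, length):
        """Return ("delay", ack) or ("send", ack) for an arriving segment."""
        if seq != self.expected:
            if seq > self.expected:
                self.out_of_order[seq] = length    # gap detected: buffer the segment
            self.ack_pending = False
            return ("send", self.expected)         # duplicate ACK for next expected byte
        had_gap = bool(self.out_of_order)
        self.expected += length
        while self.expected in self.out_of_order:  # segment filled the lower end of a gap
            self.expected += self.out_of_order.pop(self.expected)
        if had_gap or self.ack_pending:
            self.ack_pending = False
            return ("send", self.expected)         # immediate cumulative ACK
        self.ack_pending = True
        return ("delay", self.expected)            # wait up to 500 ms for the next segment

    def on_delay_timer(self):
        """Called when the delayed-ACK wait expires."""
        if self.ack_pending:
            self.ack_pending = False
            return ("send", self.expected)
        return None
```

Delaying the first ACK lets one cumulative ACK cover two segments, while the immediate duplicate ACK on a gap gives the sender early evidence of a possible loss.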

More on Sender Policies

Doubling the Timeout Interval
    used by most TCP implementations
    if a timeout occurs, then after the retransmission the Timeout Interval is doubled
    intervals grow exponentially with each consecutive timeout
    when the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    a limited form of congestion control
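A minimal sketch of the doubling policy. The class and method names are assumptions; the value passed to reset stands for the interval computed from EstimatedRTT and DevRTT.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, computed_interval):
        self.interval = computed_interval   # EstimatedRTT + 4*DevRTT

    def on_timeout(self):
        self.interval *= 2                  # exponential growth on consecutive timeouts
        return self.interval

    def reset(self, computed_interval):
        # New data from above or an ACK received: go back to the computed value.
        self.interval = computed_interval
        return self.interval
```

Three consecutive timeouts starting from 1 second give intervals of 2, 4 and 8 seconds; the next ACK returns the interval to the computed value.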

Fast Retransmit

Time-out period often relatively long:
    long delay before resending lost packet

Detect lost segments via duplicate ACKs
    sender often sends many segments back-to-back
    if a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
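The duplicate-ACK counting can be sketched as below. Timer handling is left out to keep the focus on the counter, and the names FastRetransmitSender and resend are illustrative assumptions.

```python
class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK."""

    def __init__(self, send_base, resend):
        self.send_base = send_base
        self.resend = resend        # hypothetical callback: resend segment starting at seq y
        self.dup_acks = 0

    def ack_received(self, y):
        if y > self.send_base:
            self.send_base = y      # new data acknowledged
            self.dup_acks = 0
        else:
            self.dup_acks += 1      # duplicate ACK for already ACKed data
            if self.dup_acks == 3:
                self.resend(y)      # fast retransmit, before the timer expires
```

Because the test is for exactly 3, a fourth or fifth duplicate ACK for the same y does not trigger another retransmission.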

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN.

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range.

This looks a lot like Selective Repeat.

TCP is a hybrid.


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data.
If necessary, sender should slow down its transmission rate to accommodate the receiver's rate.
Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
    app process may be slow at reading from buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments

Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
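A worked sketch of the two sides of this rule. The receiver formula comes from the slide; the sender-side names last_byte_sent and last_byte_acked are assumptions introduced here to express "unACKed data".

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(last_byte_sent, last_byte_acked, nbytes, window):
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= window


# A 4096-byte buffer holding 2000 unread bytes leaves 2096 bytes of spare room.
window = rcv_window(4096, last_byte_rcvd=3000, last_byte_read=1000)
```

With 1000 bytes already in flight and a window of 2096, the sender may send another 1000 bytes but not another 1200.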

Technical Issue

Suppose RcvWindow=0 and that the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow>0.
Receiver never sends RcvWindow>0, since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow>0 will be transmitted back to the sender.

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
    initialize TCP variables:
        seq #s
        buffers, flow control info (e.g. RcvWindow)
    client: connection initiator
        Socket clientSocket = new Socket(hostname, portNumber);
    server: contacted by client
        Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    replies with client_isn+1 in ACK field to signal synchronization
    specifies server_isn
    no application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
    allocates buffers
    can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

Message sequence (client, server):
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
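The sequence and acknowledgement numbers of the three segments follow directly from the two initial sequence numbers. A small sketch (the function name and the dictionary representation of a segment are illustrative choices):

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    # The server acknowledges the SYN by asking for client_isn + 1.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # The client acknowledges the SYNACK; this segment may also carry data.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]
```

For example, three_way_handshake(1000, 5000) yields a final segment with seq=1001 and ack=5001.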

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait; closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK

• enters "timed wait", during which it will respond with an ACK to any received FIN (a FIN may arrive again if the ACK gets lost)
• closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: connection teardown timeline. Client and server are each marked "closing" after sending their FINs, the client passes through timed wait, and both ends finish in "closed".]
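The client side of the four closing steps can be sketched as a small table-driven state machine. The state labels follow the wording on the slides ("closing", "timed wait", "closed"); the event names and the transition table are assumptions made for illustration, not the state names of any real TCP implementation.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_TRANSITIONS = {
    ("established", "app close"): ("closing", "FIN"),       # step 1
    ("closing", "recv ACK"): ("closing", None),             # server acknowledged our FIN
    ("closing", "recv FIN"): ("timed wait", "ACK"),         # step 3
    ("timed wait", "recv FIN"): ("timed wait", "ACK"),      # our ACK was lost, FIN repeated
    ("timed wait", "timer expires"): ("closed", None),
}


def run_client(events, state="established"):
    """Feed events to the client and collect the segments it sends."""
    sent = []
    for event in events:
        state, segment = CLIENT_TRANSITIONS[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent


# Scenario: the client's first ACK is lost, so the server's FIN arrives twice.
final_state, sent = run_client(
    ["app close", "recv ACK", "recv FIN", "recv FIN", "timer expires"]
)
print(final_state, sent)
```

The repeated ACK in the timed-wait state is the reason the client does not close immediately after acknowledging the server's FIN.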

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                            A few special cases

                                            Have not discussed what happens if both client and server decide to close down connection at same time

                                            It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control
• manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
• a top-10 problem

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2

[Figure: throughput and delay versus send rate]
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ the offered load (original data plus retransmissions), and $\lambda_{out}$ the goodput.

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
• (b) and (c): more work (retrans) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  NI bit: no increase in rate (mild congestion)
  CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact (small exception: see next page)

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
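A minimal sketch of how the ER field and CI bit described above could evolve as one RM cell crosses a path. The `RMCell` class, the function names, and the rates are hypothetical; only the two rules (a switch may lower ER, the destination sets CI when the preceding data cell had EFCI=1) come from the slides.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float         # explicit rate field (units are arbitrary here)
    ni: bool = False  # no-increase bit (mild congestion)
    ci: bool = False  # congestion-indication bit (severe congestion)


def pass_through_switches(cell: RMCell, supportable_rates) -> RMCell:
    """Each switch may lower ER to the rate it can currently support."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def return_from_destination(cell: RMCell, preceding_data_cell_efci: bool) -> RMCell:
    """Destination sets CI if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


# Hypothetical path with three switches; the sender asks for rate 100.
cell = RMCell(er=100.0)
cell = pass_through_switches(cell, [80.0, 35.0, 60.0])
cell = return_from_destination(cell, preceding_data_cell_efci=True)
print(cell)
```

When the cell returns, ER holds the minimum supportable rate on the path, which is the rate the sender may use.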

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

• w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot MSS}{RTT}$ bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events
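The sender-side limit LastByteSent - LastByteAcked ≤ CongWin can be read as a budget of bytes the sender may still put in flight. A minimal sketch, assuming (as the slide does) that the receive buffer never constrains the sender; the function name and the byte counts are hypothetical.

```python
def bytes_allowed(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """How many more bytes may be sent without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked  # unacknowledged bytes
    return max(0, cong_win - in_flight)


# 2000 bytes in flight against a 4000-byte congestion window
room = bytes_allowed(last_byte_sent=3000, last_byte_acked=1000, cong_win=4000)
# window already exceeded (for example, right after CongWin was cut)
no_room = bytes_allowed(last_byte_sent=5000, last_byte_acked=1000, cong_win=4000)
print(room, no_room)
```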

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window of a long-lived TCP connection versus time, a sawtooth with axis ticks at 8, 16 and 24 Kbytes]

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec, so initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. Host A sends one segment in the first RTT, two segments in the second, and four segments in the third.]
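The doubling per RTT follows directly from the per-ACK increment, which a short simulation makes visible. This is a sketch under simplifying assumptions: no loss, no threshold, every segment is ACKed within the same RTT, and the function name is hypothetical. The MSS and RTT values are the ones from the slow-start example (500 bytes, 200 msec).

```python
def slow_start_windows(mss: int, rounds: int) -> list:
    """CongWin (bytes) at the start of each RTT, with no loss and no threshold."""
    cong_win = mss
    history = []
    for _ in range(rounds):
        history.append(cong_win)
        acks = cong_win // mss   # one ACK per segment sent during this RTT
        for _ in range(acks):
            cong_win += mss      # CongWin grows by 1 MSS per ACK
    return history


MSS_BYTES = 500
RTT_SECONDS = 0.2

windows = slow_start_windows(MSS_BYTES, rounds=4)
rates_bps = [w * 8 / RTT_SECONDS for w in windows]  # bits per second in each RTT
print(windows)
print(rates_bps)
```

The first entry of `rates_bps` corresponds to the 20 kbps initial rate quoted on the slide, since 500 bytes is 4000 bits sent in 0.2 seconds.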

So far:
• Slow-Start ramps up exponentially
• followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• introduce new variable, threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• two different types of loss events:
  3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  timeout: set threshold = CongWin/2, CongWin = 1 MSS and switch to Slow-Start

Reason for treating 3 dup ACKs differently than timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                            The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
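The sender actions listed above translate almost line for line into a small event-driven class. This is a teaching sketch, not a real TCP stack: the class and method names are hypothetical, CongWin is kept in bytes, and the only detail added beyond the slide is resetting the duplicate-ACK counter when new data is acknowledged.

```python
class RenoSender:
    """Toy model of the TCP Reno congestion-control actions (window in bytes)."""

    def __init__(self, mss: int, threshold: float):
        self.mss = mss
        self.cong_win = float(mss)   # connection starts with CongWin = 1 MSS
        self.threshold = threshold
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = float(self.mss)
        self.state = "SS"
        self.dup_acks = 0


# Hypothetical run: MSS = 1000 bytes, Threshold = 4000 bytes.
sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()          # slow start: 2000, 3000, 4000, 5000 -> switches to CA
window_after_slow_start = sender.cong_win
sender.on_new_ack()              # congestion avoidance: + MSS * MSS / CongWin
window_after_ca_ack = sender.cong_win
for _ in range(3):
    sender.on_dup_ack()          # triple duplicate ACK: halve the window, stay in CA
window_after_dup_acks = sender.cong_win
sender.on_timeout()              # timeout: back to 1 MSS and slow start
```

Comparing the window before and after the triple duplicate ACK with the window after the timeout shows the difference between fast recovery and a return to slow start.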

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• ignore slow start
• let W be the window size when loss occurs
• when window is W, throughput is W/RTT
• just after loss, window drops to W/2, throughput to W/(2 RTT)
• average throughput: 0.75 W/RTT
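The 0.75 W/RTT figure is the mean of a window that climbs linearly from W/2 to W. A quick numerical check, assuming an even W measured in segments and one window value per RTT; the function name and the sample values W = 20 and RTT = 0.1 s are hypothetical.

```python
def average_sawtooth_throughput(w_segments: int, rtt: float) -> float:
    """Mean throughput (segments/sec) over one sawtooth from W/2 up to W."""
    windows = range(w_segments // 2, w_segments + 1)  # W/2, W/2 + 1, ..., W
    average_window = sum(windows) / len(windows)
    return average_window / rtt


W = 20
RTT = 0.1
average = average_sawtooth_throughput(W, RTT)
predicted = 0.75 * W / RTT
print(average, predicted)
```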

TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• L = $2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed
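Both numbers in this example can be reproduced from the stated inputs. The sketch below solves throughput = W · MSS / RTT for W, and throughput = 1.22 · MSS / (RTT · √L) for L; the variable names are hypothetical.

```python
MSS_BITS = 1500 * 8       # 1500-byte segments
RTT = 0.1                 # 100 ms
TARGET_BPS = 10e9         # 10 Gbps

# Window needed to keep the pipe full: throughput = W * MSS / RTT
window_segments = TARGET_BPS * RTT / MSS_BITS

# Loss rate that still allows the target: throughput = 1.22 * MSS / (RTT * sqrt(L))
loss_rate = (1.22 * MSS_BITS / (RTT * TARGET_BPS)) ** 2

print(int(window_segments), loss_rate)
```

The loss rate works out to roughly two lost segments in ten billion, which is the reason the slide concludes that new TCP versions are needed for high-speed paths.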

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
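The convergence argument can be checked with a toy simulation of two AIMD flows. It assumes, as a simplification not stated on the slide, that both flows see a loss in the same round whenever their combined rate exceeds the capacity; the function name, starting rates, and capacity are hypothetical.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int, step: float = 1.0):
    """Evolve two flow rates under additive increase and synchronized halving."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2        # loss: both cut by a factor of 2
        else:
            x1, x2 = x1 + step, x2 + step  # no loss: both add the same amount
    return x1, x2


# Start far from the fair share: 80 versus 10 on a link of capacity 100.
final_1, final_2 = aimd_two_flows(80.0, 10.0, capacity=100.0, rounds=200)
print(final_1, final_2)
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts the gap in half, so the rates drift toward the equal-share line.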

Fairness (more)

Fairness and UDP
• multimedia apps often do not use TCP: do not want rate throttled by congestion control
• instead use UDP: pump audio/video at constant rate, tolerate packet loss
• current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
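The shares in the parallel-connection example follow from giving every connection an equal slice of the link. A small sketch, assuming perfectly fair per-connection sharing; the function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections: int, new_connections: int) -> Fraction:
    """Fraction of the link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


one_connection = app_share(9, 1)       # 1 of 10 connections
eleven_connections = app_share(9, 11)  # 11 of 20 connections
print(one_connection, eleven_connections)
```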

TCP Latency Modeling

Notation, assumptions:
• assume one link between client and server, of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• first assume: fixed congestion window, W segments
• then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Fixed Congestion Window (W)

Two cases:

1. $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent
   Latency = $2\,RTT + O/R$

2. $WS/R < RTT + S/R$: ACK for first segment in window returns after window's worth of data sent
   Latency = $2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$, where K is the number of windows that cover the object

Fixed congestion window (1)

First case: $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent

latency = $2\,RTT + O/R$

Fixed congestion window (2)

Second case: $WS/R < RTT + S/R$: wait for ACK after sending window's worth of data

latency = $2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$
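The two fixed-window cases can be put into one function. This is a sketch with two assumptions that the slides do not spell out: K is taken to be the object size divided by the window size, rounded up, and the boundary case WS/R = RTT + S/R is treated as the no-stall case. The numbers are toy values chosen so that S/R is exactly one second.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for an object of O bits with a fixed window of W segments of S bits."""
    base = 2 * RTT + O / R                 # connection setup + request, then transmission
    if W * S / R >= RTT + S / R:
        return base                        # case 1: ACKs return in time, sender never stalls
    K = math.ceil(O / (W * S))             # assumption: number of windows covering the object
    stall = S / R + RTT - W * S / R        # idle time after each window except the last
    return base + (K - 1) * stall


# Toy parameters: S/R = 1 s, RTT = 2 s, object of 8 segments.
small_window = fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2)
large_window = fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4)
print(small_window, large_window)
```

With W = 2 the sender stalls for one second after each of the first three windows; with W = 4 the first ACK returns before the window is exhausted, so no stall term appears.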

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline at client and server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timeline at client and server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.
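The slow-start latency formula can be evaluated by computing K, Q and P directly from their definitions. This is a sketch: the function names are hypothetical, and because the slides do not give a closed form for Q, it is counted here as the number of windows k for which the idle time S/R + RTT - 2^(k-1) S/R is still positive. The toy parameters (S/R = 1 s, RTT = 2 s, O/S = 15) are chosen to match the slide's example values K = 4, Q = 2, P = 2.

```python
def windows_to_cover(O: float, S: float) -> int:
    """K: smallest k with 2^0 S + 2^1 S + ... + 2^(k-1) S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S
        k += 1
    return k


def idle_periods_infinite_object(S: float, R: float, RTT: float) -> int:
    """Q: number of windows after which the server would still have to idle."""
    q = 0
    while S / R + RTT - (2 ** q) * S / R > 0:   # idle time after window q + 1
        q += 1
    return q


def slow_start_latency(O: float, S: float, R: float, RTT: float) -> float:
    K = windows_to_cover(O, S)
    Q = idle_periods_infinite_object(S, R, RTT)
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


K = windows_to_cover(15000, 1000)
Q = idle_periods_infinite_object(1000, 1000, 2)
latency = slow_start_latency(O=15000, S=1000, R=1000, RTT=2)
print(K, Q, latency)
```

As a cross-check, the idle times after the first and second windows are 2 s and 1 s, so the total is 2·RTT + O/R + 3 = 4 + 15 + 3 seconds, the same value the closed form gives.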

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
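The three response-time expressions can be compared side by side. The sketch below leaves the slow-start idle times as a parameter that defaults to zero, so its outputs are lower bounds rather than full model results. The function names are hypothetical, and the inputs (O = 5 Kbytes taken as 40,000 bits, RTT = 100 msec, M = 10, X = 5, R = 1 Mbps) are one illustrative operating point.

```python
def non_persistent(O, R, RTT, M, idle=0.0):
    """M + 1 TCP connections in series, each costing 2 RTT."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(O, R, RTT, M, idle=0.0):
    """2 RTT for the base file, 1 RTT for all M images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(O, R, RTT, M, X, idle=0.0):
    """1 connection for the base file, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("this model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


O_BITS = 5 * 1000 * 8   # 5 Kbytes, assuming 1 Kbyte = 1000 bytes
t_non_persistent = non_persistent(O_BITS, 1e6, 0.1, 10)
t_persistent = persistent(O_BITS, 1e6, 0.1, 10)
t_parallel = parallel_non_persistent(O_BITS, 1e6, 0.1, 10, 5)
print(t_non_persistent, t_persistent, t_parallel)
```

All three share the same transmission term (M+1)O/R, so the differences come entirely from how many round trips each scheme spends on connection setup and requests.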

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis 0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.

Chapter 3: Summary

• principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control
• instantiation and implementation in the Internet
  UDP
  TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



Incremental Improvements

rdt1.0: assumes every packet sent arrives, and no errors are introduced in transmission.

rdt2.0: assumes every packet sent arrives, but some errors (bit flips) can occur within a packet. Introduces the concept of ACK and NAK.

rdt2.1: deals with corrupted ACKs/NAKs.

rdt2.2: like rdt2.1, but does not need NAKs.

rdt3.0: allows packets to be lost.

rdt1.0: reliable transfer over a reliable channel

underlying channel perfectly reliable
  no bit errors
  no loss of packets
separate FSMs for sender, receiver:
  sender sends data into underlying channel
  receiver reads data from underlying channel

sender
  state: Wait for call from above
    rdt_send(data):
      packet = make_pkt(data)
      udt_send(packet)

receiver
  state: Wait for call from below
    rdt_rcv(packet):
      extract(packet, data)
      deliver_data(data)


rdt2.0: channel with bit errors

underlying channel may flip bits in packet
  recall: UDP checksum to detect bit errors

the question: how to recover from errors?
  acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  sender retransmits pkt on receipt of NAK
  human scenarios using ACKs, NAKs?

new mechanisms in rdt2.0 (beyond rdt1.0):
  error detection
  receiver feedback: control msgs (ACK, NAK) rcvr->sender
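The error-detection step can be illustrated with a 16-bit one's-complement checksum of the kind UDP uses. This is a minimal sketch: the function names `internet_checksum` and `corrupt` and the sample payload are chosen for illustration and do not come from the slides.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, then complemented."""
    if len(data) % 2:                      # pad odd-length input with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)   # wrap any carry around
    return ~total & 0xFFFF


def corrupt(data: bytes, checksum: int) -> bool:
    """Receiver-side check: recompute the checksum and compare."""
    return internet_checksum(data) != checksum


payload = b"hello, rdt2.0"
chk = internet_checksum(payload)
# Simulate the channel flipping one bit of the first byte (illustrative only).
flipped = bytes([payload[0] ^ 0x01]) + payload[1:]
```

Here `corrupt(payload, chk)` is false and `corrupt(flipped, chk)` is true, which is the signal the receiver uses to choose between ACK and NAK.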


rdt2.0: FSM specification

sender
  state: Wait for call from above
    rdt_send(data):
      sndpkt = make_pkt(data, checksum)
      udt_send(sndpkt)
      -> Wait for ACK or NAK
  state: Wait for ACK or NAK
    rdt_rcv(rcvpkt) && isNAK(rcvpkt):
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && isACK(rcvpkt):
      Λ
      -> Wait for call from above

receiver
  state: Wait for call from below
    rdt_rcv(rcvpkt) && corrupt(rcvpkt):
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
      extract(rcvpkt, data)
      deliver_data(data)
      udt_send(ACK)
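The rdt2.0 sender and receiver loops can be sketched as a small simulation. This is an illustrative sketch, not the slides' code: the toy 8-bit `checksum`, the function `rdt20_transfer`, and the scripted set `corrupt_attempts` are assumptions, and ACKs/NAKs are assumed to arrive intact, as rdt2.0 itself assumes.

```python
def checksum(data):
    """Toy 8-bit checksum (an assumption; stands in for the UDP checksum)."""
    return sum(data) % 256


def rdt20_transfer(messages, corrupt_attempts):
    """Send messages stop-and-wait; return (delivered, total transmissions).

    corrupt_attempts holds 0-based transmission indices on which the channel
    flips one bit of the data.
    """
    delivered, attempt = [], 0
    for data in messages:
        sndpkt = (data, checksum(data))          # make_pkt(data, checksum)
        while True:
            payload, chk = sndpkt                # udt_send(sndpkt)
            if attempt in corrupt_attempts:      # channel flips one bit
                payload = bytes([payload[0] ^ 0x01]) + payload[1:]
            attempt += 1
            if checksum(payload) != chk:         # receiver: corrupt, sends NAK
                continue                         # sender: NAK, so resend sndpkt
            delivered.append(payload)            # receiver: deliver, sends ACK
            break                                # sender: ACK, so take next message
    return delivered, attempt


delivered, transmissions = rdt20_transfer([b"first", b"second"], {0, 2})
```

With the first transmission of each message corrupted, both messages are still delivered in order, at the cost of four transmissions instead of two.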


rdt2.0: operation with no errors

[Figure: the rdt2.0 FSM repeated. With no errors, the sender sends sndpkt on rdt_send(data), the receiver finds the packet not corrupt, delivers the data and sends ACK, and the sender returns to "Wait for call from above".]


rdt2.0: error scenario

[Figure: the rdt2.0 FSM repeated. When the receiver finds the packet corrupt it sends NAK, and the sender, still in "Wait for ACK or NAK", retransmits sndpkt.]


rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
  sender doesn't know what happened at receiver!
  can't just retransmit: possible duplicate. But receiver waiting!

What to do?
  sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
  retransmit, but this might cause retransmission of correctly received pkt!
  Receiver won't know about duplication!

Handling duplicates:
  sender adds sequence number (0,1) to each pkt
  sender retransmits current pkt if ACK/NAK garbled
  receiver discards (doesn't deliver up) duplicate pkt
  Duplicate packet is one with same sequence # as previous packet

stop and wait
  Sender sends one packet, then waits for receiver response


Sender: whenever the sender receives a control message, it sends a packet to the receiver.
  A valid ACK: sends next packet (if one exists) with new sequence #
  A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
  If received packet is corrupt: send NAK
  If received packet is valid and has a different sequence # than the previous packet: send ACK and deliver new data up
  If received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #
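The receiver rules above can be sketched as a small state machine. This is an illustrative sketch: the class name `Rdt21Receiver` and the `is_corrupt` flag (standing in for a checksum test) are assumptions, not part of the slides.

```python
class Rdt21Receiver:
    """Alternating-bit receiver: ACK/NAK carry no sequence number."""

    def __init__(self):
        self.expected_seq = 0      # sequence # of the next new packet
        self.delivered = []        # data passed up to the application

    def rdt_rcv(self, seq, data, is_corrupt):
        if is_corrupt:
            return "NAK"
        if seq == self.expected_seq:       # new packet: deliver and flip state
            self.delivered.append(data)
            self.expected_seq ^= 1
        # A valid packet with the old sequence # is a duplicate:
        # it is ACKed again but not delivered a second time.
        return "ACK"


rcv = Rdt21Receiver()
replies = [
    rcv.rdt_rcv(0, "a", False),   # new packet
    rcv.rdt_rcv(0, "a", False),   # duplicate (sender saw a garbled ACK)
    rcv.rdt_rcv(1, "b", True),    # corrupted in transit
    rcv.rdt_rcv(1, "b", False),   # retransmission arrives intact
]
```

The duplicate of packet 0 is acknowledged but not delivered twice, so the application sees each message exactly once.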


rdt2.1: sender, handles garbled ACK/NAKs

state: Wait for call 0 from above
  rdt_send(data):
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)):
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt):
    Λ
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data):
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)):
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt):
    Λ
    -> Wait for call 0 from above


rdt2.1: receiver, handles garbled ACK/NAKs

state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
state: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)


rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq #'s (0,1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must "remember" whether "current" pkt has 0 or 1 seq #

Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver can not know if its last ACK/NAK received OK at sender


rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s are included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt
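The NAK-free rule can be sketched as two small decision functions, one per side. The function names and the `is_corrupt` flag are hypothetical, introduced only for this illustration.

```python
def rdt22_receiver_reply(expected_seq, pkt_seq, is_corrupt):
    """Return (ack_seq, deliver) for a data packet arriving at the receiver."""
    if not is_corrupt and pkt_seq == expected_seq:
        return expected_seq, True        # ACK the new packet and deliver it
    return 1 - expected_seq, False       # re-ACK the last packet received OK


def rdt22_sender_action(current_seq, ack_seq, is_corrupt):
    """Return the sender's action while waiting for the ACK of current_seq."""
    if is_corrupt or ack_seq != current_seq:
        return "retransmit"              # a duplicate ACK plays the role of a NAK
    return "advance"


# Receiver expects seq 0 but the packet is corrupted, so it re-ACKs seq 1.
ack_seq, deliver = rdt22_receiver_reply(0, 0, True)
# The sender, waiting for ACK 0, treats ACK 1 as a duplicate and retransmits.
action = rdt22_sender_action(0, ack_seq, False)
```

The sender needs no separate NAK case: any ACK that does not name the current packet triggers the same retransmission.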


rdt2.2: sender, receiver fragments

sender FSM fragment
  state: Wait for call 0 from above
    rdt_send(data):
      sndpkt = make_pkt(0, data, checksum)
      udt_send(sndpkt)
      -> Wait for ACK 0
  state: Wait for ACK 0
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)):
      udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0):
      Λ

receiver FSM fragment
  state: Wait for 0 from below
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)):
      udt_send(sndpkt)
  transition into "Wait for 0 from below"
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
      extract(rcvpkt, data)
      deliver_data(data)
      sndpkt = make_pkt(ACK1, chksum)
      udt_send(sndpkt)


rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer
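The timeout-and-retransmit approach can be sketched with a scripted lossy channel. This is a simplified illustration: `rdt30_transfer`, `lost_data` and `lost_acks` are assumed names, losses are scripted rather than random, corruption is left out, and a timeout is modeled simply as "no ACK came back for this transmission".

```python
def rdt30_transfer(messages, lost_data=(), lost_acks=()):
    """Stop-and-wait with timeouts over a channel that drops packets.

    lost_data / lost_acks hold 0-based transmission indices whose data packet
    or returning ACK is dropped. Returns (delivered, transmissions, timeouts).
    """
    delivered, tx, timeouts = [], 0, 0
    expected_seq = 0                       # receiver state
    for i, data in enumerate(messages):
        seq = i % 2                        # alternating sequence number
        while True:
            this_tx = tx
            tx += 1                        # udt_send(sndpkt); start_timer
            if this_tx in lost_data:
                timeouts += 1              # data lost: timer expires, resend
                continue
            if seq == expected_seq:        # receiver: new packet, deliver it
                delivered.append(data)
                expected_seq ^= 1
            # Otherwise it is a duplicate: the receiver re-ACKs, no delivery.
            if this_tx in lost_acks:
                timeouts += 1              # ACK lost: timer expires, resend
                continue
            break                          # ACK(seq) received: stop_timer
    return delivered, tx, timeouts


result = rdt30_transfer(["a", "b", "c"], lost_data={1}, lost_acks={3})
```

In this run one data packet and one ACK are lost. Each loss costs one timeout and one extra transmission, and the duplicate created by the lost ACK is filtered by the sequence number.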


rdt3.0 sender

state: Wait for call 0 from above
  rdt_send(data):
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK0
  rdt_rcv(rcvpkt):
    Λ
state: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)):
    Λ
  timeout:
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0):
    stop_timer
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data):
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK1
  rdt_rcv(rcvpkt):
    Λ
state: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)):
    Λ
  timeout:
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1):
    stop_timer
    -> Wait for call 0 from above

rdt3.0 in action

[Figures: two slides of sender/receiver timing diagrams for rdt3.0; the diagrams are not reproduced in this transcript.]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

\[ T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec} \]

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \]

U_sender: utilization, the fraction of time the sender is busy sending (times in the last fraction are in msec)
1KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram]
  sender: first packet bit transmitted, t = 0
  sender: last packet bit transmitted, t = L / R
  receiver: first packet bit arrives
  receiver: last packet bit arrives, send ACK
  sender: ACK arrives, send next packet, t = RTT + L / R

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \]
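The stop-and-wait numbers can be checked with a few lines of arithmetic. The variable names are chosen for this sketch; the values are the ones in the slide's example (1 KB taken as 8000 bits, RTT taken as twice the 15 ms propagation delay).

```python
L = 8_000          # packet length in bits (1 KB packet)
R = 1e9            # transmission rate in bits/sec (1 Gbps)
RTT = 0.030        # round-trip time in seconds (2 x 15 ms)

t_transmit = L / R                           # time to push one packet onto the link
u_sender = t_transmit / (RTT + t_transmit)   # fraction of time the sender is busy
throughput = L / (RTT + t_transmit)          # bits/sec actually delivered

print(f"T_transmit = {t_transmit * 1e6:.0f} microsec")
print(f"U_sender   = {u_sender:.5f}")
print(f"throughput = {throughput / 8 / 1000:.1f} kB/sec")
```

The sender is busy for about 0.027% of the time, which corresponds to roughly 33 kB/sec on a link that could carry 125 MB/sec.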


Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver


Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight]
  sender: first packet bit transmitted, t = 0
  sender: last bit transmitted, t = L / R
  receiver: first packet bit arrives
  receiver: last packet bit arrives, send ACK
  receiver: last bit of 2nd packet arrives, send ACK
  receiver: last bit of 3rd packet arrives, send ACK
  sender: ACK arrives, send next packet, t = RTT + L / R

\[ U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008 \]

Increase utilization by a factor of 3!
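The same formula generalizes to N packets in flight. The sketch below uses exact fractions to avoid rounding surprises; `sender_utilization` and `window_to_fill_pipe` are names made up for this illustration, and the "fill the pipe" condition N * L/R >= RTT + L/R is derived from the formula above rather than stated on the slide.

```python
from fractions import Fraction
import math

L = 8_000                   # bits per packet (1 KB)
R = 10**9                   # link rate in bits/sec (1 Gbps)
RTT = Fraction(30, 1000)    # round-trip time in seconds


def sender_utilization(n_pkts):
    """Fraction of time the sender is busy with n_pkts sent per RTT + L/R."""
    t = Fraction(L, R)
    return min(Fraction(1), n_pkts * t / (RTT + t))


def window_to_fill_pipe():
    """Smallest number of in-flight packets for which the sender never idles."""
    t = Fraction(L, R)
    return math.ceil((RTT + t) / t)


u1 = sender_utilization(1)
u3 = sender_utilization(3)
n_full = window_to_fill_pipe()
```

Utilization grows linearly with the number of packets in flight until the link is saturated, which for this example takes a window of several thousand packets.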


Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)
  Only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol


GBN Sender

rdt_send() called: checks to see if window is full.
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts the timer (or stops it if no packets remain unacknowledged)

Timeout: resends ALL packets that have been sent but not yet acknowledged.
  This is the only event that triggers a resend


GBN sender: extended FSM

Initial transition (Λ):
    base = 1
    nextseqnum = 1

State: Wait

rdt_send(data):
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ (no action)
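The sender FSM above can be sketched in Python as follows. This is a minimal illustration, not a full implementation: the `udt_send` callback and the boolean `timer_running` flag are hypothetical stand-ins for the unreliable channel and a real timer, packets are plain `(seq, data)` tuples with no checksum, and ACKs outside the current window are ignored (a guard the FSM itself does not spell out).

```python
class GBNSender:
    """Go-Back-N sender; method names mirror the FSM events."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send      # assumed callback: puts one packet on the channel
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}              # seq -> packet, kept until ACKed
        self.timer_running = False    # stand-in for a real timer

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum < self.base + self.N:
            pkt = (self.nextseqnum, data)
            self.sndpkt[self.nextseqnum] = pkt
            self.udt_send(pkt)
            if self.base == self.nextseqnum:
                self.timer_running = True      # start_timer
            self.nextseqnum += 1
            return True
        return False                           # refuse_data(data)

    def timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True              # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])

    def rdt_rcv_ack(self, acknum):
        """Handle an uncorrupted cumulative ACK(acknum)."""
        if not (self.base <= acknum < self.nextseqnum):
            return                             # assumption: ignore ACKs outside the window
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        self.timer_running = self.base != self.nextseqnum
```

With a window of 3, the fourth consecutive `rdt_send` call is refused until an ACK slides the window forward.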


GBN receiver: extended FSM

Initial transition (Λ):
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

State: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

default:
    udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs
If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering
  Re-ACK pkt with highest in-order seq #
  may generate duplicate ACKs


                                              More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
Receiver only sends ACKs (no NAKs)
Can generate duplicate ACKs
Need only remember expectedseqnum
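A minimal Python sketch of the GBN receiver behaviour described above. The `udt_send` and `deliver_data` callbacks are hypothetical, ACKs are represented as bare sequence numbers, and corruption is passed in as a flag rather than detected by a checksum.

```python
class GBNReceiver:
    """Go-Back-N receiver: the only state it needs is expectedseqnum."""

    def __init__(self, udt_send, deliver_data):
        self.udt_send = udt_send            # assumed callback: sends an ACK number
        self.deliver_data = deliver_data    # assumed callback: hands data to the app
        self.expectedseqnum = 1
        self.last_ack = 0                   # plays the role of sndpkt = make_pkt(0, ACK, chksum)

    def rdt_rcv(self, seqnum, data, corrupt=False):
        if not corrupt and seqnum == self.expectedseqnum:
            self.deliver_data(data)
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # In-order or not, ACK the highest in-order packet received so far.
        # Out-of-order packets are discarded, which can produce duplicate ACKs.
        self.udt_send(self.last_ack)
```

Feeding it packets 1, 3, 2, 3 delivers the data in order and emits ACKs 1, 1, 2, 3; the second ACK 1 is the duplicate triggered by the out-of-order packet 3.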


GBN in action

GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets


                                              Selective Repeat

                                              receiver individually acknowledges all correctly received pkts

                                              buffers pkts as needed for eventual in-order delivery to upper layer

                                              sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range


Selective repeat: sender, receiver windows


Selective repeat

sender
data from above:
  if next available seq # in window, send pkt
timeout(n):
  resend pkt n, restart timer
ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)
otherwise:
  ignore
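The receiver rules above can be sketched in Python as follows. This is an illustration under simplifying assumptions: sequence numbers start at 0 and never wrap around, and `send_ack` and `deliver` are hypothetical callbacks.

```python
class SRReceiver:
    """Selective Repeat receiver: ACKs each packet individually and buffers out-of-order ones."""

    def __init__(self, window_size, send_ack, deliver):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}            # seq -> data waiting for in-order delivery
        self.send_ack = send_ack    # assumed callback
        self.deliver = deliver      # assumed callback

    def on_packet(self, n, data):
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.send_ack(n)
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase and advance the window.
            while self.rcvbase in self.buffer:
                self.deliver(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
        elif self.rcvbase - self.N <= n < self.rcvbase:
            self.send_ack(n)        # re-ACK: the earlier ACK may have been lost
        # otherwise: ignore
```

Unlike the GBN receiver, this one keeps packets 2 and 3 when packet 1 is still missing, then delivers 1, 2, 3 together once the gap is filled.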


                                              Selective repeat in action


Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
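One way to explore the question is to check when a retransmitted old packet can be mistaken for a new one. The sketch below assumes the worst case: the receiver has accepted a full window and advanced, while all of its ACKs were lost, so the sender retransmits the entire old window. The function name is hypothetical.

```python
def windows_overlap(seq_space, window):
    """True if old (retransmitted) and new (expected) seq #s collide modulo seq_space."""
    old = {n % seq_space for n in range(0, window)}             # sender may resend these
    new = {n % seq_space for n in range(window, 2 * window)}    # receiver now expects these
    return bool(old & new)


# Largest safe window for each small sequence-number space.
largest_safe = {
    k: max(w for w in range(1, k + 1) if not windows_overlap(k, w))
    for k in range(2, 9)
}
```

For the slide's example (4 sequence numbers, window size 3) the two sets overlap, which is exactly the ambiguity shown. In this model the overlap disappears once the window size is at most half the sequence-number space.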


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment. This can contain payload


                                              Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP exists only in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the host and server


TCP segment structure

Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
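As an illustration of the layout above, the fixed 20-byte part of a TCP header can be unpacked with Python's standard `struct` module. The function name and the returned dictionary keys are hypothetical; options and application data are not parsed in this sketch.

```python
import struct

# Flag bits from least significant upward (the slide draws them as U A P R S F).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(header: bytes) -> dict:
    """Decode the first 20 bytes of a TCP segment (network byte order)."""
    (src, dst, seq, ack, off_flags, window, checksum, urg) = struct.unpack(
        "!HHIIHHHH", header[:20]
    )
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "head_len": (off_flags >> 12) * 4,    # header length field counts 32-bit words
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if off_flags & (1 << bit)],
        "rcv_window": window,
        "checksum": checksum,
        "urg_ptr": urg,
    }
```

A header with no options has `head_len` equal to 20; anything larger means the Options field is present.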


TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')


TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT


                                              TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
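A short Python sketch of the update rule, using the typical α = 0.125 from the slide. The function names are hypothetical; `sample_weight` shows why the influence of an old sample decays exponentially: after `age` further updates its weight is α(1 - α)^age.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """One EWMA step: blend the old estimate with the newest sample."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def sample_weight(age, alpha=0.125):
    """Weight that a sample taken `age` updates ago still carries in EstimatedRTT."""
    return alpha * (1 - alpha) ** age
```

For example, starting from an estimate of 100 ms, one sample of 120 ms moves the estimate to 0.875 * 100 + 0.125 * 120 = 102.5 ms.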


Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

[Figure: SampleRTT and EstimatedRTT plotted over time; x-axis: time (seconds), y-axis: RTT (milliseconds), range 100 to 350]


                                              TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first, estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
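The two formulas combine into a single update step, sketched below in Python. The function name is hypothetical, and the order of the two updates is an assumption: DevRTT is computed against the previous EstimatedRTT, which the slides leave unspecified.

```python
def update_timeout(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Return the new (EstimatedRTT, DevRTT, TimeoutInterval) after one SampleRTT."""
    # Assumption: the deviation is measured against the estimate *before* this update.
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout_interval
```

With EstimatedRTT = 100 ms, DevRTT = 10 ms and a new sample of 140 ms, DevRTT becomes 0.75 * 10 + 0.25 * 40 = 17.5 ms, EstimatedRTT becomes 105 ms, and TimeoutInterval becomes 105 + 4 * 17.5 = 175 ms.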


                                              TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }

} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
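The simplified sender can be sketched in Python as an event-driven class. This is an illustration only: `send_to_ip` is a hypothetical callback, segments are `(seq, data)` tuples, and the timer is reduced to a boolean flag. Like the pseudocode, it ignores duplicate ACKs, flow control and congestion control.

```python
class SimpleTcpSender:
    """Simplified TCP sender: cumulative ACKs and a single retransmission timer."""

    def __init__(self, initial_seq_num, send_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.send_to_ip = send_to_ip    # assumed callback: hands a segment to IP
        self.unacked = []               # (seq, data) in send order
        self.timer_running = False      # stand-in for a real timer

    def on_app_data(self, data):
        segment = (self.next_seq_num, data)
        self.unacked.append(segment)
        if not self.timer_running:
            self.timer_running = True   # start timer
        self.send_to_ip(segment)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            # Retransmit only the not-yet-acknowledged segment with the smallest seq #.
            self.send_to_ip(self.unacked[0])
            self.timer_running = True   # start timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Keep only segments that still contain bytes at or beyond y.
            self.unacked = [(s, d) for s, d in self.unacked if s + len(d) > y]
            self.timer_running = bool(self.unacked)
```

Note the contrast with GBN: on a timeout this sender resends one segment, not the whole window.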


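TCP: retransmission scenarios

Lost ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Seq=92 timeout at Host A
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host B -> Host A: ACK=100
  SendBase = 100

Premature timeout scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100
  Host B -> Host A: ACK=120
  Seq=92 timeout at Host A (before the ACKs arrive)
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  SendBase = 100, then SendBase = 120 as the two ACKs arrive
  Host B -> Host A: ACK=120
  SendBase = 120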


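TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Host B -> Host A: ACK=120 (arrives before the timeout)
  SendBase = 120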


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP receiver action: Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

Event at receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP receiver action: Immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
TCP receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: Arrival of segment that partially or completely fills gap
TCP receiver action: Immediately send ACK, provided that segment starts at lower end of gap


                                              More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs then, after retransmission, the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
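A minimal Python sketch of this policy. The class and method names are hypothetical, and `computed_interval` stands for the value EstimatedRTT + 4 * DevRTT obtained as described previously.

```python
class RetransmitTimer:
    """Tracks the timeout interval under the doubling policy."""

    def __init__(self, computed_interval):
        self.computed_interval = computed_interval   # EstimatedRTT + 4 * DevRTT
        self.interval = computed_interval

    def on_timeout(self):
        # Each consecutive timeout doubles the interval (exponential growth).
        self.interval *= 2

    def on_new_data_or_ack(self, computed_interval=None):
        # Restart caused by new data from above or a received ACK: reset the interval.
        if computed_interval is not None:
            self.computed_interval = computed_interval
        self.interval = self.computed_interval
```

Starting from 0.75 s, three consecutive timeouts give intervals of 1.5 s, 3 s and 6 s; the next ACK brings the interval back to the computed value.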


                                              Fast Retransmit

                                              Time-out period often relatively long

                                              long delay before resending lost packet

                                              Detect lost segments via duplicate ACKs

Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:

fast retransmit: resend segment before timer expires


                                              Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
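The ACK handler above can be sketched in Python as follows. The class name and the `resend` callback are hypothetical, the timer handling is left out, and a single counter is kept for the current SendBase only (ACKs below SendBase are ignored, which is an assumption of this sketch).

```python
class FastRetransmitSender:
    """ACK handling with duplicate-ACK counting."""

    def __init__(self, send_base, resend):
        self.send_base = send_base
        self.dup_acks = 0
        self.resend = resend            # assumed callback: resends the segment with this seq #

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y          # new data acknowledged
            self.dup_acks = 0
        elif y == self.send_base:
            self.dup_acks += 1          # duplicate ACK for already ACKed data
            if self.dup_acks == 3:
                self.resend(y)          # fast retransmit, before the timer expires
```

The segment is resent exactly once, when the third duplicate arrives; further duplicates for the same `y` do not trigger another resend.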


TCP: GBN or Selective Repeat?

                                              Basic TCP looks a lot like GBN

                                              Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                              This looks a lot like Selective Repeat

                                              TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate the receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
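The bookkeeping on both sides can be sketched in Python. The function names are hypothetical, and so are the sender-side variables `last_byte_sent` and `last_byte_acked`, which the slides do not name; they simply measure the amount of unACKed data.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, rcv_window_value, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= rcv_window_value
```

For example, with a 4096-byte buffer, 3000 bytes received and 1000 bytes read by the application, the receiver advertises 4096 - 2000 = 2096 bytes. A sender with 1000 unACKed bytes may then send another 1000 bytes, but not another 1200.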


                                              Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from B (the receiver). At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender


                                              Note on UDP

                                              UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost


                                              TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, port number)
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept()

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


TCP Connection Management (cont)

Step 3: the client end system receives the SYNACK and replies with a segment carrying SYN=0 and ack=server_isn+1.
- allocates buffers
- can include application data
- SYN=0 signals that the connection is established
- server_isn+1 signals that the client is synchronized

[Figure: three-way handshake timeline]
client to server: connection request (SYN=1, seq=client_isn)
server to client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
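
The three segments of the handshake can be sketched as follows. This is only an illustration of the SYN, seq and ack values listed on the slide: the `Segment` class and the function name are hypothetical, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical, simplified view of the header fields used in the handshake."""
    syn: int
    seq: int
    ack: Optional[int] = None  # None means the ACK field carries nothing


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    # Step 1: client sends SYN with its initial sequence number.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]
```

For example, with client_isn = 100 and server_isn = 300, the SYNACK carries ack = 101 and the final ACK carries seq = 101 and ack = 301.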


                                              TCP Connection Management (cont)

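Closing a connection

The client closes the socket: clientSocket.close();

Step 1: the client end system sends a TCP FIN control segment to the server.

Step 2: the server receives the FIN and replies with an ACK. It then closes the connection and sends its own FIN.

[Figure: connection teardown timeline. The client sends a FIN and the server ACKs it; the server then sends a FIN and the client ACKs it. The client goes through a timed wait before it is closed.]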


                                              TCP Connection Management (cont)

Step 3: the client receives the FIN and replies with an ACK.
- enters "timed wait", during which it responds with an ACK to any received FIN (one might arrive if the ACK gets lost)
- closes down after the timed wait

Step 4: the server receives the ACK. The connection is closed.

Note: with a small modification, this procedure can handle simultaneous FINs.

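[Figure: connection teardown timeline between client and server (FIN, ACK, FIN, ACK), with both sides marked "closing", the client in "timed wait", and both sides finally "closed"]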
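
The client side of the four teardown steps can be sketched as a small state table. The state and event names below are assumptions chosen to match the figure labels (closing, timed wait, closed); they are not the state names of any particular TCP implementation.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "close"): ("CLOSING", "FIN"),        # step 1: client sends FIN
    ("CLOSING", "recv ACK"): ("CLOSING", None),          # step 2: server ACKed our FIN
    ("CLOSING", "recv FIN"): ("TIMED_WAIT", "ACK"),      # step 3: ACK the server's FIN
    ("TIMED_WAIT", "recv FIN"): ("TIMED_WAIT", "ACK"),   # our ACK was lost: ACK again
    ("TIMED_WAIT", "timer expires"): ("CLOSED", None),   # close down after timed wait
}


def run_client(events):
    """Replay a list of events; return the final state and the segments sent."""
    state = "ESTABLISHED"
    sent = []
    for event in events:
        state, segment = CLIENT_TRANSITIONS[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent
```

Replaying `["close", "recv ACK", "recv FIN", "recv FIN", "timer expires"]` models the case where the client's first ACK is lost and the server retransmits its FIN during the timed wait.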


                                              TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                              A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is also possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.



                                              Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for the network to handle"
- different from flow control
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem!


Causes/costs of congestion: scenario 1
- two senders, two receivers
- one router, infinite buffers
- no retransmission
- send rate between 0 and C/2

[Figure: throughput and delay versus send rate]
- large delays when congested
- maximum achievable throughput


Causes/costs of congestion: scenario 2
- one router, finite buffers
- sender retransmission of lost packets


Cases (a), (b) & (c): always λ_in = λ_out (goodput). Here λ'_in denotes the offered load, i.e. original data plus retransmissions.
(a) "Magic" transmission: only send when there's space in the buffer
(b) "Perfect" retransmission, only when loss: λ'_in > λ_out
(c) Retransmission of delayed (not lost) packets makes λ'_in larger (than in the perfect case) for the same λ_out

"Costs" of congestion:
- (b) and (c): more work (retransmissions) for a given "goodput"
- (c): unneeded retransmissions, so the link carries multiple copies of a packet

[Figure: three panels (a), (b), (c)]


Causes/costs of congestion: scenario 3
- four senders
- multihop paths
- timeout/retransmit

Q: what happens as λ_in and λ'_in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted!


                                              Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from the network
- congestion inferred from loss and delay observed by the end systems
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - a single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - an explicit rate the sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
- "elastic service"
- if the sender's path is "underloaded", the sender should use the available bandwidth
- if the sender's path is congested, the sender is throttled to its minimum guaranteed rate

RM (resource management) cells:
- sent by the sender, interspersed with data cells
- bits in the RM cell are set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells are returned to the sender by the receiver, with bits intact
  - small exception: see the next slide


Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in the RM cell
  - a congested switch may lower the ER value in the cell
  - the sender's send rate is thus the minimum supportable rate on the path
- EFCI bit in data cells: set to 1 by a congested switch
  - signals congestion
  - if the data cell preceding an RM cell has EFCI=1, the destination sets CI bit=1 before returning the RM cell to the source
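
A minimal sketch of the two rules above, assuming each switch knows the rate it can currently support. The function names and the list-of-rates representation are hypothetical and are not taken from the ATM specification.

```python
def forward_rm_cell(er, switch_rates):
    """Carry the ER field through the switches on the path.

    Each switch may lower ER to the rate it can support, so the value that
    reaches the destination is the minimum supportable rate on the path.
    """
    for rate in switch_rates:
        er = min(er, rate)
    return er


def destination_ci_bit(preceding_data_cell_efci):
    """The destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    return 1 if preceding_data_cell_efci == 1 else 0
```

With an initial ER of 100 and switches that support 80, 40 and 60, the sender learns a rate of 40.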



TCP Congestion Control
- end-end control (no network assistance)
- transmission rate limited by the congestion window size, Congwin, over segments
- Congwin is dynamically modified to reflect perceived congestion

[Figure: a window of Congwin segments]

With w segments, each of MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec


To simplify the presentation, we assume that RcvBuffer is large enough that it will not overflow.

The tools are "similar" to flow control. The sender limits transmission using:

LastByteSent - LastByteAcked ≤ CongWin

How does the sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- the TCP sender reduces its rate (CongWin) after a loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 MSS and then grows exponentially
- conservative behavior after timeout events


TCP AIMD

Additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing). This is also known as congestion avoidance.

Multiplicative decrease: cut CongWin in half after a loss event.

[Figure: sawtooth plot of the congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
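
The sawtooth can be reproduced with a few lines. This is a sketch under simplifying assumptions: time advances in whole RTTs, the RTTs in which a loss occurs are given as input, and the window is measured in units of MSS. The function name is hypothetical.

```python
def aimd_trace(cong_win, loss_rtts, n_rtts, mss=1):
    """Return the congestion window at the start of each RTT."""
    trace = []
    for rtt in range(n_rtts):
        trace.append(cong_win)
        if rtt in loss_rtts:
            # Multiplicative decrease: halve the window, but keep at least 1 MSS.
            cong_win = max(cong_win / 2, mss)
        else:
            # Additive increase: one more MSS per RTT.
            cong_win += mss
    return trace
```

Starting from a window of 8 MSS with a loss in the fourth RTT, the window climbs 8, 9, 10, 11, drops to 5.5 and starts climbing again.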


                                              TCP Slow Start

When a connection begins, CongWin = 1 MSS.
- Example: MSS = 500 bytes & RTT = 200 msec
- initial rate = 20 kbps

The available bandwidth may be >> MSS/RTT.
- desirable to quickly ramp up to a respectable rate

When a connection begins, increase the rate exponentially fast until the first loss event.
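
The 20 kbps figure follows from sending one MSS per RTT. A small helper, with a hypothetical name, makes the arithmetic explicit:

```python
def window_rate_bps(w_segments, mss_bytes, rtt_sec):
    """Rate in bits/sec when w_segments of mss_bytes each are sent per RTT."""
    return w_segments * mss_bytes * 8 / rtt_sec


# Slide example: one 500-byte segment every 200 msec.
initial_rate = window_rate_bps(1, 500, 0.2)  # about 20,000 bits/sec = 20 kbps
```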


                                              TCP Slow Start (more)

When a connection begins, increase the rate exponentially until the first loss event:
- double CongWin every RTT
- done by incrementing CongWin (by 1 MSS) for every ACK received

Summary: the initial rate is slow but ramps up exponentially fast.

[Figure: timeline between Host A and Host B. Host A sends one segment in the first RTT, two segments in the second and four segments in the third.]
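
The following sketch shows why a per-ACK increment doubles the window each RTT. It assumes, for illustration only, that every segment in a window is acknowledged within one RTT and that no loss occurs; the window is counted in segments.

```python
def slow_start_windows(n_rtts):
    """Return the window size (in segments) at the start of each RTT."""
    cong_win = 1  # CongWin = 1 MSS when the connection begins
    windows = []
    for _ in range(n_rtts):
        windows.append(cong_win)
        acks_this_rtt = cong_win      # one ACK per segment sent
        for _ in range(acks_this_rtt):
            cong_win += 1             # increment CongWin for every ACK received
    return windows
```

The first four RTTs give windows of 1, 2, 4 and 8 segments, matching the one, two and four segments in the figure.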


So far:
- Slow start ramps up exponentially
- followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
- Introduce a new variable, threshold
- threshold is initially very large
- Slow-start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
- Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1, and switch to slow start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


                                              Summary TCP Congestion Control

When CongWin is below Threshold, the sender is in the slow-start phase and the window grows exponentially.

When CongWin is above Threshold, the sender is in the congestion-avoidance phase and the window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                              The Big Picture


TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Results in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in an increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
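
The table translates fairly directly into an event-driven sketch. The class below is a teaching aid, not a real TCP implementation: the class and method names are hypothetical, and resetting the duplicate-ACK counter on a new ACK is an assumption the table does not spell out.

```python
class RenoSender:
    """Simplified TCP Reno congestion-control state, following the table."""

    def __init__(self, mss=1.0, threshold=float("inf")):
        self.mss = mss
        self.cong_win = mss          # slow start begins at 1 MSS
        self.threshold = threshold   # initially very large
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0            # assumption: counter restarts on a new ACK
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            # CongWin will not drop below 1 MSS.
            self.cong_win = max(self.threshold, self.mss)
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

Feeding it four new ACKs with Threshold = 4 MSS grows the window to 5 MSS and moves it to congestion avoidance; three duplicate ACKs then halve the window to 2.5 MSS, while a timeout would instead reset it to 1 MSS and return to slow start.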


                                              TCP throughput

What is the average throughput of TCP as a function of window size and RTT?
- Ignore slow start
- Let W be the window size when loss occurs
- When the window is W, throughput is W/RTT
- Just after a loss, the window drops to W/2 and throughput to W/(2 RTT)
- Average throughput: 0.75 W/RTT
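
The 0.75 factor is the mean of a window that ramps linearly from W/2 up to W. A quick numerical check, assuming an idealized linear ramp sampled at evenly spaced points (the function name and `steps` parameter are hypothetical):

```python
def average_sawtooth_throughput(W, rtt, steps=1000):
    """Average throughput while the window ramps linearly from W/2 to W."""
    windows = [W / 2 + (W / 2) * i / steps for i in range(steps + 1)]
    return sum(windows) / len(windows) / rtt
```

For W = 100 segments and RTT = 1 second the result is approximately 75 segments per second, i.e. 0.75 W/RTT.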


                                              TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires a window size of W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

- This requires L = 2·10^-10. Wow!
- New versions of TCP for high-speed networks are needed!
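
Both numbers on this slide can be checked with a short calculation. The helper names are hypothetical; the second function simply solves the throughput formula 1.22·MSS/(RTT·√L) for L.

```python
def required_window(throughput_bps, mss_bytes, rtt_sec):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_sec / (mss_bytes * 8)


def required_loss_rate(throughput_bps, mss_bytes, rtt_sec):
    """Loss rate L that yields the target throughput under the formula."""
    return (1.22 * mss_bytes * 8 / (rtt_sec * throughput_bps)) ** 2


W = required_window(10e9, 1500, 0.1)     # roughly 83,333 segments
L = required_loss_rate(10e9, 1500, 0.1)  # roughly 2e-10
```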


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
- Additive increase gives a slope of 1 as throughput increases
- Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
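
The convergence toward the equal-share line can be illustrated with a toy simulation. It assumes, as a simplification, that both sessions have the same RTT, increase by one unit per round, and both see a loss in any round where their combined rate exceeds the link capacity. The function name is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, n_rounds):
    """Evolve two AIMD flows sharing one bottleneck; return their final rates."""
    for _ in range(n_rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase keeps the gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2
```

Starting from very unequal rates such as 1 and 41 on a link of capacity 50, the gap between the two flows halves at every loss and becomes negligible after a few hundred rounds.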


Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP
  - they do not want their rate throttled by congestion control
- Instead they use UDP:
  - pump audio/video at a constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: a link of rate R supporting 9 connections
  - a new app that asks for 1 TCP connection gets rate R/10
  - a new app that asks for 11 TCP connections gets about R/2 (11 of the 20 connections)
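
The parallel-connection example is simple proportional sharing, assuming every connection on the link gets an equal share. A hypothetical helper:

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by the app opening new connections."""
    return Fraction(new_connections, existing_connections + new_connections)
```

Here `app_share(9, 1)` is 1/10 of R, while `app_share(9, 11)` is 11/20 of R, slightly more than R/2.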


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Notation, assumptions:
- Assume one link between client and server, of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume a fixed congestion window of W segments
- Then a dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent.
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent.
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object.


                                              Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2RTT + O/R


                                              Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
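
Both fixed-window cases can be folded into one function. This is a sketch under the slide's assumptions (one link, no loss); it additionally assumes K = ceil(O / (W·S)), i.e. the number of full or partial windows needed to cover the object. The function name is hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments of S bits."""
    K = math.ceil(O / (W * S))          # windows that cover the object (assumed)
    stall = S / R + RTT - W * S / R     # wait per window; positive only in case 2
    latency = 2 * RTT + O / R
    if stall > 0:
        latency += (K - 1) * stall
    return latency
```

With O = 8000 bits, S = 1000 bits, R = 1000 bps and RTT = 2 s, a window of W = 4 falls in the first case (latency 12 s), while W = 2 falls in the second case and adds three stalls of 1 s each (latency 15 s).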


                                              TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

We will show that the delay for one object is:

\[ \text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R} \]

where P is the number of times TCP idles at the server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server would idle if the object were of infinite size
- and K is the number of windows that cover the object


                                              TCP Latency Modeling Slow Start (2)

[Figure: timeline at client and server. The client initiates the TCP connection (one RTT), then requests the object. The server sends a first window = S/R, second window = 2S/R, third window = 4S/R and fourth window = 8S/R, at which point transmission is complete and the object is delivered.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

The server idles P = 2 times.

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit the object
- time the server idles due to slow start

The server idles P = min{K-1, Q} times.


                                              TCP Latency Modeling (3)

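The time from when the server starts to send a segment until the server receives the acknowledgement is

\[ \frac{S}{R} + RTT \]

The time to transmit the kth window is

\[ 2^{k-1}\frac{S}{R} \]

The idle time after the kth window is

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} \]

Summing the idle times gives the delay:

\begin{align*}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{align*}

[Figure: timeline at client and server, showing connection initiation, the object request, and windows of S/R, 2S/R, 4S/R and 8S/R until the object is delivered]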


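TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

\begin{align*}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{align*}

The calculation of Q, the number of idles for an infinite-size object, is similar.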
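
The slow-start latency formula can be evaluated numerically. In this sketch K is found by the same covering argument as above, and Q is obtained by counting the windows after which the idle time S/R + RTT - 2^(k-1)·S/R is still positive; that way of computing Q is an assumption, since the slides only say the calculation is similar. The function name is hypothetical.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for an object of O bits and segments of S bits."""
    # K: smallest k such that 2^k - 1 segments cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows followed by a positive idle time (infinite object).
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency
```

Choosing S/R = 1 s and RTT = 2 s with O/S = 15 reproduces the example values K = 4, Q = 2 and P = 2, and gives a latency of 22 s (4 s of setup and request, 15 s of transmission, and idle periods of 2 s and 1 s).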


HTTP Modeling

Assume a Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive the base HTML file
- 1 RTT to request and receive the M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X is an integer
- 1 TCP connection for the base file
- M/X sets of parallel connections for the images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
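
The three formulas can be compared side by side. The sketch below deliberately leaves out the "sum of idle times" term, so it gives only the transmission and round-trip components of each response time. The function name is hypothetical, and treating 5 Kbytes as 40,000 bits in the usage note is an assumption.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, excluding slow-start idle times."""
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }
```

For O = 40,000 bits, R = 1 Mbps, RTT = 100 msec, M = 10 and X = 5, this gives about 2.64 s for non-persistent, 0.74 s for persistent and 1.04 s for parallel non-persistent HTTP, before idle times are added.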


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time is dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Bar chart: response time in seconds (axis 0 to 70) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3 Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet:
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"


rdt1.0: reliable transfer over a reliable channel

• underlying channel perfectly reliable
  • no bit errors
  • no loss of packets
• separate FSMs for sender, receiver:
  • sender sends data into underlying channel
  • receiver reads data from underlying channel

sender FSM:
• state "Wait for call from above"
  • on rdt_send(data): packet = make_pkt(data); udt_send(packet)

receiver FSM:
• state "Wait for call from below"
  • on rdt_rcv(packet): extract(packet, data); deliver_data(data)

rdt2.0: channel with bit errors

• underlying channel may flip bits in packet
  • recall: UDP checksum to detect bit errors
• the question: how to recover from errors?
  • acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  • sender retransmits pkt on receipt of NAK
  • human scenarios using ACKs, NAKs?
• new mechanisms in rdt2.0 (beyond rdt1.0):
  • error detection
  • receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification

sender:
• state "Wait for call from above"
  • on rdt_send(data): sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK"
• state "Wait for ACK or NAK"
  • on rdt_rcv(rcvpkt) && isNAK(rcvpkt): udt_send(sndpkt); stay in "Wait for ACK or NAK"
  • on rdt_rcv(rcvpkt) && isACK(rcvpkt): Λ (no action); go to "Wait for call from above"

receiver:
• state "Wait for call from below"
  • on rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NAK)
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt, data); deliver_data(data); udt_send(ACK)
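The rdt2.0 exchange (checksum, ACK, NAK, retransmit on NAK) can be sketched as a small simulation. Everything here is illustrative: CRC32 stands in for the checksum, the channel is scripted through a hypothetical `corrupt_attempts` set, and ACK/NAK messages are assumed to arrive intact, which is exactly the assumption rdt2.0 makes.

```python
import zlib


def make_pkt(data: bytes) -> bytes:
    """Prefix the payload with a 4-byte CRC32 that plays the role of the checksum."""
    return zlib.crc32(data).to_bytes(4, "big") + data


def corrupt(pkt: bytes) -> bool:
    """True if the stored checksum does not match the payload."""
    return zlib.crc32(pkt[4:]).to_bytes(4, "big") != pkt[:4]


def flip_bit(pkt: bytes) -> bytes:
    """Channel error: flip one bit in the first payload byte."""
    return pkt[:4] + bytes([pkt[4] ^ 0x01]) + pkt[5:]


def rdt20_transfer(messages, corrupt_attempts=frozenset()):
    """Send non-empty byte strings one at a time (stop and wait).

    corrupt_attempts -- indices of transmissions the channel garbles.
    Returns (delivered payloads, list of receiver replies).
    """
    delivered, replies = [], []
    attempt = 0
    for data in messages:
        sndpkt = make_pkt(data)
        while True:
            rcvpkt = flip_bit(sndpkt) if attempt in corrupt_attempts else sndpkt
            attempt += 1
            if corrupt(rcvpkt):
                replies.append("NAK")  # receiver: udt_send(NAK); sender resends sndpkt
                continue
            delivered.append(rcvpkt[4:])  # receiver: extract, deliver_data
            replies.append("ACK")  # receiver: udt_send(ACK); sender moves on
            break
    return delivered, replies
```

Garbling the first transmission of each message produces one NAK followed by one ACK per message, and every payload is still delivered exactly once.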

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs from the specification slide, shown again for the case where no packet is corrupted]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs from the specification slide, shown again for the case where a packet is corrupted]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK is corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver is waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if sender's ACK/NAK is corrupted?
• retransmit, but this might cause retransmission of a correctly received pkt! Receiver won't know about duplication!

Handling duplicates:
• sender adds sequence number (0,1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt
• a duplicate packet is one with the same sequence number as the previous packet

stop and wait: sender sends one packet, then waits for receiver response

Sender: whenever sender receives a control message, it sends a packet to receiver
• a valid ACK: sends next packet (if one exists) with new sequence number
• a NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
• if received packet is corrupt: send NAK
• if received packet is valid and has a different sequence number from the previous packet: send ACK and deliver new data up
• if received packet is valid and has the same sequence number as the previous packet, i.e., is a retransmission (a duplicate): send ACK

Note: ACK/NAK do not contain a sequence number
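The sender and receiver rules above can be sketched as follows. The class and function names are hypothetical, and the driver assumes data packets arrive intact while the channel garbles only the replies listed in `garbled_replies`; the point is to show a duplicate being ACKed but not delivered.

```python
class Rdt21Receiver:
    """Receiver that tracks the expected 1-bit sequence number."""

    def __init__(self):
        self.expected_seq = 0
        self.delivered = []

    def receive(self, seq, data, is_corrupt=False):
        if is_corrupt:
            return "NAK"
        if seq == self.expected_seq:
            self.delivered.append(data)  # new data: deliver up
            self.expected_seq ^= 1       # now expect the other sequence number
        # Same seq as the previous packet means a duplicate: ACK it, do not deliver.
        return "ACK"


def send_all(messages, receiver, garbled_replies=frozenset()):
    """Stop-and-wait sender; returns the number of data transmissions.

    garbled_replies -- indices of ACK/NAK replies corrupted on the way back.
    """
    seq = 0
    reply_index = 0
    transmissions = 0
    for data in messages:
        while True:
            reply = receiver.receive(seq, data)
            transmissions += 1
            garbled = reply_index in garbled_replies
            reply_index += 1
            if reply == "ACK" and not garbled:
                break  # valid ACK: move to next packet
            # NAK or corrupt response: loop resends the old packet
        seq ^= 1  # next packet carries the new sequence number
    return transmissions
```

If the first ACK is garbled, the sender resends packet 0; the receiver is already expecting 1, so it ACKs again without delivering the data a second time.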

rdt2.1: sender, handles garbled ACK/NAKs

• state "Wait for call 0 from above"
  • on rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 0"
• state "Wait for ACK or NAK 0"
  • on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)): udt_send(sndpkt); stay in "Wait for ACK or NAK 0"
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): Λ (no action); go to "Wait for call 1 from above"
• state "Wait for call 1 from above"
  • on rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 1"
• state "Wait for ACK or NAK 1"
  • on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)): udt_send(sndpkt); stay in "Wait for ACK or NAK 1"
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): Λ (no action); go to "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

• state "Wait for 0 from below"
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 1 from below"
  • on rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
• state "Wait for 1 from below"
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 0 from below"
  • on rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq #'s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states
  • state must "remember" whether "current" pkt has 0 or 1 seq #

Receiver:
• must check if received packet is duplicate
  • state indicates whether 0 or 1 is expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
  • receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

sender FSM fragment:
• state "Wait for call 0 from above"
  • on rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK 0"
• state "Wait for ACK 0"
  • on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)): udt_send(sndpkt); stay in "Wait for ACK 0"
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0): Λ (no action); leave "Wait for ACK 0"

receiver FSM fragment:
• transition into "Wait for 0 from below"
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
• state "Wait for 0 from below"
  • on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)): udt_send(sndpkt); stay in "Wait for 0 from below"

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until certain data or ACK lost, then retransmits
• yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
• if pkt (or ACK) just delayed (not lost):
  • retransmission will be duplicate, but use of seq #'s already handles this
  • receiver must specify seq # of pkt being ACKed
• requires countdown timer
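The timeout-driven approach can be sketched as an event-level simulation. This is illustrative only: the names are hypothetical, losses are scripted through `lost_data` and `lost_acks`, and a "timeout" is modeled as "no ACK came back for this transmission" rather than with a real countdown timer.

```python
def rdt30_send(messages, lost_data=frozenset(), lost_acks=frozenset()):
    """Alternating-bit sender and receiver over a channel that drops packets.

    lost_data -- indices of transmissions whose data packet is lost
    lost_acks -- indices of transmissions whose ACK is lost
    Returns (delivered payloads, number of timeouts).
    """
    seq = 0           # sender: sequence number of current packet
    expected_seq = 0  # receiver: sequence number it is waiting for
    delivered = []
    timeouts = 0
    tx = 0            # running count of transmissions
    for data in messages:
        while True:
            this_tx = tx
            tx += 1
            ack = None
            if this_tx not in lost_data:
                if seq == expected_seq:
                    delivered.append(data)  # new packet: deliver up
                    expected_seq ^= 1
                # Receiver ACKs the seq # it just received (new or duplicate).
                if this_tx not in lost_acks:
                    ack = seq
            if ack == seq:
                break      # ACK for current packet: stop_timer
            timeouts += 1  # no ACK in time: retransmit, restart timer
        seq ^= 1
    return delivered, timeouts
```

Losing the first data packet and, later, one ACK causes two timeouts, yet each message is delivered exactly once because the retransmission after the lost ACK is recognized as a duplicate.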

rdt3.0 sender

• state "Wait for call 0 from above"
  • on rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK0"
  • on rdt_rcv(rcvpkt): Λ (no action)
• state "Wait for ACK0"
  • on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)): Λ (no action)
  • on timeout: udt_send(sndpkt); start_timer
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0): stop_timer; go to "Wait for call 1 from above"
• state "Wait for call 1 from above"
  • on rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK1"
  • on rdt_rcv(rcvpkt): Λ (no action)
• state "Wait for ACK1"
  • on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)): Λ (no action)
  • on timeout: udt_send(sndpkt); start_timer
  • on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1): stop_timer; go to "Wait for call 0 from above"

rdt3.0 in action

[Two slides of sender/receiver timeline figures; the figure content is not preserved in this transcript]

Performance of rdt3.0

• rdt3.0 works, but performance stinks
• example: 1 Gbps link, 15 ms end-to-end prop. delay, 1KB packet:

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \mu\text{sec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

• U_sender: utilization, the fraction of time the sender is busy sending
• 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
• network protocol limits use of physical resources!
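The utilization figure can be reproduced with a few lines of arithmetic. The function name is hypothetical; the inputs are the slide's example values (L = 8000 bits, R = 1 Gbps, RTT = 30 msec).

```python
def stop_and_wait_utilization(packet_bits: float, rate_bps: float, rtt_s: float) -> float:
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    transmit_s = packet_bits / rate_bps  # L/R
    return transmit_s / (rtt_s + transmit_s)


L_BITS = 8_000         # 1 KB packet
R_BPS = 1_000_000_000  # 1 Gbps link
RTT_S = 0.030          # 2 x 15 ms propagation delay

utilization = stop_and_wait_utilization(L_BITS, R_BPS, RTT_S)
# One 1000-byte packet is delivered per (RTT + L/R) seconds.
throughput_bytes_per_s = (L_BITS / 8) / (RTT_S + L_BITS / R_BPS)
```

The utilization comes out at roughly 0.00027 and the throughput at roughly 33 kB/sec, matching the slide.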

rdt3.0: stop-and-wait operation

[Timing diagram between sender and receiver, with RTT marked:]
• first packet bit transmitted, t = 0
• last packet bit transmitted, t = L/R
• first packet bit arrives
• last packet bit arrives, send ACK
• ACK arrives, send next packet, t = RTT + L/R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Timing diagram between sender and receiver, with RTT marked:]
• first packet bit transmitted, t = 0
• last bit transmitted, t = L/R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L/R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
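A small sketch makes the effect of the window size concrete. The function name and the `window` parameter are hypothetical; the formula is the one on the slide generalized from 3 packets to N, capped at 1 because a sender cannot be busy more than all of the time.

```python
def pipelined_utilization(packet_bits: float, rate_bps: float, rtt_s: float, window: int) -> float:
    """Sender utilization with `window` packets in flight: N*(L/R) / (RTT + L/R), at most 1."""
    transmit_s = packet_bits / rate_bps  # L/R
    return min(1.0, window * transmit_s / (rtt_s + transmit_s))


# Slide example: L = 8000 bits, R = 1 Gbps, RTT = 30 msec.
u1 = pipelined_utilization(8_000, 1_000_000_000, 0.030, window=1)
u3 = pipelined_utilization(8_000, 1_000_000_000, 0.030, window=3)
```

With these values `u3` is about 0.0008, three times `u1`; under this model the window would have to reach a few thousand packets before the link is kept fully busy.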


Go-Back-N: Sender
k-bit seq # in pkt header
"window" of up to N consecutive unACKed pkts allowed
ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
  may receive duplicate ACKs (see receiver)
Only one timer, for the oldest unacknowledged pkt
timeout(n): retransmit pkt n and all higher seq # pkts in window
Called a sliding-window protocol


GBN: Sender
rdt_send() called: checks whether the window is full
  No: send out packet
  Yes: return data to application level
Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.
Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend


GBN: sender extended FSM

Initial state: base = 1, nextseqnum = 1. Single state: Wait.

Event: rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

Event: timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

Event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
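The sender FSM above can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the class name, the `send` callback (a stand-in for udt_send), and the boolean `timer_running` flag (standing in for a real timer) are all assumptions. Unlike the FSM, this sketch also ignores stale ACKs that would move base backwards.

```python
class GBNSender:
    """Go-Back-N sender with a window of N packets and a single timer."""

    def __init__(self, window_size, send):
        self.N = window_size
        self.base = 1               # oldest unACKed sequence number
        self.nextseqnum = 1         # next sequence number to use
        self.sndpkt = {}            # seq # -> data, kept until ACKed
        self.send = send            # hypothetical stand-in for udt_send
        self.timer_running = False  # stand-in for start_timer / stop_timer

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum < self.base + self.N:
            self.sndpkt[self.nextseqnum] = data
            self.send(self.nextseqnum, data)
            if self.base == self.nextseqnum:
                self.timer_running = True
            self.nextseqnum += 1
            return True
        return False                # refuse_data(data)

    def timeout(self):
        """Go back N: resend every packet that is sent but not yet ACKed."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(seq, self.sndpkt[seq])

    def rdt_rcv_ack(self, n):
        """Cumulative ACK(n): everything up to and including n is received."""
        if n + 1 > self.base:       # assumption: stale ACKs are ignored
            for seq in range(self.base, n + 1):
                self.sndpkt.pop(seq, None)
            self.base = n + 1
        # Stop the timer if nothing is outstanding, otherwise restart it.
        self.timer_running = self.base != self.nextseqnum
```

With a window of 3, a fourth rdt_send() is refused until an ACK slides the window forward, and a timeout resends every packet from base up to nextseqnum-1.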


GBN: receiver extended FSM

Initial state: expectedseqnum = 1, sndpkt = make_pkt(0, ACK, chksum). Single state: Wait.

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

Event: default
    udt_send(sndpkt)

If expected packet received:
  send ACK and deliver packet upstairs
If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #; may generate duplicate ACKs


                                                More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
need only remember expectedseqnum
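The receiver behaviour described above fits in a few lines. The following Python sketch is illustrative only; the class name, the `corrupt` flag, and returning the ACK number instead of sending a packet are assumptions made to keep the example self-contained.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, remembers only expectedseqnum."""

    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0       # mirrors the initial make_pkt(0, ACK, chksum)
        self.delivered = []     # data passed up to the application

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the sequence number that is ACKed."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)      # deliver_data(data)
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # Default case (corrupt or out of order): discard the packet and
        # re-ACK the highest in-order sequence number received so far.
        return self.last_ack
```

If packets 1, 3, 4, 2 arrive in that order, packets 3 and 4 are discarded and each produces a duplicate ACK for 1.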


GBN in action

                                                GBN is easy to code but might have performance problems

                                                In particular if many packets are in pipeline at one time (bandwidth-delay product large) then one error can force retransmission of huge amounts of data

                                                Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets


                                                Selective Repeat

                                                receiver individually acknowledges all correctly received pkts

                                                buffers pkts as needed for eventual in-order delivery to upper layer

                                                sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet
sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range


Selective repeat: sender, receiver windows


Selective repeat

sender
data from above:
  if next available seq # in window, send pkt
timeout(n):
  resend pkt n, restart timer
ACK(n) in [sendbase, sendbase+N]:
  mark pkt n as received
  if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)
otherwise:
  ignore
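The receiver rules of Selective Repeat can be sketched in Python as below. This is a simplified illustration under stated assumptions: sequence numbers are unbounded integers (no wraparound), the class and method names are hypothetical, and the method returns the ACKed sequence number instead of sending an ACK packet.

```python
class SRReceiver:
    """Selective Repeat receiver: buffers out-of-order packets in its window."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0        # smallest sequence number not yet received
        self.buffer = {}        # seq # -> data for out-of-order packets
        self.delivered = []     # data passed up to the application, in order

    def receive(self, n, data):
        """Return the sequence number ACKed, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver any in-order run starting at rcvbase and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n            # already delivered: re-ACK so the sender can advance
        return None             # otherwise: ignore
```

A packet that arrives ahead of a gap is ACKed and buffered; once the gap is filled, the whole in-order run is delivered at once.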


                                                Selective repeat in action


Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
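One way to explore the question is to check when the receiver's next window can overlap the sequence numbers the sender may still retransmit. The Python sketch below models the worst case, in which a full window is received but every ACK is lost; the function name is an assumption. It illustrates the usual answer that the window size must be at most half the sequence-number space.

```python
def is_ambiguous(seq_space: int, window: int) -> bool:
    """True if a retransmitted old packet can be mistaken for a new one.

    Worst case: the receiver got packets 0..window-1 and advanced its window,
    but all ACKs were lost, so the sender retransmits 0..window-1.
    """
    may_be_retransmitted = {n % seq_space for n in range(0, window)}
    accepted_as_new = {n % seq_space for n in range(window, 2 * window)}
    return bool(may_be_retransmitted & accepted_as_new)


print(is_ambiguous(seq_space=4, window=3))  # the slide's example
print(is_ambiguous(seq_space=4, window=2))  # window is half the seq # space
```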


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer; the application reads data through the socket door.]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
  Server responds with second special TCP segment (again, no payload).
  Client responds with third special segment. This can contain payload.


                                                Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables and
  (iii) a socket connection to the process

TCP only exists in the two end machines
  No buffers and variables are allocated to the connection in any of the network elements between the host and server


TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Rows from top to bottom:]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flag bits U A P R S F, Receive window
  checksum, Urg data pnter
  Options (variable length)
  application data (variable length)

Annotations:
  sequence and acknowledgement numbers: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)


TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; up to implementer

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'    (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'    (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80                (host ACKs receipt of echoed 'C')
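The numbers in the telnet scenario follow from one rule: the ACK field is the sequence number of the next byte expected, i.e. the received Seq plus the number of data bytes. A small Python sketch, with hypothetical helper and variable names, reproduces the three segments:

```python
def next_expected(seq: int, data: bytes) -> int:
    """ACK value for a segment: sequence number of the next byte expected."""
    return seq + len(data)


# Host A's and Host B's starting sequence numbers, taken from the slide.
a_seq, b_seq = 42, 79

# A -> B: user types 'C'.
seg1 = {"seq": a_seq, "ack": b_seq, "data": b"C"}
# B -> A: ACKs the 'C' and echoes it back.
seg2 = {"seq": seg1["ack"], "ack": next_expected(seg1["seq"], seg1["data"]), "data": b"C"}
# A -> B: ACKs the echoed 'C', carries no data.
seg3 = {"seq": seg2["ack"], "ack": next_expected(seg2["seq"], seg2["data"]), "data": b""}

for seg in (seg1, seg2, seg3):
    print(f"Seq={seg['seq']} ACK={seg['ack']} data={seg['data']!r}")
```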


                                                TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss


                                                TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT.]


                                                TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
  large variation in EstimatedRTT -> larger safety margin
first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 * DevRTT
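The three formulas combine into a small estimator. The Python sketch below is one possible reading, not a definitive implementation: the class name is hypothetical, and the slides do not say how to initialise the estimates or in which order to apply the two updates. Here EstimatedRTT starts at the first sample, DevRTT starts at 0, and EstimatedRTT is updated before DevRTT.

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout interval (times in milliseconds)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with first sample
        self.dev_rtt = 0.0                  # assumption: no deviation seen yet

    def update(self, sample_rtt):
        """Fold one SampleRTT (from a non-retransmitted segment) into the estimates."""
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * abs(sample_rtt - self.estimated_rtt)

    @property
    def timeout_interval(self):
        """EstimatedRTT plus a safety margin of four deviations."""
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)
for sample in (120.0, 95.0, 110.0):
    est.update(sample)
    print(f"sample={sample:6.1f}  EstimatedRTT={est.estimated_rtt:7.2f}  "
          f"DevRTT={est.dev_rtt:6.2f}  TimeoutInterval={est.timeout_interval:7.2f}")
```

Because α is small, a single outlying sample moves EstimatedRTT only slightly, while DevRTT widens the safety margin when samples fluctuate.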


                                                TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unACKed segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If it acknowledges previously unACKed segments
    update what is known to be ACKed
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ACKed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
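The simplified sender can be turned into a runnable Python sketch. As before, this is an illustration under assumptions: the class name, the `send` callback (standing in for "pass segment to IP"), and the boolean `timer_running` flag are hypothetical, and duplicate ACKs, flow control, and congestion control are ignored just as in the pseudocode.

```python
class SimpleTcpSender:
    """Simplified TCP sender: cumulative ACKs and one retransmission timer."""

    def __init__(self, initial_seq_num, send):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # of first byte -> segment data
        self.timer_running = False   # stand-in for a real timer
        self.send = send             # hypothetical stand-in for passing to IP

    def data_from_app(self, data: bytes):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.send(seq, data)
        self.next_seq_num += len(data)

    def timeout(self):
        """Retransmit only the not-yet-ACKed segment with the smallest seq #."""
        if self.unacked:
            seq = min(self.unacked)
            self.send(seq, self.unacked[seq])
            self.timer_running = True

    def ack_received(self, y):
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose bytes all lie below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Restart the timer only if segments are still outstanding.
            self.timer_running = bool(self.unacked)
```

Note the contrast with Go-Back-N: a timeout resends a single segment, not the whole window.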


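TCP: retransmission scenarios

[Figure: two timing diagrams between Host A and Host B; time runs downward.]

Lost ACK scenario:
  A sends Seq=92, 8 bytes data
  B replies ACK=100, which is lost (X)
  Seq=92 timeout at A: A resends Seq=92, 8 bytes data
  B replies ACK=100
  SendBase = 100

Premature timeout:
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100, then ACK=120
  Seq=92 timeout at A fires before the ACKs arrive: A resends Seq=92, 8 bytes data
  SendBase = 100 when ACK=100 arrives, then SendBase = 120 when ACK=120 arrives
  B replies ACK=120 to the retransmission; SendBase stays at 120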


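TCP retransmission scenarios (more)

[Figure: timing diagram between Host A and Host B; time runs downward.]

Cumulative ACK scenario:
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100, which is lost (X), then ACK=120
  ACK=120 arrives before the timeout: SendBase = 120, so nothing is retransmitted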


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
   -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Arrival of in-order segment with expected seq #. One other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments.

3. Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
   -> Immediately send duplicate ACK, indicating seq # of next expected byte.

4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap.


                                                More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs then, after retransmission, the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
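The doubling policy can be sketched as a small timer object. This Python example is illustrative only; the class and method names are assumptions, and the EstimatedRTT and DevRTT values are passed in as plain numbers rather than measured.

```python
class RetransmissionTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self._computed_interval()

    def _computed_interval(self):
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        """After retransmitting, back off: double the current interval."""
        self.interval *= 2

    def on_new_data_or_ack(self):
        """Timer restarted for new data or a received ACK: reset the interval."""
        self.interval = self._computed_interval()


timer = RetransmissionTimer(estimated_rtt=100.0, dev_rtt=5.0)
for _ in range(3):
    timer.on_timeout()
    print(f"after timeout: interval = {timer.interval} ms")
timer.on_new_data_or_ack()
print(f"after ACK:     interval = {timer.interval} ms")
```

Backing off like this slows retransmissions when repeated timeouts suggest a congested path, which is why the slide calls it a limited form of congestion control.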


                                                Fast Retransmit

                                                Time-out period often relatively long

                                                long delay before resending lost packet

                                                Detect lost segments via duplicate ACKs

Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost

                                                fast retransmit resend segment before timer expires


Fast retransmit algorithm:

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
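The ACK handler with fast retransmit can be sketched in Python as follows. The class name and the `retransmit` callback are assumptions for illustration, and timer handling is omitted to keep the focus on duplicate-ACK counting.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK."""

    def __init__(self, send_base, retransmit):
        self.send_base = send_base
        self.dup_acks = Counter()      # ACK value y -> number of duplicates seen
        self.retransmit = retransmit   # hypothetical callback: resend segment y

    def ack_received(self, y):
        if y > self.send_base:
            # New data is ACKed: advance SendBase.
            self.send_base = y
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3:
                self.retransmit(y)     # fast retransmit, before the timer expires
```

Because the check is for exactly three duplicates, a fourth duplicate ACK for the same y does not trigger another retransmission.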


TCP: GBN or Selective Repeat?

                                                Basic TCP looks a lot like GBN

                                                Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                This looks a lot like Selective Repeat

                                                TCP is a hybrid


                                                TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
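The window arithmetic can be illustrated with two small Python functions. This is a sketch: the function names are hypothetical, and the sender-side variables `last_byte_sent` and `last_byte_acked` do not appear on the slide; they are assumed names for the quantities whose difference is the amount of unACKed data.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent: int, last_byte_acked: int,
                    advertised_window: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= advertised_window


# Example values chosen only for illustration.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print("advertised RcvWindow:", window)
print("may send 1000 bytes:", sender_may_send(5000, 4000, window, 1000))
print("may send 1200 bytes:", sender_may_send(5000, 4000, window, 1200))
```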


                                                Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.


                                                Note on UDP

                                                UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.



TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


                                                TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that client is synchronized

Segments exchanged (client/server diagram):
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
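The three handshake segments can be written out as a small sketch to check the seq/ack bookkeeping. The `Segment` class and `three_way_handshake` function are hypothetical names used only for illustration, and the sketch ignores 32-bit sequence-number wraparound, which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    syn: int                   # SYN flag (1 during setup, 0 afterwards)
    seq: int                   # sequence number
    ack: Optional[int] = None  # ACK field; None means "not carried"


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and final ACK segments, in order."""
    syn = Segment(syn=1, seq=client_isn)
    # Server acknowledges the SYN by replying with client_isn + 1.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Client acknowledges the SYNACK by replying with server_isn + 1.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
```

With client_isn = 100 and server_isn = 300 the SYNACK carries ack=101, and the final ACK carries seq=101, ack=301.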


                                                TCP Connection Management (cont)

Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline of connection close. Client closes and sends FIN; server ACKs, closes, and sends FIN; client ACKs, enters timed wait, then is closed.]


                                                TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: client/server close timeline with states. Client: closing, timed wait, closed. Server: closing, closed.]


                                                TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                A few special cases

                                                Have not discussed what happens if both client and server decide to close down connection at same time

                                                It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment



                                                Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem


Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  Send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always λ_in = λ_out (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'_in > λ_out
(c) retransmission of delayed (not lost) packet makes λ'_in larger (than perfect case) for same λ_out

(λ'_in denotes the offered load: original data plus retransmissions)

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions, link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as λ_in and λ'_in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted


                                                Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source



TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT  Bytes/sec
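As a quick numeric check of throughput = w * MSS / RTT, here is a minimal sketch. The function name `window_throughput` is hypothetical; the values MSS = 500 bytes and RTT = 200 msec are the ones the slow-start example in these notes uses.

```python
def window_throughput(w, mss_bytes, rtt_sec):
    """Bytes per second when w segments of mss_bytes each are sent per RTT."""
    return w * mss_bytes / rtt_sec


# One segment of 500 bytes per 200 msec round trip.
rate_bytes_per_sec = window_throughput(w=1, mss_bytes=500, rtt_sec=0.2)
rate_kbps = rate_bytes_per_sec * 8 / 1000
```

This gives about 2500 bytes/sec, i.e. about 20 kbps, for a window of a single segment.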


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked <= CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) vs. time for a long-lived TCP connection]


                                                TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


                                                TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. One segment is sent in the first RTT, then two segments, then four segments.]
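A short sketch shows why incrementing CongWin once per ACK doubles it every RTT. The function name `slow_start_windows` is hypothetical, and the sketch assumes no loss, no threshold, and one ACK per segment sent.

```python
def slow_start_windows(rtts, mss=1):
    """CongWin (in units of mss) at the start of each RTT during slow start."""
    cong_win = mss
    history = [cong_win]
    for _ in range(rtts):
        acks = cong_win // mss      # one ACK arrives per segment sent this RTT
        for _ in range(acks):
            cong_win += mss         # CongWin grows by 1 MSS for every ACK
        history.append(cong_win)
    return history


windows = slow_start_windows(rtts=3)
```

Starting from one segment, the window goes 1, 2, 4, 8 over three round trips, matching the one/two/four-segment timeline.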


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2, CongWin = Threshold, set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2, CongWin = 1 MSS, set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
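The event/action table above can be sketched as a small state machine. This is an illustrative model only, not a real TCP implementation: the class name `RenoSender` and its method names are hypothetical, windows are measured in units of one MSS, and duplicate-ACK counting is left out.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: measure CongWin in units of one MSS


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = float("inf")  # "threshold initially very large"
    state: str = "SS"                # "SS" = slow start, "CA" = congestion avoidance

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_triple_dup_ack(self):
        """Loss detected by triple duplicate ACK: multiplicative decrease."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"

    def on_timeout(self):
        """Timeout: fall back to slow start from 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"


sender = RenoSender()
for _ in range(7):
    sender.on_new_ack()          # slow start: 1 -> 8 MSS
sender.on_triple_dup_ack()       # halve to 4 MSS, congestion avoidance
```

After seven ACKs in slow start the window is 8 MSS; the triple duplicate ACK sets both Threshold and CongWin to 4 MSS and moves the sender to congestion avoidance.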


                                                TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
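The 0.75 factor follows from averaging a window that climbs linearly from W/2 back up to W. A minimal numeric check, assuming the window takes the values W/2, W/2 + 1, ..., W (one per RTT) and that W is even; the function name is hypothetical:

```python
def average_sawtooth_throughput(w, rtt):
    """Average of (window / rtt) over one sawtooth cycle from w/2 up to w."""
    windows = range(w // 2, w + 1)   # W/2, W/2 + 1, ..., W
    return sum(windows) / len(windows) / rtt


avg = average_sawtooth_throughput(w=100, rtt=1.0)
```

For W = 100 segments and RTT = 1 the average comes out at 75 segments per RTT, i.e. 0.75 W/RTT.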


                                                TCP Futures

Example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

$$L = 2 \cdot 10^{-10}$$

Wow
New versions of TCP for high-speed needed
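The two numbers in this example can be reproduced with a short sketch. It assumes the loss-rate relation throughput = 1.22 * MSS / (RTT * sqrt(L)) and simply solves it for L; the function names are hypothetical.

```python
def required_window(throughput_bps, rtt_sec, mss_bytes):
    """In-flight segments needed to sustain throughput_bps (from W * MSS / RTT)."""
    return throughput_bps * rtt_sec / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_sec, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_sec * throughput_bps)) ** 2


w = required_window(throughput_bps=10e9, rtt_sec=0.1, mss_bytes=1500)
loss = required_loss_rate(throughput_bps=10e9, rtt_sec=0.1, mss_bytes=1500)
```

This gives roughly 83,333 segments in flight and a tolerable loss rate of roughly 2 x 10^-10, in line with the figures on the slide.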


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?
Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput vs. Connection 1 throughput, both axes from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
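The convergence argument can be checked with a toy simulation. It rests on strong assumptions that are not from the slides: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units. The function name `aimd_two_flows` is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows cut their rate by a factor of 2.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both flows add the same increment.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: flow 1 at 1 unit, flow 2 at 50 units.
final1, final2 = aimd_two_flows(1.0, 50.0, capacity=100.0, rounds=1000)
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the two rates end up essentially equal.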


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets R/2
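The arithmetic behind the parallel-connection example is a simple ratio, assuming every connection on the link gets an equal share. The function name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(new_conns, existing_conns):
    """Fraction of link rate R obtained by an app opening new_conns connections."""
    return Fraction(new_conns, new_conns + existing_conns)


one_conn = app_share(new_conns=1, existing_conns=9)       # 1 of 10 connections
eleven_conns = app_share(new_conns=11, existing_conns=9)  # 11 of 20 connections
```

One connection yields R/10, while eleven connections yield 11R/20, slightly more than the R/2 quoted on the slide.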


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
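Both fixed-window cases can be folded into one function. This is a sketch under an assumption the slides do not state explicitly for the fixed-window case: K is taken to be the number of windows that cover the object, computed here as ceil(O / (W * S)). The function name is hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits, segments of S bits, link rate R bits/sec."""
    K = math.ceil(O / (W * S))           # assumption: windows covering the object
    stall = S / R + RTT - W * S / R      # server idle time after each window
    latency = 2 * RTT + O / R
    if stall > 0:                        # case 2: sender waits for the first ACK
        latency += (K - 1) * stall
    return latency


# S/R = 1 sec, RTT = 2 sec, object of 8 segments.
small_window = fixed_window_latency(O=8000, S=1000, R=1000.0, RTT=2.0, W=2)
large_window = fixed_window_latency(O=8000, S=1000, R=1000.0, RTT=2.0, W=4)
```

With W = 2 the server stalls for 1 second after each of the first three windows (15 seconds total); with W = 4 the ACK returns before the window is exhausted and the latency is just 2RTT + O/R = 12 seconds.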


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
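The latency expression, Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R with P = min{Q, K-1}, can be evaluated with a short sketch. Two details here are assumptions made for illustration: K is computed as ceil(log2(O/S + 1)), and Q is found by counting the windows k = 1, 2, ... for which the idle time S/R + RTT - 2^(k-1) S/R is still positive. The function name is hypothetical.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start latency model."""
    K = math.ceil(math.log2(O / S + 1))   # windows that cover the object
    Q = 0
    # Count windows whose following idle time would be positive
    # (Q indexes window k = Q + 1, so 2 ** Q equals 2^(k-1)).
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# O/S = 15 segments, S/R = 1 sec, RTT = 2 sec.
K, Q, P, latency = slow_start_latency(O=15000, S=1000, R=1000.0, RTT=2.0)
```

These inputs give K = 4, Q = 2 and P = 2; the server idles twice, and the latency is 4 + 15 + 6 - 3 = 22 seconds.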


TCP Latency Modeling: Slow Start (2)

Delay components:
  2 RTT for connection estab and request
  O/R to transmit object
  time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
  O/S = 15 segments
  K = 4 windows
  Q = 2
  P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline (time at client, time at server) with RTT marked: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]


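TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timeline (time at client, time at server) with RTT marked: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]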

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\begin{align*}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{align*}

Calculation of Q, number of idles for infinite-size object, is similar.
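The slow-start latency model can be checked numerically. The sketch below is illustrative only: the function names are made up for this example. It assumes that the server idles after each of the first K-1 windows for [S/R + RTT - 2^(k-1) S/R]^+ seconds, which is equivalent to summing the unclamped idle times over the first P windows. K is found by an integer loop rather than a floating-point logarithm, to avoid rounding at exact powers of two.

```python
def windows_to_cover(O, S):
    """Smallest K such that (2^K - 1) * S >= O, i.e. ceil(log2(O/S + 1))."""
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    return K


def slow_start_latency(O, S, R, RTT):
    """Delay = 2*RTT + O/R + sum of idle times after windows 1..K-1.

    O: object size (bits), S: segment size (bits),
    R: link rate (bits/sec), RTT: round-trip time (sec).
    """
    K = windows_to_cover(O, S)
    idle = sum(
        max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)  # [.]^+ clamps at zero
        for k in range(1, K)
    )
    return 2 * RTT + O / R + idle


if __name__ == "__main__":
    # 15 segments of 1000 bits over a 1 Mbps link with RTT = 100 msec
    print(windows_to_cover(15000, 1000))
    print(slow_start_latency(15000, 1000, 1e6, 0.1))
```

With O/S = 15 the object needs windows of 1, 2, 4 and 8 segments, so K = 4 and the server idles after the first three windows.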

HTTP Modeling

Assume Web page consists of:
  • 1 base HTML page (of size O bits)
  • M images (each of size O bits)

Non-persistent HTTP:
  • M+1 TCP connections in series
  • Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  • 2 RTT to request and receive base HTML file
  • 1 RTT to request and receive M images
  • Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  • Suppose M/X integer
  • 1 TCP connection for base file
  • M/X sets of parallel connections for images
  • Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
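The three response-time formulas can be compared side by side. The following is a minimal sketch: the function name and the `idle` parameter are assumptions made for this example. The sum of idle times is passed in as a single number (default 0), a simplification because the true idle times depend on slow start and differ between the three schemes.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times (sec) for the three HTTP models.

    O: size of each object (bits), R: link rate (bits/sec),
    RTT: round-trip time (sec), M: number of images,
    X: number of parallel connections (M/X must be an integer).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }


if __name__ == "__main__":
    # O = 5 Kbytes = 40000 bits, RTT = 100 msec, M = 10, X = 5, R = 1 Mbps
    for name, t in http_response_times(40000, 1e6, 0.1, 10, 5).items():
        print(f"{name}: {t:.2f} s")
```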

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3: Summary

principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control

instantiation and implementation in the Internet:
  • UDP
  • TCP

Next:
  • leaving the network "edge" (application, transport layers)
  • into the network "core"


rdt2.0: channel with bit errors

underlying channel may flip bits in packet
  • recall: UDP checksum to detect bit errors

the question: how to recover from errors?
  • acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  • negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  • sender retransmits pkt on receipt of NAK
  • human scenarios using ACKs, NAKs?

new mechanisms in rdt2.0 (beyond rdt1.0):
  • error detection
  • receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification

sender

  State "Wait for call from above":
    rdt_send(data)
      -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
      -> go to "Wait for ACK or NAK"

  State "Wait for ACK or NAK":
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      -> udt_send(sndpkt)
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      -> Λ (no action); go to "Wait for call from above"

receiver

  State "Wait for call from below":
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      -> udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      -> extract(rcvpkt, data); deliver_data(data); udt_send(ACK)
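The rdt2.0 sender and receiver can be exercised with a small simulation. This is a sketch under stated assumptions, not the course's reference code. The checksum (byte sum modulo 256), the helper `flip_first_bit` and the `corrupt_attempts` parameter that scripts which transmissions the channel garbles are all invented for the example. As in rdt2.0 itself, ACKs and NAKs are assumed to arrive uncorrupted.

```python
def checksum(data: bytes) -> int:
    """Toy checksum (assumption): sum of the bytes modulo 256."""
    return sum(data) % 256


def make_pkt(data: bytes):
    return (data, checksum(data))


def corrupt(pkt) -> bool:
    data, chk = pkt
    return checksum(data) != chk


def flip_first_bit(pkt):
    """Channel error: flip the lowest bit of the first data byte."""
    data, chk = pkt
    return (bytes([data[0] ^ 0x01]) + data[1:], chk)


def rdt20_transfer(messages, corrupt_attempts=()):
    """Send each non-empty message with stop-and-wait ACK/NAK feedback.

    corrupt_attempts: 0-based indices of data transmissions that the
    channel garbles. Returns (delivered messages, transmissions used).
    """
    delivered = []
    attempt = 0
    for data in messages:
        sndpkt = make_pkt(data)            # sender: rdt_send(data)
        while True:
            rcvpkt = flip_first_bit(sndpkt) if attempt in corrupt_attempts else sndpkt
            attempt += 1
            if corrupt(rcvpkt):            # receiver: udt_send(NAK)
                feedback = "NAK"
            else:                          # receiver: deliver, udt_send(ACK)
                delivered.append(rcvpkt[0])
                feedback = "ACK"
            if feedback == "ACK":          # sender: back to "Wait for call from above"
                break
            # on NAK the loop repeats, i.e. the sender resends sndpkt
    return delivered, attempt


if __name__ == "__main__":
    print(rdt20_transfer([b"hello", b"world"], corrupt_attempts={0, 2}))
```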

rdt2.0: operation with no errors

[Figure: the rdt2.0 FSM from the specification slide, following the error-free path. Sender: rdt_send(data) -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt). Receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt, data); deliver_data(data); udt_send(ACK). Sender: rdt_rcv(rcvpkt) && isACK(rcvpkt) -> Λ.]

rdt2.0: error scenario

[Figure: the same FSM, following the error path. Receiver: rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> udt_send(NAK). Sender: rdt_rcv(rcvpkt) && isNAK(rcvpkt) -> udt_send(sndpkt).]

rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
  • sender doesn't know what happened at receiver!
  • can't just retransmit: possible duplicate. But receiver waiting!

What to do?
  • sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK corrupted?
  • retransmit, but this might cause retransmission of correctly received pkt!
  • Receiver won't know about duplication!

Handling duplicates:
  • sender adds sequence number (0,1) to each pkt
  • sender retransmits current pkt if ACK/NAK garbled
  • receiver discards (doesn't deliver up) duplicate pkt
  • Duplicate packet is one with same sequence # as previous packet

stop and wait:
  • Sender sends one packet, then waits for receiver response

Sender: whenever sender receives a control message, it sends a packet to receiver
  • A valid ACK: sends next packet (if one exists) with new sequence #
  • A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
  • If received packet is corrupt: send NAK
  • If received packet is valid and has a different sequence # from prev packet: send ACK and deliver new data up
  • If received packet is valid and has the same sequence # as prev packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence #
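The sender and receiver rules above can be traced with a short round-based simulation. It is only a sketch: the function name and the `garbled_data` / `garbled_feedback` parameters (sets of 0-based round indices in which the channel corrupts the data packet or the ACK/NAK) are assumptions made for illustration. Sender and receiver are folded into one loop for brevity.

```python
def rdt21_transfer(messages, garbled_data=(), garbled_feedback=()):
    """Alternating-bit transfer where data and ACK/NAK may both be garbled.

    Returns (delivered messages, number of duplicates the receiver discarded).
    """
    delivered = []
    duplicates = 0
    expected = 0   # receiver state: sequence # it is waiting for
    seq = 0        # sender state: sequence # of the current packet
    rnd = 0        # one round = one data transmission plus its feedback
    for data in messages:
        while True:
            # --- receiver ---
            if rnd in garbled_data:
                feedback = "NAK"           # corrupt packet: send NAK
            elif seq == expected:
                delivered.append(data)     # new data: deliver up, send ACK
                expected = 1 - expected
                feedback = "ACK"
            else:
                duplicates += 1            # duplicate: discard, but still ACK
                feedback = "ACK"
            # --- channel may garble the feedback on its way back ---
            if rnd in garbled_feedback:
                feedback = "garbled"
            rnd += 1
            # --- sender ---
            if feedback == "ACK":
                seq = 1 - seq              # valid ACK: move to next packet
                break
            # NAK or garbled feedback: loop again, resending the same packet
    return delivered, duplicates


if __name__ == "__main__":
    print(rdt21_transfer([b"a", b"b", b"c"], garbled_data={1}, garbled_feedback={0, 3}))
```

In the example run, the ACKs for the first and second messages are garbled once each, so the sender retransmits packets the receiver already has. The sequence bit lets the receiver recognise and drop both duplicates.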

rdt2.1: sender, handles garbled ACK/NAKs

  State "Wait for call 0 from above":
    rdt_send(data)
      -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
      -> go to "Wait for ACK or NAK 0"

  State "Wait for ACK or NAK 0":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
      -> udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
      -> Λ; go to "Wait for call 1 from above"

  State "Wait for call 1 from above":
    rdt_send(data)
      -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt)
      -> go to "Wait for ACK or NAK 1"

  State "Wait for ACK or NAK 1":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
      -> udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
      -> Λ; go to "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

  State "Wait for 0 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
      -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
      -> go to "Wait for 1 from below"
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

  State "Wait for 1 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
      -> go to "Wait for 0 from below"
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
      -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

rdt2.1: discussion

Sender:
  • seq # added to pkt
  • two seq. #'s (0,1) will suffice. Why?
  • must check if received ACK/NAK corrupted
  • twice as many states
      state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
  • must check if received packet is duplicate
      state indicates whether 0 or 1 is expected pkt seq #
  • note: receiver can not know if its last ACK/NAK received OK at sender

rdt2.2: a NAK-free protocol

  • same functionality as rdt2.1, using ACKs only
  • instead of NAK, receiver sends ACK for last pkt received OK
      receiver must explicitly include seq # of pkt being ACKed
      (in rdt2.1, seq #s are included in data packets but not in ACKs/NAKs)
  • duplicate ACK at sender results in same action as NAK: retransmit current pkt
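The NAK-free rule can be written as two small decision functions. This is a hedged sketch. The function names and return shapes are invented for this example. With only two sequence numbers, "last pkt received OK" is taken to be the complement of the expected sequence #, which before any packet has arrived is a convention rather than a real packet.

```python
def rdt22_receive(expected_seq, pkt_seq, pkt_corrupt):
    """Receiver decision: returns (ack_seq, deliver, next_expected_seq)."""
    if not pkt_corrupt and pkt_seq == expected_seq:
        # New, valid packet: deliver it and ACK its own sequence #.
        return expected_seq, True, 1 - expected_seq
    # Corrupt or duplicate packet: re-ACK the last packet received OK.
    last_ok = 1 - expected_seq
    return last_ok, False, expected_seq


def rdt22_sender_action(current_seq, ack_seq, ack_corrupt):
    """Sender decision on feedback for the packet numbered current_seq."""
    if ack_corrupt or ack_seq != current_seq:
        return "retransmit"   # a duplicate ACK plays the role of a NAK
    return "advance"


if __name__ == "__main__":
    print(rdt22_receive(0, 0, False))       # valid packet 0 while expecting 0
    print(rdt22_receive(0, 0, True))        # corrupt packet while expecting 0
    print(rdt22_sender_action(0, 1, False)) # ACK for the other sequence #
```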

rdt2.2: sender, receiver fragments

sender FSM fragment

  State "Wait for call 0 from above":
    rdt_send(data)
      -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
      -> go to "Wait for ACK 0"

  State "Wait for ACK 0":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 1) )
      -> udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
      -> Λ; leave "Wait for ACK 0"

receiver FSM fragment

  Transition entering "Wait for 0 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      -> extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

  State "Wait for 0 from below":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
      -> udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  • checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  • sender waits until certain data or ACK lost, then retransmits
  • yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  • retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
  • if pkt (or ACK) just delayed (not lost):
      retransmission will be duplicate, but use of seq. #'s already handles this
      receiver must specify seq # of pkt being ACKed
  • requires countdown timer
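The timeout-driven approach can be sketched as a round-based simulation. Everything here is illustrative. The function name and the `lost_data` / `lost_acks` parameters (sets of 0-based transmission indices that the channel drops) are assumptions. A "timeout" is modelled simply as a transmission for which no ACK comes back, rather than with a real clock. Bit errors are left out to keep the focus on loss.

```python
def rdt30_transfer(messages, lost_data=(), lost_acks=()):
    """Stop-and-wait with sequence bits, ACK numbers and timeouts.

    Returns (delivered messages, total transmissions, timeouts).
    """
    delivered = []
    expected = 0   # receiver: sequence # it is waiting for
    seq = 0        # sender: sequence # of the current packet
    tx = 0         # total data transmissions so far
    timeouts = 0
    for data in messages:
        while True:
            this_tx = tx
            tx += 1                        # udt_send(sndpkt); start_timer
            ack = None
            if this_tx not in lost_data:   # packet reached the receiver
                if seq == expected:
                    delivered.append(data)
                    expected = 1 - expected
                # Receiver ACKs the packet it just got; for a duplicate this
                # is the same number as the last packet received OK.
                ack = seq
                if this_tx in lost_acks:
                    ack = None             # ACK dropped on the way back
            if ack == seq:                 # stop_timer, move to next packet
                seq = 1 - seq
                break
            timeouts += 1                  # no ACK in time: retransmit
    return delivered, tx, timeouts


if __name__ == "__main__":
    print(rdt30_transfer([b"x", b"y"], lost_data={0}, lost_acks={2}))
```

In the example, the first transmission of x is lost and the first ACK for y is lost. Each costs one timeout, and the retransmitted y is recognised as a duplicate by its sequence bit.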

rdt3.0 sender

  State "Wait for call 0 from above":
    rdt_send(data)
      -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer
      -> go to "Wait for ACK0"
    rdt_rcv(rcvpkt)
      -> Λ

  State "Wait for ACK0":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 1) )
      -> Λ
    timeout
      -> udt_send(sndpkt); start_timer
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
      -> stop_timer; go to "Wait for call 1 from above"

  State "Wait for call 1 from above":
    rdt_send(data)
      -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer
      -> go to "Wait for ACK1"
    rdt_rcv(rcvpkt)
      -> Λ

  State "Wait for ACK1":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt, 0) )
      -> Λ
    timeout
      -> udt_send(sndpkt); start_timer
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
      -> stop_timer; go to "Wait for call 0 from above"

rdt3.0 in action

[Figures: two slides of rdt3.0 operation; contents not recoverable from the transcript.]

                                                  3 Transport Layer 40Comp 361 Spring 2005

                                                  Performance of rdt30

                                                  rdt30 works but performance stinksexample 1 Gbps link 15 ms e-e prop delay 1KB packet

                                                  L (packet length in bits)R (transmission rate bps)

                                                  8kbpkt109 bsec

                                                  Ttransmit = = = 8 microsec

                                                  U sender =

                                                  00830008

                                                  = 000027 L R RTT + L R

                                                  =

                                                  U sender utilization ndash fraction of time sender busy sending1KB pkt every 30 msec -gt 33kBsec thruput over 1 Gbps linknetwork protocol limits use of physical resources

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline. First packet bit transmitted, t = 0; last packet bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R.]

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• Selective Repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight. First packet bit transmitted, t = 0; last bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R.]

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
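The factor-of-3 result generalizes to a window of N packets. The sketch below uses hypothetical helper names and exact fractions (to avoid floating-point rounding) to compute utilization for any window size, and the smallest window that keeps the link from this example fully busy.

```python
import math
from fractions import Fraction

def pipelined_utilization(window, packet_bits, rate_bps, rtt_s):
    """U_sender = window * (L/R) / (RTT + L/R), capped at 1 (link fully busy)."""
    t_transmit = Fraction(packet_bits, rate_bps)
    return min(Fraction(1), window * t_transmit / (rtt_s + t_transmit))

def window_to_fill_pipe(packet_bits, rate_bps, rtt_s):
    """Smallest window that keeps the sender busy for a whole RTT + L/R cycle."""
    t_transmit = Fraction(packet_bits, rate_bps)
    return math.ceil((rtt_s + t_transmit) / t_transmit)

# Numbers from the slides: 8000-bit packets, 1 Gbps link, 30 ms RTT.
L, R, RTT = 8_000, 10**9, Fraction(30, 1000)

full_window = window_to_fill_pipe(L, R, RTT)
for n in (1, 3, full_window):
    print(n, float(pipelined_utilization(n, L, R, RTT)))
```

With these numbers, thousands of packets must be in flight to fill the pipe, which is why a single loss can be expensive under Go-Back-N.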

Go-Back-N: Sender

• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
  • may receive duplicate ACKs (see receiver)
• Only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• Called a sliding-window protocol

GBN: Sender

• rdt_send() called: checks to see if window is full
  • No: send out packet
  • Yes: return data to application level
• Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer (or stops it if no packets remain unacknowledged)
• Timeout: resends ALL packets that have been sent but not yet acknowledged
  • This is the only event that triggers a resend

GBN: sender extended FSM

[Figure: extended FSM with a single "Wait" state. Its transitions are listed below as event, then actions.]

initially:
    base = 1
    nextseqnum = 1

rdt_send(data):
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ
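The sender FSM can be sketched as a small Python class. This is an illustration under simplifying assumptions, not the course's reference code: checksums are omitted, the timer is reduced to a boolean flag, a plain list stands in for udt_send(), and `max()` guards against an old ACK moving base backwards (the FSM itself assigns base unconditionally).

```python
class GBNSender:
    """Go-Back-N sender; `channel` is a list standing in for udt_send()."""

    def __init__(self, window_size, channel):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}             # seq # -> data, kept until cumulatively ACKed
        self.timer_running = False
        self.channel = channel

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False             # window full: refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.channel.append((self.nextseqnum, data))
        if self.base == self.nextseqnum:
            self.timer_running = True    # start_timer for the oldest unACKed pkt
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is received.
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)
        # stop_timer if nothing is outstanding, otherwise (re)start it.
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):   # resend the whole window
            self.channel.append((seq, self.sndpkt[seq]))


wire = []
s = GBNSender(window_size=3, channel=wire)
accepted = [s.rdt_send(d) for d in "abcd"]   # the fourth call finds the window full
s.on_ack(1)                                  # pkt 1 acknowledged, window slides
s.on_timeout()                               # pkts 2 and 3 are sent again
```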

GBN: receiver extended FSM

[Figure: extended FSM with a single "Wait" state. Its transitions are listed below as event, then actions.]

initially:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

default:
    udt_send(sndpkt)

If expected packet received:
• Send ACK and deliver packet upstairs

If out-of-order packet received:
• discard (don't buffer) -> no receiver buffering!
• Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

• The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
• Receiver only sends ACKs (no NAKs)
• Can generate duplicate ACKs
• Need only remember expectedseqnum
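A minimal Python sketch of the receiver shows how little state it needs. The class name and the `corrupt` flag (standing in for a checksum test) are assumptions made for illustration.

```python
class GBNReceiver:
    """Go-Back-N receiver: the only state is expectedseqnum."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number to send back."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)      # deliver_data(data)
            self.expectedseqnum += 1
        # Otherwise the packet is discarded. Either way, ACK the highest
        # in-order seq # received so far (0 before anything has arrived).
        return self.expectedseqnum - 1


r = GBNReceiver()
arrivals = [(1, "a"), (3, "c"), (4, "d"), (2, "b")]   # pkt 2 is delayed
acks = [r.rdt_rcv(seq, data) for seq, data in arrivals]
```

Packets 3 and 4 arrive out of order, so they are dropped and each triggers a duplicate ACK for packet 1; the sender will have to send them again.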

GBN in action

• GBN is easy to code but might have performance problems
• In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data
• The Selective Repeat protocol allows the receiver to buffer data and forces retransmission only of the required packets

Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
  • compare to GBN, which only had a timer for the base packet
• sender window
  • N consecutive seq #'s
  • again limits seq #s of sent, unACKed pkts
  • Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: sender and receiver views of the sequence-number space]

Selective repeat

sender

data from above:
• if next available seq # in window, send pkt

timeout(n):
• resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
• mark pkt n as received
• if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
• ACK(n) (note: this is a re-ACK)

otherwise:
• ignore
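The three receiver cases can be sketched in Python as follows. To keep the example short it assumes an unbounded sequence-number space (no wraparound) and no corruption; the class and method names are illustrative only.

```python
class SRReceiver:
    """Selective Repeat receiver over an unbounded seq # space (an assumption)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}        # out-of-order pkts waiting for the gap to fill
        self.delivered = []

    def on_pkt(self, n, data):
        """Return the ACK number to send, or None if the pkt is ignored."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver every buffered pkt that is now in order, sliding the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n            # already delivered: re-ACK so the sender can advance
        return None             # outside both ranges: ignore


r = SRReceiver(window_size=4)
acks = [r.on_pkt(n, f"p{n}") for n in (0, 2, 3, 1, 0, 9)]
```

Unlike GBN, packets 2 and 3 are kept while packet 1 is missing, and all three are delivered as soon as it arrives.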

Selective repeat in action

[Figure: selective repeat sender/receiver timeline]

Selective repeat: dilemma

[Figure: two sender/receiver scenarios, (a) and (b)]

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3
• receiver sees no difference in the two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
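One way to explore the question is to test, for a given sequence-number space and window, whether a retransmitted old packet can be mistaken for a new one. The sketch below models the worst case (a full window is received but every ACK is lost); the function name is made up for illustration. It agrees with the standard result for selective repeat: the window must be at most half the sequence-number space.

```python
def receiver_is_ambiguous(seq_space, window):
    """True if a retransmitted old pkt can fall inside the receiver's new window.

    Worst case: the sender sends pkts 0..window-1, the receiver gets them all
    and slides its window forward, but every ACK is lost, so the sender
    retransmits the old pkts using the same (wrapped) sequence numbers.
    """
    old = {n % seq_space for n in range(0, window)}
    new = {n % seq_space for n in range(window, 2 * window)}
    return bool(old & new)


slide_example = receiver_is_ambiguous(seq_space=4, window=3)
smaller_window = receiver_is_ambiguous(seq_space=4, window=2)
largest_safe = max(w for w in range(1, 9) if not receiver_is_ambiguous(8, w))
print(slide_example, smaller_window, largest_safe)
```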


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point:
  • one sender, one receiver
• reliable, in-order byte stream:
  • no "message boundaries"
• pipelined:
  • TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented:
  • handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled:
  • sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
• Depends upon implementation (can often be set)
• The max amount of application-layer data in a segment
• Application data + TCP header = TCP segment

Three-way handshake
• Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
• Server responds with second special TCP segment (again, no payload)
• Client responds with third special segment
  • This can contain payload

Even More TCP Details

• A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process
• TCP only exists in the two end machines
  • No buffers or variables are allocated to the connection in any of the network elements between the client and server

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, rows from top to bottom]

source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | receive window
checksum | urg data pointer
options (variable length)
application data (variable length)

Annotations:
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
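To make the layout concrete, the fixed 20-byte part of the header can be packed with Python's `struct` module. This is a sketch only: the function names are hypothetical, options are not handled, and the checksum is left at 0 rather than computed.

```python
import struct

# Flag bits, in the low 6 bits of the 16-bit "head len / not used / flags" field.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

HEADER_FORMAT = "!HHIIHHHH"   # network byte order; 2+2+4+4+2+2+2+2 = 20 bytes

def pack_tcp_header(src_port, dst_port, seq, ack, flags, rcv_window,
                    checksum=0, urg_ptr=0):
    head_len_words = 5        # header length in 32-bit words (no options)
    offset_and_flags = (head_len_words << 12) | flags
    return struct.pack(HEADER_FORMAT, src_port, dst_port, seq, ack,
                       offset_and_flags, rcv_window, checksum, urg_ptr)

def unpack_tcp_header(header):
    src, dst, seq, ack, off_flags, win, csum, urg = struct.unpack(
        HEADER_FORMAT, header[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "head_len_bytes": (off_flags >> 12) * 4,
        "flags": off_flags & 0x3F,
        "rcv_window": win, "checksum": csum, "urg_ptr": urg,
    }


hdr = pack_tcp_header(1025, 23, seq=42, ack=79, flags=ACK | PSH, rcv_window=4096)
fields = unpack_tcp_header(hdr)
```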

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data

ACKs:
• seq # of next byte expected from other side
• cumulative ACK

Q: how does the receiver handle out-of-order segments?
• A: TCP spec doesn't say; up to implementer

[Figure: simple telnet scenario between Host A and Host B, time running downward]
• User types 'C': Host A sends Seq=42, ACK=79, data = 'C'
• Host B ACKs receipt of 'C', echoes back 'C': Seq=79, ACK=43, data = 'C'
• Host A ACKs receipt of echoed 'C': Seq=43, ACK=80
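The numbers in the telnet scenario follow directly from the two rules above. A small sketch, using plain dictionaries and hypothetical helper names, reproduces them:

```python
def make_segment(seq, ack, data=b""):
    return {"seq": seq, "ack": ack, "data": data}

def reply(received, my_next_seq, data=b""):
    """Build a reply whose ACK is the seq # of the next byte expected."""
    next_expected = received["seq"] + len(received["data"])
    return make_segment(my_next_seq, next_expected, data)


# Host A: the user types 'C'.
a_to_b = make_segment(seq=42, ack=79, data=b"C")
# Host B: ACKs the 'C' and echoes it back, starting at the byte A asked for.
b_to_a = reply(a_to_b, my_next_seq=a_to_b["ack"], data=b"C")
# Host A: ACKs the echo; it carries no data, so A's seq # only moved past 'C'.
a_ack = reply(b_to_a, my_next_seq=a_to_b["seq"] + len(a_to_b["data"]))
```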

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
• longer than RTT
  • but RTT varies
• too short: premature timeout
  • unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions
• SampleRTT will vary; want estimated RTT "smoother"
  • average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

$$\text{EstimatedRTT} = (1-\alpha)\cdot\text{EstimatedRTT} + \alpha\cdot\text{SampleRTT}$$

• Exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125
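The update rule is a one-liner, and unrolling it shows why older samples fade: a sample that is k updates old carries weight α(1-α)^k. The sketch below uses made-up sample values (in milliseconds) and illustrative function names.

```python
ALPHA = 0.125

def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=ALPHA):
    """One EWMA step: EstimatedRTT = (1 - alpha)*EstimatedRTT + alpha*SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt

def sample_weight(age, alpha=ALPHA):
    """Weight in the current estimate of a sample that is `age` updates old."""
    return alpha * (1 - alpha) ** age


estimated = 100.0                        # hypothetical starting estimate, in ms
for sample in (120.0, 80.0, 200.0):      # hypothetical SampleRTT values, in ms
    estimated = update_estimated_rtt(estimated, sample)

weights = [sample_weight(age) for age in range(4)]
print(estimated, weights)
```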

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT.]

TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
  • large variation in EstimatedRTT -> larger safety margin
• first estimate how much SampleRTT deviates from EstimatedRTT:

$$\text{DevRTT} = (1-\beta)\cdot\text{DevRTT} + \beta\cdot|\text{SampleRTT} - \text{EstimatedRTT}|$$

(typically, β = 0.25)

Then set timeout interval:

$$\text{TimeoutInterval} = \text{EstimatedRTT} + 4\cdot\text{DevRTT}$$
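Putting the two estimators together gives a small timeout calculator. Two details below are assumptions the slide does not state; they follow RFC 6298: the first sample initializes EstimatedRTT to the sample and DevRTT to half of it, and DevRTT is updated before EstimatedRTT so that the deviation is measured against the old estimate. The class name and sample values are illustrative.

```python
class RttEstimator:
    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha, self.beta = alpha, beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2      # assumption: RFC 6298 initialization

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption (RFC 6298 order): deviation uses the previous EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)       # hypothetical values, in ms
initial_timeout = est.timeout_interval()
next_timeout = est.update(120.0)
```

A run of steady samples shrinks DevRTT and pulls the timeout down toward EstimatedRTT; a noisy path keeps the safety margin wide.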


TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  • timeout events
  • duplicate acks
• Initially consider simplified TCP sender:
  • ignore duplicate acks
  • ignore flow control, congestion control

TCP sender events

data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

ACK rcvd:
• If it acknowledges previously unacked segments:
  • update what is known to be acked
  • start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte

Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
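The simplified sender translates almost line for line into Python. In this sketch the timer is reduced to a boolean flag, a caller-supplied function stands in for "pass segment to IP", and ACKs are assumed to fall on segment boundaries; all names are illustrative.

```python
class SimpleTcpSender:
    def __init__(self, initial_seq, send):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = []             # (seq, data), oldest first
        self.timer_running = False
        self.send = send              # stands in for "pass segment to IP"

    def on_app_data(self, data):
        segment = (self.next_seq, data)
        self.unacked.append(segment)
        if not self.timer_running:
            self.timer_running = True
        self.send(segment)
        self.next_seq += len(data)

    def on_timeout(self):
        if self.unacked:
            self.send(self.unacked[0])    # smallest not-yet-acknowledged seq #
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:            # y acknowledges new data
            self.send_base = y
            # Keep only segments with bytes at or beyond y.
            self.unacked = [(s, d) for s, d in self.unacked if s + len(d) > y]
            self.timer_running = bool(self.unacked)


sent = []
sender = SimpleTcpSender(initial_seq=92, send=sent.append)
sender.on_app_data(b"12345678")   # Seq=92, 8 bytes of data
sender.on_timeout()               # the ACK was lost, so the segment is resent
sender.on_ack(100)                # ACK=100 finally arrives
```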

TCP: retransmission scenarios

[Figure: two Host A / Host B timelines, time running downward]

Lost ACK scenario:
• Host A sends Seq=92, 8 bytes data
• Host B replies ACK=100, which is lost (X)
• Seq=92 timeout: Host A resends Seq=92, 8 bytes data
• Host B replies ACK=100; SendBase = 100

Premature timeout:
• Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
• Host B replies ACK=100, then ACK=120
• Seq=92 timeout fires before the ACKs arrive: Host A resends Seq=92, 8 bytes data
• ACKs arrive: SendBase = 100, then SendBase = 120
• Host B answers the duplicate segment with ACK=120; SendBase = 120

TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline, time running downward]

Cumulative ACK scenario:
• Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
• Host B's ACK=100 is lost (X)
• ACK=120 arrives before the timeout; SendBase = 120, so nothing is retransmitted

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of congestion control
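A short sketch of this policy, with an illustrative class name and made-up RTT values in milliseconds:

```python
class BackoffTimer:
    """Timeout interval that doubles on each timeout and resets otherwise."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self._computed()

    def _computed(self):
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        self.interval *= 2                    # exponential growth while timeouts repeat
        return self.interval

    def on_new_data_or_ack(self):
        self.interval = self._computed()      # back to EstimatedRTT + 4*DevRTT
        return self.interval


timer = BackoffTimer(estimated_rtt=100, dev_rtt=25)   # hypothetical values, in ms
start = timer.interval
backoffs = [timer.on_timeout() for _ in range(3)]
after_ack = timer.on_new_data_or_ack()
```

Backing off this way slows retransmissions exactly when repeated timeouts suggest the network is congested.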

Fast Retransmit

• Time-out period often relatively long:
  • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
  • Sender often sends many segments back-to-back
  • If a segment is lost, there will likely be many duplicate ACKs
• If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  • fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
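The ACK handler above can be sketched in Python as follows. Timers are left out to keep the focus on duplicate-ACK counting; a single counter is kept for the current SendBase (reset when new data is ACKed), and ACKs are assumed to fall on segment boundaries. Names and segment sizes are illustrative.

```python
class FastRetransmitSender:
    def __init__(self, send_base, segments, resend):
        self.send_base = send_base
        self.segments = dict(segments)    # seq # -> data, sent but not yet ACKed
        self.dup_acks = 0
        self.resend = resend

    def on_ack(self, y):
        if y > self.send_base:            # new data acknowledged
            self.send_base = y
            self.dup_acks = 0
            self.segments = {s: d for s, d in self.segments.items() if s >= y}
        elif y == self.send_base:         # duplicate ACK for already ACKed data
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.segments:
                self.resend(y, self.segments[y])     # fast retransmit


resent = []
outstanding = {100: b"x" * 20, 120: b"y" * 10, 130: b"z" * 10, 140: b"w" * 10}
sender = FastRetransmitSender(100, outstanding,
                              lambda seq, data: resent.append((seq, data)))

# Segment 100 is lost; each of the three later segments triggers ACK=100.
for _ in range(3):
    sender.on_ack(100)
sender.on_ack(150)    # the retransmission fills the gap; everything is ACKed
```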

TCP: GBN or Selective Repeat?

• Basic TCP looks a lot like GBN
• Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  • This looks a lot like Selective Repeat
• TCP is a hybrid
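The hybrid behavior shows up in a receiver that buffers out-of-order segments (as in Selective Repeat) but still answers with cumulative ACKs (as in GBN). The sketch below is one possible implementation choice, not something the TCP spec mandates; it assumes segments never partially overlap, and the names are illustrative.

```python
class BufferingTcpReceiver:
    def __init__(self, initial_seq):
        self.expected = initial_seq       # seq # of next in-order byte
        self.out_of_order = {}            # seq # -> data waiting for a gap to fill
        self.stream = b""

    def on_segment(self, seq, data):
        """Accept one segment and return the cumulative ACK to send."""
        if seq == self.expected:
            self.stream += data
            self.expected += len(data)
            # The gap is filled: drain any buffered segments that now fit.
            while self.expected in self.out_of_order:
                chunk = self.out_of_order.pop(self.expected)
                self.stream += chunk
                self.expected += len(chunk)
        elif seq > self.expected:
            self.out_of_order[seq] = data     # buffer instead of discarding
        return self.expected


rcv = BufferingTcpReceiver(initial_seq=92)
acks = [
    rcv.on_segment(100, b"B" * 20),   # out of order: duplicate ACK
    rcv.on_segment(120, b"C" * 10),   # out of order: duplicate ACK
    rcv.on_segment(92, b"A" * 8),     # fills the gap: ACK jumps past everything
]
```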


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data.
If necessary, sender should slow down its transmission rate to accommodate receiver's rate.
Different from Congestion Control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
speed-matching service: matching the send rate to the receiving app's drain rate
app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Fields in order:]
source port #, dest port #
sequence number
acknowledgement number
head len, not used, flag bits U A P R S F, Receive window
checksum, Urg data pointer
Options (variable length)
application data (variable length)

Annotations:
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection establishment (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence and acknowledgement numbers: counting by bytes of data (not segments)
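As a rough illustration of the segment layout, the sketch below decodes the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name `parse_tcp_header` and the dictionary keys are hypothetical; options are skipped rather than parsed, and the checksum is not verified.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are skipped)."""
    (src_port, dst_port, seq, ack,
     offset_byte, flag_byte, rcv_window,
     checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    head_len = (offset_byte >> 4) * 4          # header length in bytes
    flags = {name: bool(flag_byte & bit) for name, bit in
             [("FIN", 0x01), ("SYN", 0x02), ("RST", 0x04),
              ("PSH", 0x08), ("ACK", 0x10), ("URG", 0x20)]}
    return {"src_port": src_port, "dst_port": dst_port, "seq": seq, "ack": ack,
            "head_len": head_len, "flags": flags, "rcv_window": rcv_window,
            "checksum": checksum, "urg_ptr": urg_ptr,
            "data": segment[head_len:]}

# Hand-built SYN segment: ports 12345 -> 80, seq 1000, 5-word header, no data.
syn = struct.pack("!HHIIBBHHH", 12345, 80, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
hdr = parse_tcp_header(syn)
print(hdr["src_port"], hdr["dst_port"], hdr["seq"], hdr["head_len"], hdr["flags"]["SYN"])
# 12345 80 1000 20 True
```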

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
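The window arithmetic above can be written out directly. The following is a small sketch with made-up byte counts; the function names `rcv_window` and `sender_may_send` are introduced here for illustration only.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, n_bytes, window):
    """True if sending n_bytes more keeps unACKed data within the window."""
    return (last_byte_sent - last_byte_acked) + n_bytes <= window

window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                        # 2096
print(sender_may_send(5000, 4000, 1000, window))     # True  (2000 <= 2096)
print(sender_may_send(5000, 4000, 1500, window))     # False (2500 >  2096)
```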

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will start to empty, and RcvWindow > 0 will be transmitted back to the sender.
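A toy model of this probing behaviour is sketched below. It is heavily simplified and tick-based: it assumes one probe per tick, assumes the probe's data byte is discarded while the buffer is full, and only tracks the window value carried back in each ACK. The function name `run_with_probes` and its parameters are hypothetical.

```python
def run_with_probes(buffer_size, app_reads):
    """Sender is blocked on RcvWindow == 0 and probes once per tick.

    app_reads[i] is how many bytes the receiving app drains in tick i.
    Returns the tick at which the sender learns RcvWindow > 0, else None.
    """
    buffered = buffer_size                 # receive buffer starts out full
    for tick, drained in enumerate(app_reads):
        buffered -= min(drained, buffered)
        window = buffer_size - buffered    # value carried in the probe's ACK
        if window > 0:
            return tick                    # sender may resume sending
    return None

print(run_with_probes(1000, [0, 0, 300]))  # 2
```

Without the probes, no ACK would ever carry the updated window back to the sender in this model, which is the deadlock described above.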

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket("hostname", "port number");
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1
    Allocates buffers
    Can include application data
    SYN=0 signals that connection is established
    server_isn+1 signals that client is synchronized

[Figure: three-way handshake timeline]
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
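The sequence and acknowledgement values exchanged in the three steps can be traced with a few lines of Python. This is only a bookkeeping sketch: the function `three_way_handshake` and the dictionary representation of a segment are assumptions made for illustration, and no real sockets are involved.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three control segments as dictionaries of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]

segments = three_way_handshake(client_isn=4000, server_isn=9000)
for segment in segments:
    print(segment)
# {'SYN': 1, 'seq': 4000}
# {'SYN': 1, 'seq': 9000, 'ack': 4001}
# {'SYN': 0, 'seq': 4001, 'ack': 9001}
```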

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client closes and sends FIN; server replies with ACK, closes, and sends its own FIN; client replies with ACK, enters timed wait, then is closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
    Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
    Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline. Client (closing) sends FIN; server replies with ACK, then (closing) sends FIN; client replies with ACK and enters timed wait; server is closed on receiving the ACK; client is closed after timed wait.]
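The client side of the four closing steps can be sketched as a small transition table. The state names (`FIN_WAIT_1`, `FIN_WAIT_2`, `TIME_WAIT`) are taken from the standard TCP state diagram rather than from these slides, and the event names and the `run` helper are hypothetical.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "send FIN"),
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "send ACK"),
    ("TIME_WAIT", "recv_FIN"): ("TIME_WAIT", "send ACK"),   # our ACK was lost
    ("TIME_WAIT", "timer_expired"): ("CLOSED", None),
}

def run(events, state="ESTABLISHED"):
    """Apply events in order; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, action = CLIENT_CLOSE[(state, event)]
        if action:
            sent.append(action)
    return state, sent

print(run(["app_close", "recv_ACK", "recv_FIN", "timer_expired"]))
# ('CLOSED', ['send FIN', 'send ACK'])
```

A retransmitted FIN arriving during timed wait simply triggers another ACK, which is why the client lingers in that state instead of closing immediately.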

TCP Connection Management (cont.)

[Figure: example TCP client lifecycle]

[Figure: example TCP server lifecycle]


                                                  A few special cases

                                                  Have not discussed what happens if both client and server decide to close down connection at same time

                                                  It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

Causes/costs of congestion: scenario 2 (cont.)

(a), (b) & (c): always λin = λout (goodput)
(a) "Magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when packet dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
    "elastic service"
    if sender's path "underloaded": sender should use available bandwidth
    if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate is thus the minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    Signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: a window of Congwin segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT  Bytes/sec
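As a quick numeric check of the throughput formula, the sketch below evaluates it for one segment per RTT. The values (MSS = 500 bytes, RTT = 200 ms) and the function name `window_throughput` are chosen here purely for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds

rate_bytes = window_throughput(w_segments=1, mss_bytes=500, rtt_seconds=0.2)
rate_kbps = rate_bytes * 8 / 1000
print(round(rate_bytes), round(rate_kbps))   # 2500 20
```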

TCP Congestion Control (cont.)

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after loss event

three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 and then grows exponentially
    conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
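The AIMD rule can be traced round by round with a short simulation. This is a sketch under simplifying assumptions: the window is counted in MSS units, one step is one RTT, and the rounds in which a loss is detected are supplied by hand. The function name `aimd` is hypothetical.

```python
def aimd(initial_mss, rounds, loss_rounds):
    """Congestion window (in MSS) at the start of each RTT round under AIMD."""
    window = initial_mss
    trace = []
    for rnd in range(rounds):
        trace.append(window)
        if rnd in loss_rounds:
            window = max(1, window / 2)   # multiplicative decrease
        else:
            window += 1                   # additive increase: 1 MSS per RTT
    return trace

print(aimd(initial_mss=8, rounds=8, loss_rounds={3}))
# [8, 9, 10, 11, 5.5, 6.5, 7.5, 8.5]
```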

TCP Slow Start

When connection begins, CongWin = 1 MSS
    Example: MSS = 500 bytes & RTT = 200 msec
    initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. Host A sends one segment in the first RTT, then two segments, then four segments.]
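The link between "increment CongWin for every ACK" and "double CongWin every RTT" can be checked with a small loop. This sketch assumes no losses, one ACK per segment, and that all ACKs for a window arrive within the same RTT; `slow_start_rounds` is a hypothetical helper name.

```python
def slow_start_rounds(n_rounds, mss=1):
    """CongWin (in units of mss) at the start of each RTT during slow start."""
    cong_win = mss
    history = []
    for _ in range(n_rounds):
        history.append(cong_win)
        acks = cong_win // mss          # one ACK per segment sent this RTT
        for _ in range(acks):
            cong_win += mss             # increment CongWin on every ACK
    return history

print(slow_start_rounds(5))   # [1, 2, 4, 8, 16]
```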

So far:
    Slow-Start ramps up exponentially
    Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
    Introduce new variable threshold
    threshold initially very large
    Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
    Two different types of loss events:
        • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
        • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                  The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|-------|-------|-------------------|------------|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
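The event/state table above maps almost line for line onto a small state machine. The following Python sketch is one way to write it down; the class name `RenoSender`, the method names and the default threshold are assumptions made for illustration, and real TCP implementations contain many details that are not modelled here.

```python
class RenoSender:
    """Congestion-control state from the table above (window in MSS units)."""

    def __init__(self, mss=1.0, threshold=64.0):
        self.mss = mss
        self.cong_win = mss
        self.threshold = threshold
        self.state = "SS"              # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)   # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(threshold=8.0)
for _ in range(8):
    s.on_new_ack()
print(s.state, s.cong_win)                 # CA 9.0
for _ in range(3):
    s.on_dup_ack()
print(s.state, s.cong_win, s.threshold)    # CA 4.5 4.5
s.on_timeout()
print(s.state, s.cong_win, s.threshold)    # SS 1.0 2.25
```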

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
    Ignore slow start

Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP needed for high-speed links!
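The two numbers on this slide can be reproduced by plugging the example values into the window formula and into the loss-rate relation throughput = 1.22 MSS / (RTT sqrt(L)) solved for L. The function names below are hypothetical, and MSS is converted to bits so that the units match a throughput in bits per second.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the given throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)

def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L (MSS in bits)."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2

W = required_window(10e9, 0.100, 1500)
L = required_loss_rate(10e9, 0.100, 1500)
print(int(W))          # 83333
print(f"{L:.1e}")      # 2.1e-10
```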

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
    Additive increase gives slope of 1, as throughput increases
    multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
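The convergence argument can be checked numerically. The sketch below assumes both connections see a loss in the same round whenever their combined rate exceeds the link capacity, which is a strong simplification; the function name `share_link` and the starting rates are illustrative only.

```python
def share_link(x1, x2, capacity, rounds):
    """Two AIMD flows on one bottleneck; returns their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2     # loss: both halve (multiplicative decrease)
        else:
            x1, x2 = x1 + 1, x2 + 1     # no loss: both add 1 (additive increase)
    return x1, x2

x1, x2 = share_link(x1=10.0, x2=80.0, capacity=100.0, rounds=400)
print(abs(x1 - x2) < 0.1)   # True: the initial gap of 70 has almost vanished
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in half, so the rates drift toward the equal-share line.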

Fairness (more)

Fairness and UDP
    Multimedia apps often do not use TCP
        do not want rate throttled by congestion control
    Instead use UDP:
        pump audio/video at constant rate, tolerate packet loss
    Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
    nothing prevents app from opening parallel connections between 2 hosts
    Web browsers do this
    Example: link of rate R supporting 9 connections;
        new app asks for 1 TCP, gets rate R/10
        new app asks for 11 TCPs, gets R/2!
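The parallel-connection example is simple arithmetic if one assumes the link is split equally per connection. A sketch, using exact fractions and a hypothetical helper `app_share`:

```python
from fractions import Fraction

def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming every TCP connection gets an equal share of the link."""
    return Fraction(new_connections, existing_connections + new_connections)

print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 connections the new app holds 11 of 20 connections, i.e. slightly more than the R/2 quoted on the slide.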

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
    TCP connection establishment
    data transmission delay
    slow start

Notation, assumptions:
    Assume one link between client and server, of rate R
    S: MSS (bits)
    O: object size (bits)
    no retransmissions (no loss, no corruption)

Window size:
    First assume: fixed congestion window, W segments
    Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
    Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
    Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

[Figure: timeline for the first case]

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

[Figure: timeline for the second case]
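Both fixed-window cases can be folded into one function: the server stalls after each of the first K-1 windows only when the stall term is positive. In this sketch K is taken as the number of windows that cover the object, ceil(O / (W S)); the function name and the example values are illustrative assumptions.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments.

    S is the segment size in bits, R the link rate in bits/sec.
    """
    K = math.ceil(O / (W * S))            # windows needed to cover the object
    stall = S / R + RTT - W * S / R       # server idle time after each window
    latency = 2 * RTT + O / R
    if stall > 0:                         # case 2: ACK returns after window is sent
        latency += (K - 1) * stall
    return latency

# S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))   # 12.0 (case 1)
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))   # 15.0 (case 2)
```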

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K - 1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: slow-start timing diagram (time at client vs. time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.
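The example values can be reproduced with a small sketch. It assumes that Q can be found by counting the windows k for which S/R + RTT - 2^(k-1) S/R is still positive, which is one reading of "the number of times the server idles if the object were of infinite size". The function name and the concrete values of S, R and RTT are illustrative choices, not taken from the slides.

```python
import math

def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start model with no loss."""
    # K: number of windows that cover the object.
    K = math.ceil(math.log2(O / S + 1))
    # Q: windows after which the server would idle for an infinite object
    # (assumption: counted while the idle time stays strictly positive).
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# O/S = 15 segments; S/R = 1 s and RTT = 2 s are chosen so that Q = 2.
print(slow_start_latency(O=15, S=1, R=1, RTT=2))
```

With these inputs the sketch gives K = 4, Q = 2 and P = 2, matching the example above.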

TCP Latency Modeling (3)

\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgment} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
             &= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
             &= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: slow-start timing diagram (time at client vs. time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, the number of idles for an infinite-size object, is similar.
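As a sanity check on the closed form for K, the sketch below compares it with a direct search that keeps adding doubling windows until the object is covered. Both function names are hypothetical, and the closed form relies on floating-point log2, so treat it as an illustration rather than a robust implementation for very large O/S.

```python
import math

def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))

def windows_by_search(O, S):
    """Smallest k with S*(2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    k, sent = 0, 0
    while sent < O:
        sent += (2 ** k) * S   # the next window carries 2^k segments
        k += 1
    return k

for segments in (1, 3, 15, 16, 100):
    print(segments, windows_closed_form(segments, 1), windows_by_search(segments, 1))
```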

HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
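The three response-time expressions can be compared side by side with a minimal sketch. It deliberately leaves out the "sum of idle times" term, which depends on slow start and differs per scheme, so the returned values are only the transmission and RTT parts of each formula. The function name and dictionary keys are hypothetical.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, EXCLUDING the slow-start idle-time sums.

    O: object size in bits, R: link rate in bits/sec, RTT: seconds,
    M: number of images, X: number of parallel connections (M/X an integer).
    """
    transmit = (M + 1) * O / R   # time to push the base page plus M images
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT,
    }

# Illustrative values: 1000-bit objects on a 1000 bit/s link, RTT = 0.1 s.
for name, seconds in http_response_times(1000, 1000, 0.1, M=10, X=5).items():
    print(name, round(seconds, 3))
```

Only the RTT terms differ between the schemes, so the gap between them widens as RTT grows relative to O/R.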

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.

Chapter 3 Summary

Principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

Instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                  • Chapter 3 Transport Layer last revised 160305
                                                  • Chapter 3 outline
                                                  • Transport services and protocols
                                                  • Transport vs network layer
                                                  • Transport-layer protocols
                                                  • Chapter 3 outline
                                                  • Multiplexing/demultiplexing
                                                  • Multiplexing/demultiplexing
                                                  • How demultiplexing works
                                                  • Connectionless demultiplexing
                                                  • Connectionless demux (cont)
                                                  • Connection-oriented demux
                                                  • Connection-oriented demux (cont)
                                                  • Connection-oriented demux Threaded Web Server
                                                  • Chapter 3 outline
                                                  • UDP User Datagram Protocol [RFC 768]
                                                  • UDP more
                                                  • UDP checksum
                                                  • Chapter 3 outline
                                                  • Principles of Reliable data transfer
                                                  • Reliable data transfer getting started
                                                  • Reliable data transfer getting started
                                                  • Incremental Improvements
                                                  • rdt1.0: reliable transfer over a reliable channel
                                                  • rdt2.0: channel with bit errors
                                                  • rdt2.0: FSM specification
                                                  • rdt2.0: operation with no errors
                                                  • rdt2.0: error scenario
                                                  • rdt2.0 has a fatal flaw
                                                  • rdt2.1: sender, handles garbled ACK/NAKs
                                                  • rdt2.1: receiver, handles garbled ACK/NAKs
                                                  • rdt2.1: discussion
                                                  • rdt2.2: a NAK-free protocol
                                                  • rdt2.2: sender, receiver fragments
                                                  • rdt3.0: channels with errors and loss
                                                  • rdt3.0 sender
                                                  • rdt3.0 in action
                                                  • rdt3.0 in action
                                                  • Performance of rdt3.0
                                                  • rdt3.0: stop-and-wait operation
                                                  • Pipelined protocols
                                                  • Pipelined protocols
                                                  • Pipelining increased utilization
                                                  • Go-Back-N
                                                  • GBN Sender
                                                  • GBN sender extended FSM
                                                  • GBN receiver extended FSM
                                                  • More on receiver
                                                  • GBN in action
                                                  • Selective Repeat
                                                  • Selective repeat sender receiver windows
                                                  • Selective repeat
                                                  • Selective repeat in action
                                                  • Selective repeat dilemma
                                                  • Chapter 3 outline
                                                  • TCP Overview: RFCs 793, 1122, 1323, 2018, 2581
                                                  • More TCP Details
                                                  • Even More TCP Details
                                                  • TCP segment structure
                                                  • TCP seq. #'s and ACKs
                                                  • TCP Round Trip Time and Timeout
                                                  • TCP Round Trip Time and Timeout
                                                  • Example RTT estimation
                                                  • TCP Round Trip Time and Timeout
                                                  • Chapter 3 outline
                                                  • TCP reliable data transfer
                                                  • TCP sender events
                                                  • TCP sender(simplified)
                                                  • TCP retransmission scenarios
                                                  • TCP retransmission scenarios (more)
                                                  • TCP ACK generation [RFC 1122, RFC 2581]
                                                  • More on Sender Policies
                                                  • Fast Retransmit
                                                  • Fast retransmit algorithm
                                                  • TCP GBN or Selective Repeat
                                                  • Chapter 3 outline
                                                  • TCP Flow Control
                                                  • TCP Flow Control
                                                  • TCP segment structure
                                                  • TCP Flow control how it works
                                                  • Technical Issue
                                                  • Chapter 3 outline
                                                  • TCP Connection Management
                                                  • TCP Connection Management (cont)
                                                  • TCP Connection Management (cont)
                                                  • TCP Connection Management (cont)
                                                  • TCP Connection Management (cont)
                                                  • A few special cases
                                                  • Chapter 3 outline
                                                  • Principles of Congestion Control
                                                  • Causes/costs of congestion: scenario 1
                                                  • Causes/costs of congestion: scenario 2
                                                  • Causes/costs of congestion: scenario 3
                                                  • Causes/costs of congestion: scenario 3
                                                  • Approaches towards congestion control
                                                  • Case study ATM ABR congestion control
                                                  • Case study ATM ABR congestion control
                                                  • Chapter 3 outline
                                                  • TCP Congestion Control
                                                  • TCP AIMD
                                                  • TCP Slow Start
                                                  • TCP Slow Start (more)
                                                  • Summary TCP Congestion Control
                                                  • The Big Picture
                                                  • TCP sender congestion control
                                                  • TCP throughput
                                                  • TCP Futures
                                                  • TCP Fairness
                                                  • Why is TCP fair
                                                  • Fairness (more)
                                                  • TCP Latency Modeling
                                                  • Fixed Congestion Window (W)
                                                  • Fixed congestion window (1)
                                                  • Fixed congestion window (2)
                                                  • TCP Latency Modeling Slow Start (1)
                                                  • TCP Latency Modeling Slow Start (2)
                                                  • TCP Latency Modeling (3)
                                                  • TCP Latency Modeling (4)
                                                  • HTTP Modeling
                                                  • Chapter 3 Summary

rdt2.0: FSM specification

sender

State: Wait for call from above
  rdt_send(data)
    sndpkt = make_pkt(data, checksum)
    udt_send(sndpkt)
    (go to: Wait for ACK or NAK)

State: Wait for ACK or NAK
  rdt_rcv(rcvpkt) && isNAK(rcvpkt)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && isACK(rcvpkt)
    Λ
    (go to: Wait for call from above)

receiver

State: Wait for call from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    udt_send(ACK)

rdt2.0: operation with no errors

[Figure: the rdt2.0 sender and receiver FSMs from the specification slide, repeated. With no errors, rdt_send(data) sends sndpkt; the receiver takes the notcorrupt(rcvpkt) transition, delivers the data and sends ACK; the sender takes the isACK(rcvpkt) transition back to "Wait for call from above".]

rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs from the specification slide, repeated. When the packet arrives corrupted, the receiver takes the corrupt(rcvpkt) transition and sends NAK; the sender takes the isNAK(rcvpkt) transition and retransmits sndpkt.]

rdt2.0 has a fatal flaw!

What happens if the ACK/NAK is corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But the receiver is waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if the sender's ACK/NAK is corrupted?
• retransmit, but this might cause retransmission of a correctly received pkt
• receiver won't know about the duplication!

Handling duplicates:
• sender adds sequence number (0,1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt
• a duplicate packet is one with the same sequence # as the previous packet

Stop and wait: sender sends one packet, then waits for receiver response.

Sender: whenever the sender receives a control message, it sends a packet to the receiver.
• A valid ACK: sends the next packet (if one exists) with a new sequence #
• A NAK or corrupt response: resends the old packet

Receiver: sends ACK/NAK to the sender.
• If the received packet is corrupt: send NAK
• If the received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
• If the received packet is valid and has the same sequence # as the previous packet, i.e., it is a retransmission (duplicate): send ACK

Note: ACK/NAK do not contain a sequence #.
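The sender and receiver rules above can be exercised with a small deterministic simulation. This is only a sketch under stated assumptions: corruption is modeled by naming which transmission indices get garbled (data packet or reply), the receiver tracks the expected sequence # rather than the previous one (equivalent with alternating 0/1), and all function names are made up for illustration.

```python
def receiver_step(expected_seq, pkt_seq, corrupt):
    """Return (reply, deliver, next_expected) for one arriving data packet."""
    if corrupt:
        return "NAK", False, expected_seq
    if pkt_seq == expected_seq:
        return "ACK", True, 1 - expected_seq   # new data: deliver and flip expectation
    return "ACK", False, expected_seq          # duplicate: ACK again, do not deliver

def sender_step(seq, reply, reply_corrupt):
    """Return (next_seq, advance); advance is True when the next packet may be sent."""
    if reply_corrupt or reply == "NAK":
        return seq, False                      # resend the old packet
    return 1 - seq, True                       # valid ACK: move on with a new sequence #

def transfer(messages, data_corrupt=frozenset(), reply_corrupt=frozenset()):
    """Stop-and-wait transfer; the sets hold transmission indices to corrupt."""
    delivered, seq, expected, tx, i = [], 0, 0, 0, 0
    while i < len(messages):
        reply, deliver, expected = receiver_step(expected, seq, tx in data_corrupt)
        if deliver:
            delivered.append(messages[i])
        seq, advance = sender_step(seq, reply, tx in reply_corrupt)
        if advance:
            i += 1
        tx += 1
    return delivered, tx

# Transmission 1 has corrupt data; the reply to transmission 3 is garbled.
print(transfer(["a", "b", "c"], data_corrupt={1}, reply_corrupt={3}))
```

Even though the packet carrying "c" is sent twice (its first ACK is garbled), the receiver delivers it only once because the retransmission carries the same sequence #.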

rdt2.1: sender, handles garbled ACK/NAKs

State: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    (go to: Wait for ACK or NAK 0)

State: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    (go to: Wait for call 1 from above)

State: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    (go to: Wait for ACK or NAK 1)

State: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    (go to: Wait for call 0 from above)

rdt2.1: receiver, handles garbled ACK/NAKs

State: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    (go to: Wait for 1 from below)
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt))
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)

State: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    (go to: Wait for 0 from below)
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt))
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0,1) will suffice. Why?
• must check if received ACK/NAK is corrupted
• twice as many states: state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
• must check if received packet is a duplicate: state indicates whether 0 or 1 is the expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at the sender

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
• receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s are included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt
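The NAK-free rule can be sketched as two small decision functions. The names and the `last_ack` parameter (the sequence # of the last packet received OK, which the caller must track) are assumptions for illustration; the slides do not say what the receiver sends if the very first packet arrives corrupted, so the initial value of `last_ack` is left to the caller.

```python
def rdt22_receiver(expected_seq, pkt_seq, corrupt, last_ack):
    """Return (ack_seq, deliver, next_expected) for one arriving data packet."""
    if not corrupt and pkt_seq == expected_seq:
        # In-order, uncorrupted packet: deliver it and ACK its own seq #.
        return pkt_seq, True, 1 - expected_seq
    # Corrupt or duplicate packet: repeat the ACK for the last pkt received OK.
    return last_ack, False, expected_seq

def rdt22_sender_must_retransmit(current_seq, ack_seq, ack_corrupt):
    """A garbled ACK or an ACK for the other seq # acts like a NAK."""
    return ack_corrupt or ack_seq != current_seq

# Packet 1 arrives corrupted after packet 0 was received OK: receiver re-ACKs 0,
# and the sender, waiting for ACK 1, retransmits.
ack, deliver, expected = rdt22_receiver(1, 1, True, last_ack=0)
print(ack, deliver, rdt22_sender_must_retransmit(1, ack, False))
```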

rdt2.2: sender, receiver fragments

sender FSM fragment

State: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    (go to: Wait for ACK 0)

State: Wait for ACK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    Λ

receiver FSM fragment

Transition entering state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK1, chksum)
    udt_send(sndpkt)

State: Wait for 0 from below
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
    udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time
  (retransmissions are triggered only by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer

rdt3.0 sender

state "Wait for call 0 from above"
  rdt_rcv(rcvpkt)
    Λ (no action)
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    next state: "Wait for ACK0"

state "Wait for ACK0"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    Λ (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    stop_timer
    next state: "Wait for call 1 from above"

state "Wait for call 1 from above"
  rdt_rcv(rcvpkt)
    Λ (no action)
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    next state: "Wait for ACK1"

state "Wait for ACK1"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    Λ (no action)
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    stop_timer
    next state: "Wait for call 0 from above"
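The stop-and-wait behaviour of the rdt3.0 sender can be tried out with a small simulation. This is only a sketch under simplifying assumptions: LossyChannel, Receiver and rdt30_send are hypothetical names, the channel only loses packets (corruption is not modelled), and a timeout is modelled as "no usable ACK came back for this transmission".

```python
from dataclasses import dataclass, field


@dataclass
class LossyChannel:
    """Hypothetical channel that drops the transmissions whose index is listed."""
    drop: set = field(default_factory=set)
    count: int = 0

    def deliver(self, item):
        """Return item, or None if this transmission is lost."""
        idx = self.count
        self.count += 1
        return None if idx in self.drop else item


class Receiver:
    """Alternating-bit receiver: delivers new data, re-ACKs duplicates."""

    def __init__(self):
        self.expected = 0
        self.delivered = []

    def on_packet(self, seq, data):
        if seq == self.expected:          # new packet: deliver and flip the bit
            self.delivered.append(data)
            self.expected ^= 1
        return seq                        # ACK names the seq # being ACKed


def rdt30_send(messages, data_ch, ack_ch, receiver):
    """Send each message stop-and-wait; return the number of transmissions."""
    seq, transmissions = 0, 0
    for data in messages:
        while True:
            transmissions += 1            # udt_send(sndpkt), start_timer
            pkt = data_ch.deliver((seq, data))
            ack = None
            if pkt is not None:
                ack = ack_ch.deliver(receiver.on_packet(*pkt))
            if ack == seq:                # expected ACK arrived: stop_timer
                break
            # no usable ACK: the timer expires and the packet is resent
        seq ^= 1
    return transmissions


rx = Receiver()
total = rdt30_send(["a", "b", "c"], LossyChannel({1}), LossyChannel({0}), rx)
print(rx.delivered, total)
```

Here the first ACK and the second data transmission are lost, so the first message is sent three times, yet the receiver delivers each message exactly once.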

rdt3.0 in action

[figures: sender/receiver timing diagrams]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}$

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$ (times in msec)

U_sender: utilization, the fraction of time the sender is busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[figure: sender/receiver timeline]
t = 0: first packet bit transmitted
t = L/R: last packet bit transmitted
first packet bit arrives
last packet bit arrives, send ACK
t = RTT + L/R: ACK arrives, send next packet
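The arithmetic behind these numbers can be checked with a few lines of Python. This is a minimal sketch; the function name stop_and_wait and its parameter names are made up for illustration, and RTT is taken as twice the 15 ms propagation delay.

```python
def stop_and_wait(packet_bits, rate_bps, rtt_s):
    """Return (sender utilization, throughput in bits/sec) for stop-and-wait."""
    t_transmit = packet_bits / rate_bps      # L / R
    cycle = rtt_s + t_transmit               # one packet per RTT + L/R
    return t_transmit / cycle, packet_bits / cycle


u, thr = stop_and_wait(packet_bits=8000, rate_bps=1e9, rtt_s=0.030)
print(f"{u:.5f}")                 # 0.00027
print(f"{thr / 8 / 1000:.0f} kB/sec")
```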


Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  • Go-Back-N protocols
  • selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[figure: sender/receiver timeline with three packets in flight]
t = 0: first packet bit transmitted
t = L/R: last bit transmitted
first packet bit arrives
last packet bit arrives, send ACK
last bit of 2nd packet arrives, send ACK
last bit of 3rd packet arrives, send ACK
t = RTT + L/R: ACK arrives, send next packet

$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$ (times in msec)

Increases utilization by a factor of 3!
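A natural follow-up question is how many packets must be in flight before the sender is never idle. The sketch below, with hypothetical helper names, computes the utilization for a given number of in-flight packets and the smallest window that fills the pipe; exact fractions are used so that rounding in floating point cannot push the ceiling up by one.

```python
import math
from fractions import Fraction


def pipelined_utilization(n_pkts, packet_bits, rate_bps, rtt_s):
    """U_sender = n * (L/R) / (RTT + L/R), capped at 1."""
    t_transmit = Fraction(packet_bits, rate_bps)
    return min(Fraction(1), n_pkts * t_transmit / (Fraction(rtt_s) + t_transmit))


def window_to_fill_pipe(packet_bits, rate_bps, rtt_s):
    """Smallest number of in-flight packets that keeps the sender always busy."""
    t_transmit = Fraction(packet_bits, rate_bps)
    return math.ceil((Fraction(rtt_s) + t_transmit) / t_transmit)


rtt = Fraction(30, 1000)                       # 30 msec
print(float(pipelined_utilization(3, 8000, 10**9, rtt)))
print(window_to_fill_pipe(8000, 10**9, rtt))   # 3751
```

With the slide's numbers (1 Gbps, 30 msec RTT, 8000-bit packets) roughly 3751 packets would have to be outstanding to reach full utilization, which is why the range of sequence numbers and the buffering both have to grow.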


Go-Back-N

Sender:
k-bit seq # in pkt header
"window" of up to N consecutive unack'ed pkts allowed

ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
  may receive duplicate ACKs (see receiver)

Only one timer: for the oldest unacknowledged pkt
timeout(n): retransmit pkt n and all higher seq # pkts in window
Called a sliding-window protocol

GBN: Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend

GBN: sender extended FSM

initial transition
  Λ
    base = 1
    nextseqnum = 1

state "Wait"
  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)

  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])

  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer

  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
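The extended FSM translates almost line for line into code. The following is a rough sketch, not a full implementation: GBNSender and the send callback are hypothetical names, sequence numbers are unbounded integers rather than k-bit values, the timer is reduced to a boolean flag, and (unlike the FSM, which sets base unconditionally) ACKs older than base are ignored.

```python
class GBNSender:
    """Go-Back-N sender sketch following the extended FSM."""

    def __init__(self, window, send):
        self.N = window
        self.send = send              # assumed callback: send(seq, data)
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}              # seq -> data, kept until ACKed
        self.timer_running = False

    def rdt_send(self, data):
        """Return True if the packet was sent, False if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False              # refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.send(self.nextseqnum, data)
        if self.base == self.nextseqnum:
            self.timer_running = True     # start_timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is received."""
        if acknum < self.base:
            return                    # stale duplicate ACK (sketch-only guard)
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        # stop_timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(seq, self.sndpkt[seq])


sent = []
sender = GBNSender(window=3, send=lambda seq, data: sent.append(seq))
for item in "abcd":
    sender.rdt_send(item)     # the fourth call is refused: window full
sender.on_ack(1)
sender.on_timeout()           # goes back and resends 2 and 3
print(sent)
```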

GBN: receiver extended FSM

initial transition
  Λ
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

state "Wait"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

  default
    udt_send(sndpkt)

If expected packet received:
  send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
  receiver only sends ACKs (no NAKs)
  can generate duplicate ACKs
  need only remember expectedseqnum
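Because the receiver keeps so little state, it fits in a few lines. The sketch below uses a hypothetical GBNReceiver class; on_packet returns the ACK number that would be sent back, and the corrupt flag stands in for a failed checksum.

```python
class GBNReceiver:
    """Go-Back-N receiver sketch: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0             # matches the initial make_pkt(0, ACK, chksum)
        self.delivered = []

    def on_packet(self, seq, data, corrupt=False):
        """Process one arriving packet and return the ACK number to send."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)      # deliver_data(data)
            self.last_ack = seq
            self.expectedseqnum += 1
        # out-of-order or corrupt: discard and re-ACK the last in-order packet
        return self.last_ack


rx = GBNReceiver()
acks = [rx.on_packet(seq, data) for seq, data in
        [(1, "a"), (3, "c"), (4, "d"), (2, "b"), (3, "c")]]
print(acks)          # [1, 1, 1, 2, 3]
print(rx.delivered)  # ['a', 'b', 'c']
```

Packets 3 and 4 arrive while 2 is missing, so they are discarded and answered with duplicate ACKs for packet 1.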

GBN in action

GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission of only the required packets


Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

[figure: sender and receiver views of the sequence-number space]

Selective repeat

sender

data from above:
  if next available seq # in window, send pkt

timeout(n):
  resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)

otherwise:
  ignore
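The receiver rules above can be sketched as a small class. This is an illustration only: SRReceiver is a hypothetical name, sequence numbers are unbounded integers (no wrap-around), and on_packet returns the ACK number to send, or None when the packet is ignored.

```python
class SRReceiver:
    """Selective-repeat receiver sketch with a window of N sequence numbers."""

    def __init__(self, window):
        self.N = window
        self.rcvbase = 0
        self.buffer = {}              # out-of-order packets awaiting delivery
        self.delivered = []

    def on_packet(self, n, data):
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # deliver the in-order run starting at rcvbase, advancing the window
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n                  # individual ACK(n)
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n                  # already delivered: re-ACK
        return None                   # otherwise: ignore


rx = SRReceiver(window=4)
acks = [rx.on_packet(n, d) for n, d in
        [(0, "a"), (2, "c"), (3, "d"), (1, "b"), (0, "a"), (9, "x")]]
print(acks)          # [0, 2, 3, 1, 0, None]
print(rx.delivered)  # ['a', 'b', 'c', 'd']
```

Packets 2 and 3 are buffered until 1 arrives, at which point all three are delivered in order; the late duplicate of packet 0 is re-ACKed so that the sender's window can advance.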

Selective repeat in action

[figure: sender/receiver timing diagram]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
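One way to explore the question is to compare the sequence numbers the sender might still retransmit with the ones the receiver is ready to accept as new after it has received a full window. The helper names below are hypothetical; the check suggests that ambiguity is avoided when the window size is at most half the sequence-number space.

```python
def receiver_window(rcvbase, window, seq_space):
    """Sequence numbers (mod seq_space) currently acceptable as new data."""
    return [(rcvbase + i) % seq_space for i in range(window)]


def is_ambiguous(window, seq_space):
    """True if a retransmitted old packet can be mistaken for a new one.

    Worst case: the receiver got a whole window and advanced, but every ACK
    was lost, so the sender retransmits the old window.
    """
    old = set(receiver_window(0, window, seq_space))
    new = set(receiver_window(window % seq_space, window, seq_space))
    return bool(old & new)


print(receiver_window(3, 3, 4))   # [3, 0, 1]
print(is_ambiguous(3, 4))         # True
print(is_ambiguous(2, 4))         # False
```

With seq #'s 0 to 3 and window size 3, the receiver's next window is 3, 0, 1, so a retransmitted packet 0 looks like new data. With window size 2 the old and new windows do not overlap.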

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) initializes sender and receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

[figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
  depends upon implementation (can often be set)
  the maximum amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment. This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables and
  (iii) a socket connection to process

TCP only exists in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Fields, top to bottom, each row 32 bits wide:
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

sequence number, acknowledgement number: counting by bytes of data (not segments!)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
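To make the layout concrete, the fixed 20-byte part of the header can be unpacked with Python's standard struct module. This is a sketch for illustration: parse_tcp_header and the dictionary keys are invented names, options are skipped rather than decoded, and the checksum is not verified.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # bit 0 .. bit 5


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed TCP header fields and split off the payload."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4        # head len is in 32-bit words
    flags = off_flags & 0x3F                  # low six bits: U A P R S F
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if flags & (1 << bit)],
        "recv_window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "payload": segment[header_len:],      # application data
    }


# Hand-built example segment: ports 1234 -> 80, Seq=42, ACK=79, PSH+ACK set.
raw = struct.pack("!HHIIHHHH", 1234, 80, 42, 79, (5 << 12) | 0x18, 65535, 0, 0) + b"C"
hdr = parse_tcp_header(raw)
print(hdr["seq"], hdr["ack"], hdr["flags"], hdr["payload"])
```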

TCP seq #'s and ACKs

Seq #'s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; it is up to the implementer

simple telnet scenario (Host A, Host B, time runs downward)
  User at Host A types 'C'
  A -> B: Seq=42, ACK=79, data = 'C'
  Host B ACKs receipt of 'C', echoes back 'C'
  B -> A: Seq=79, ACK=43, data = 'C'
  Host A ACKs receipt of echoed 'C'
  A -> B: Seq=43, ACK=80

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT, but RTT varies
  too short: premature timeout, unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

$EstimatedRTT = (1 - \alpha) \cdot EstimatedRTT + \alpha \cdot SampleRTT$

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: $\alpha = 0.125$

Example RTT estimation

[figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds), y-axis: RTT (milliseconds); curves: SampleRTT and EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
  large variation in EstimatedRTT -> larger safety margin
first estimate how much SampleRTT deviates from EstimatedRTT:

$DevRTT = (1 - \beta) \cdot DevRTT + \beta \cdot |SampleRTT - EstimatedRTT|$

(typically, $\beta = 0.25$)

Then set timeout interval:

$TimeoutInterval = EstimatedRTT + 4 \cdot DevRTT$
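The three formulas combine into a small estimator. The sketch below makes two assumptions that the slides do not state: the first sample initializes EstimatedRTT directly with DevRTT set to half of it, and DevRTT is updated before EstimatedRTT so that the deviation is measured against the previous estimate (both choices follow RFC 6298). RttEstimator is a hypothetical class name.

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout interval (units are up to the caller)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2      # assumed initial deviation

    def update(self, sample_rtt):
        """Fold in one SampleRTT (not from a retransmission); return the timeout."""
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)       # milliseconds
print(est.update(120.0))                     # 272.5
print(est.estimated_rtt, est.dev_rtt)        # 102.5 42.5
```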

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum

  loop (forever) {
    switch(event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
  } /* end of loop forever */

Comment:
  • SendBase-1: last cumulatively ack'ed byte
Example:
  • SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is acked
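The event loop above can be turned into a runnable sketch by making each event a method. Everything here is illustrative: SimpleTcpSender and the send callback are hypothetical names, the timer is a boolean flag, and the sketch stops the timer when nothing is outstanding, which the pseudocode leaves implicit.

```python
class SimpleTcpSender:
    """Simplified TCP sender: no duplicate-ACK handling, flow or congestion control."""

    def __init__(self, initial_seq, send):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.send = send              # assumed callback: send(seq, data)
        self.unacked = {}             # seq # of first byte -> segment data
        self.timer_running = False

    def on_app_data(self, data):
        """Event: data received from application above."""
        self.unacked[self.next_seq] = data
        if not self.timer_running:
            self.timer_running = True            # start timer
        self.send(self.next_seq, data)           # pass segment to IP
        self.next_seq += len(data)

    def on_timeout(self):
        """Event: timer timeout. Resend only the oldest unacked segment."""
        if not self.unacked:
            return
        seq = min(self.unacked)
        self.send(seq, self.unacked[seq])
        self.timer_running = True                # start timer

    def on_ack(self, y):
        """Event: ACK received with ACK field value y (next byte expected)."""
        if y > self.send_base:
            self.send_base = y
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in done:
                del self.unacked[s]
            self.timer_running = bool(self.unacked)


sent = []
tcp = SimpleTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
tcp.on_app_data(b"x" * 8)      # Seq=92, 8 bytes
tcp.on_app_data(b"y" * 20)     # Seq=100, 20 bytes
tcp.on_ack(120)                # cumulative ACK covers both segments
print(sent, tcp.send_base, tcp.timer_running)
```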


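TCP: retransmission scenarios

lost ACK scenario (Host A sends, Host B receives, time runs downward)
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100, lost (X)
  Seq=92 timeout at A
  A -> B: Seq=92, 8 bytes data (retransmission)
  B -> A: ACK=100
  SendBase = 100

premature timeout
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100, then ACK=120 (neither has arrived when the timer expires)
  Seq=92 timeout at A
  A -> B: Seq=92, 8 bytes data (retransmission)
  ACK=100 arrives: SendBase = 100
  ACK=120 arrives: SendBase = 120
  B -> A: ACK=120 (in response to the retransmission)
  SendBase = 120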

TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A sends, Host B receives, time runs downward)
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100, lost (X)
  B -> A: ACK=120, arrives before the timeout
  SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver, followed by the TCP receiver action:

1. Event: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   Action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Event: arrival of in-order segment with expected seq #. One other segment has ACK pending.
   Action: immediately send single cumulative ACK, ACKing both in-order segments.

3. Event: arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   Action: immediately send duplicate ACK, indicating seq # of next expected byte.

4. Event: arrival of segment that partially or completely fills gap.
   Action: immediately send ACK, provided that segment starts at lower end of gap.
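The four receiver rules can be read as a small decision procedure. The sketch below is one possible reading, not a real TCP stack: the class name `AckingReceiver`, its fields, and the returned strings are all made up for illustration, and it assumes the receiver buffers out-of-order segments and that segment boundaries line up exactly.

```python
from dataclasses import dataclass, field


@dataclass
class AckingReceiver:
    """Toy receiver applying the four ACK-generation rules (illustrative names)."""
    expected: int                                  # next in-order byte expected
    ack_pending: bool = False                      # True while a delayed ACK is waiting
    buffered: dict = field(default_factory=dict)   # out-of-order data: seq -> length

    def on_segment(self, seq: int, length: int) -> str:
        if seq != self.expected:
            # Rule 3: out-of-order arrival, immediately send a duplicate ACK.
            # (An old segment, seq < expected, is re-ACKed the same way; that
            # case is an assumption, it is not one of the four rules.)
            if seq > self.expected:
                self.buffered[seq] = length
            self.ack_pending = False
            return f"dup ACK {self.expected}"
        had_gap = bool(self.buffered)
        self.expected += length
        # Absorb buffered segments that have become contiguous.
        while self.expected in self.buffered:
            self.expected += self.buffered.pop(self.expected)
        if had_gap or self.ack_pending:
            # Rule 4 (segment starts at lower end of gap) or
            # rule 2 (another in-order segment already has an ACK pending).
            self.ack_pending = False
            return f"ACK {self.expected}"
        # Rule 1: delay the ACK in the hope of covering the next segment too.
        self.ack_pending = True
        return "delay ACK up to 500 ms"
```

Feeding it Seq=92 (8 bytes) and then Seq=100 (20 bytes) yields a delayed ACK followed by a single cumulative ACK 120, which mirrors rules 1 and 2.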

More on Sender Policies

Doubling the Timeout Interval
- Used by most TCP implementations
- If a timeout occurs, then after retransmission the Timeout Interval is doubled
- Intervals grow exponentially with each consecutive timeout
- When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
- Limited form of congestion control
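A minimal sketch of the doubling policy follows. It assumes that "as described previously" refers to the usual rule TimeoutInterval = EstimatedRTT + 4 * DevRTT; the class and method names are hypothetical.

```python
class RetransmissionTimer:
    """Toy model of the timeout-doubling policy (illustrative names)."""

    def __init__(self, estimated_rtt: float, dev_rtt: float):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.base_interval()

    def base_interval(self) -> float:
        # Assumed formula: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self) -> float:
        # After retransmitting, double the interval instead of recomputing it.
        self.interval *= 2
        return self.interval

    def on_ack_or_new_data(self) -> float:
        # Timer restarted for (i) new data or (ii) an ACK: back to the base value.
        self.interval = self.base_interval()
        return self.interval
```

With EstimatedRTT = 0.1 s and DevRTT = 0.025 s the base interval is 0.2 s; two consecutive timeouts stretch it to 0.4 s and then 0.8 s, and the next ACK brings it back to 0.2 s.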

Fast Retransmit

- Time-out period often relatively long
  - long delay before resending lost packet
- Detect lost segments via duplicate ACKs
  - Sender often sends many segments back-to-back
  - If a segment is lost, there will likely be many duplicate ACKs
- If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
  - fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {
        /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            /* fast retransmit */
            resend segment with sequence number y
        }
    }
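The ACK-handling event above translates almost line for line into runnable code. This is a sketch under simplifying assumptions: `FastRetransmitSender`, `unacked`, and `actions` are invented names, the timer is only recorded as a log entry, and each ACK is assumed to fall on a segment boundary.

```python
class FastRetransmitSender:
    """Toy sender implementing only the ACK-received event (illustrative names)."""

    def __init__(self, send_base: int):
        self.send_base = send_base
        self.unacked = {}     # seq -> data still awaiting acknowledgement
        self.dup_acks = {}    # ACK value y -> number of duplicate ACKs seen
        self.actions = []     # log of what the sender decided to do

    def send(self, seq: int, data: bytes) -> None:
        self.unacked[seq] = data

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            # New data acknowledged: slide SendBase forward.
            self.send_base = y
            self.unacked = {s: d for s, d in self.unacked.items() if s >= y}
            if self.unacked:
                self.actions.append("start timer")
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_acks[y] = self.dup_acks.get(y, 0) + 1
            if self.dup_acks[y] == 3 and y in self.unacked:
                # Fast retransmit: resend before the timer expires.
                self.actions.append(f"resend seq {y}")
```

For example, after sending segments at 92, 100, and 120, one ACK=100 advances SendBase to 100, and three further ACK=100s trigger `resend seq 100`.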

TCP: GBN or Selective Repeat?

                                                    Basic TCP looks a lot like GBN

                                                    Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                    This looks a lot like Selective Repeat

                                                    TCP is a hybrid


TCP Flow Control

- Sender should not overwhelm receiver's capacity to receive data
- If necessary, sender should slow down its transmission rate to accommodate receiver's rate
- Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

- receive side of TCP connection has a receive buffer
- speed-matching service: matching the send rate to the receiving app's drain rate
- app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Fields from top to bottom:]
- source port #, dest port #
- sequence number
- acknowledgement number
- head len, not used, flag bits U A P R S F, Receive window
- checksum, Urg data pnter
- Options (variable length)
- application data (variable length)

Annotations:
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
- sequence and acknowledgement numbers: counting by bytes of data (not segments)
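To make the field layout concrete, here is a hedged sketch that packs and unpacks the fixed 20-byte part of the header with Python's `struct` module. The helper names `pack_header` and `unpack_header` are made up; options are omitted, and the checksum is simply carried as a number rather than computed.

```python
import struct

# Fixed header: src port, dst port, seq, ack, (head len | flags), window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")   # network byte order, 20 bytes
FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def pack_header(src_port, dst_port, seq, ack, flags, window, checksum=0, urg_ptr=0):
    word = 5 << 12                        # head len = 5 32-bit words (no options)
    for name in flags:
        word |= FLAGS[name]               # set the requested flag bits
    return TCP_HEADER.pack(src_port, dst_port, seq, ack, word, window, checksum, urg_ptr)


def unpack_header(raw):
    src, dst, seq, ack, word, window, checksum, urg_ptr = TCP_HEADER.unpack(
        raw[:TCP_HEADER.size]
    )
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "head_len_bytes": (word >> 12) * 4,
        "flags": sorted(name for name, bit in FLAGS.items() if word & bit),
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }
```

Round-tripping `pack_header(5000, 80, 92, 100, ["ACK", "PSH"], 4096)` through `unpack_header` should return the same ports, sequence and acknowledgement numbers, a 20-byte header length, and the two flags.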

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

- spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
  - guarantees receive buffer doesn't overflow
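The two rules above are simple arithmetic, sketched here with made-up byte counts. The function names are hypothetical; `LastByteSent` and `LastByteAcked` are the sender-side counterparts of the receiver variables on the slide.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent: int, last_byte_acked: int,
                    window: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    return (last_byte_sent + nbytes) - last_byte_acked <= window
```

With a 4096-byte buffer, 3000 bytes received and 1000 read, the advertised window is 2096 bytes. A sender with 500 unACKed bytes may send 1000 more, but not 2000.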

Technical Issue

- Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
- Sender does not transmit new packets until it hears RcvWindow > 0
- Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
- DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
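The probing fix can be illustrated with a toy simulation. Everything here is an assumption made for the sketch: the function name, the idea that the application reads a given number of bytes between probes, and the rule that a probe's data byte is accepted only when there is room for it.

```python
def probes_until_window_opens(app_reads, rcv_buffer):
    """Return (probes sent, RcvWindow advertised in the ACK to the last probe).

    app_reads[i] is how many bytes the receiving app reads before probe i+1.
    """
    buffered = rcv_buffer                      # buffer is full, so RcvWindow = 0
    for probes_sent, read in enumerate(app_reads, start=1):
        buffered -= min(read, buffered)        # application drains some bytes
        if buffered < rcv_buffer:
            buffered += 1                      # room for the probe's one data byte
        advertised = rcv_buffer - buffered     # RcvWindow carried in the ACK
        if advertised > 0:
            return probes_sent, advertised     # sender learns the window reopened
    return len(app_reads), 0                   # still closed; sender keeps probing
```

If the application reads nothing before the first two probes and 2 bytes before the third, the third ACK advertises a window of 1 byte (2 freed, 1 taken by the probe), and the sender can resume.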

Note on UDP

- UDP has no flow control
- UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
- initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
    Socket clientSocket = new Socket(hostname, port number);
- server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
- specifies client_isn, the initial seq #
- No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
- ACKs received SYN
- allocates buffers
- Replies with client_isn+1 in ACK field to signal synchronization
- Specifies server_isn
- No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
- Allocates buffers
- Can include application data
- SYN=0 signals that connection is established
- server_isn+1 signals that seq # is synchronized

[Figure: three-way handshake between client and server.
  client to server: Connection request (SYN=1, seq=client_isn)
  server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)]
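The three segments of the handshake can be written out as data to check the sequence and acknowledgement arithmetic. The `Segment` class and `three_way_handshake` function are illustrative only; no sockets are involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    syn: int                    # SYN flag bit
    seq: int                    # sequence number
    ack: Optional[int] = None   # acknowledgement number, if the ACK field is used


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]
```

For `three_way_handshake(1000, 5000)` the SYNACK carries ack=1001 and the final segment carries seq=1001, ack=5001 with SYN=0.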

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: closing a connection. The client closes and sends FIN; the server replies with ACK, closes, and sends its own FIN; the client replies with ACK and enters timed wait, after which the connection is closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
- Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
- Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: the same FIN, ACK, FIN, ACK exchange between client and server, with both sides labeled closing, the client in timed wait, and both sides ending closed.]
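The client side of the four closing steps can be sketched as a small transition table. The state names (`FIN_WAIT_1`, `FIN_WAIT_2`, `TIME_WAIT`) come from the standard TCP state diagram rather than from the slide text, and the event strings and `run_client_close` helper are invented for this example.

```python
# (state, event) -> (next state, segment sent in response or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "send FIN"),          # step 1
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),              # server ACKed our FIN
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "send ACK"),         # step 3
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "send ACK"),          # our ACK was lost: re-ACK
    ("TIME_WAIT", "timed wait expires"): ("CLOSED", None),
}


def run_client_close(events, state="ESTABLISHED"):
    """Apply events in order; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, action = CLIENT_CLOSE[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent
```

A run in which the server's FIN arrives twice (because the client's first ACK was lost) ends in `CLOSED` after one FIN and two ACKs have been sent.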

TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                    A few special cases

                                                    Have not discussed what happens if both client and server decide to close down connection at same time

                                                    It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for network to handle"
- different from flow control
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem

Causes/costs of congestion: scenario 1

- two senders, two receivers
- one router, infinite buffers
- no retransmission
- Send rate: 0 to C/2
- large delays when congested
- maximum achievable throughput

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet

Here $\lambda_{in}$ is the original sending rate, $\lambda'_{in}$ is the offered load (original data plus retransmissions), and $\lambda_{out}$ is the delivered rate.

- (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
- (a) Magic transmission: only send when there's space in buffer
- (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
- (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
- (b) and (c): more work (retrans) for given "goodput"
- (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels labeled (a), (b), (c)]

Causes/costs of congestion: scenario 3

- four senders
- multihop paths
- timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
- when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
- "elastic service"
- if sender's path "underloaded": sender should use available bandwidth
- if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
- EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
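The ER and EFCI rules amount to a minimum over the path plus one conditional at the destination. The sketch below is a loose illustration, not ATM signalling code: `return_rm_cell`, `switch_rates`, and the plain-number rates are all hypothetical.

```python
def return_rm_cell(er, switch_rates, preceding_data_efci):
    """Return (ER, CI) as seen by the source when the RM cell comes back.

    er                  -- explicit rate the source wrote into the RM cell
    switch_rates        -- rate each switch on the path can support (hypothetical input)
    preceding_data_efci -- EFCI bit of the data cell that preceded the RM cell
    """
    for supportable in switch_rates:
        # A congested switch may lower ER; it never raises it.
        er = min(er, supportable)
    # Destination sets CI=1 if the preceding data cell arrived with EFCI=1.
    ci = 1 if preceding_data_efci == 1 else 0
    return er, ci
```

With a requested rate of 100 and switches supporting 80, 50, and 90, the cell comes back with ER = 50, the minimum supportable rate on the path.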


TCP Congestion Control

- end-end control (no network assistance)
- transmission rate limited by congestion window size, Congwin, over segments
- Congwin dynamically modified to reflect perceived congestion

[Figure: a window of Congwin segments]

- w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT bytes/sec
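As a quick numeric check of the throughput expression, the helper below evaluates it directly. The function name and the sample values are chosen for illustration only.

```python
def window_throughput(w_segments: int, mss_bytes: int, rtt_s: float) -> float:
    """Bytes per second when w segments of MSS bytes are sent each RTT."""
    return w_segments * mss_bytes / rtt_s
```

For instance, a window of 1 segment with MSS = 500 bytes and RTT = 0.2 s gives 2500 bytes/sec, which is 20 kbps; a window of 10 such segments gives ten times that.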

- To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow
- Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

- How does sender perceive congestion?
  - loss event = timeout or 3 duplicate ACKs
  - TCP sender reduces rate (CongWin) after loss event
- three mechanisms:
  - AIMD = Additive Increase, Multiplicative Decrease
  - slow start = CongWin set to 1 and then grows exponentially
  - conservative after timeout events

TCP AIMD

- multiplicative decrease: cut CongWin in half after loss event
- additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window of a long-lived TCP connection versus time, with congestion window axis ticks at 8 Kbytes, 16 Kbytes, and 24 Kbytes]
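The additive-increase, multiplicative-decrease rule can be traced round by round. This is a simplified sketch: the function name is made up, loss events are supplied as a set of round indices, and the window is kept in whole bytes.

```python
def aimd_trace(cong_win: int, mss: int, loss_rounds, rounds: int):
    """Return CongWin (bytes) at the start of each RTT round."""
    trace = []
    for r in range(rounds):
        trace.append(cong_win)
        if r in loss_rounds:
            # Multiplicative decrease: halve the window, but keep at least 1 MSS.
            cong_win = max(mss, cong_win // 2)
        else:
            # Additive increase: one more MSS per RTT.
            cong_win += mss
    return trace
```

Starting at 8000 bytes with MSS = 1000 and a loss in round 4, the window climbs to 12000, drops to 6000, and starts climbing again, which is the sawtooth shape associated with AIMD.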

TCP Slow Start

- When connection begins, CongWin = 1 MSS
  - Example: MSS = 500 bytes & RTT = 200 msec
  - initial rate = 20 kbps
- available bandwidth may be >> MSS/RTT
  - desirable to quickly ramp up to respectable rate
- When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

- When connection begins, increase rate exponentially until first loss event:
  - double CongWin every RTT
  - done by incrementing CongWin for every ACK received
- Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment to Host B, then two segments one RTT later, then four segments; time runs downward]
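Why a per-ACK increment doubles the window every RTT is easiest to see by counting ACKs. The sketch assumes every segment sent in a round is ACKed within that round and that each ACK adds one MSS; `slow_start_trace` is a hypothetical helper.

```python
def slow_start_trace(mss: int, rtts: int):
    """Return CongWin (bytes) at the start of each RTT during loss-free slow start."""
    cong_win = mss                      # connection begins with CongWin = 1 MSS
    trace = [cong_win]
    for _ in range(rtts):
        acks = cong_win // mss          # one ACK per segment sent this RTT
        for _ in range(acks):
            cong_win += mss             # increment CongWin for every ACK received
        trace.append(cong_win)
    return trace
```

With MSS = 500 bytes the window goes 500, 1000, 2000, 4000 over three RTTs, matching the one, two, four segment pattern.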

So far:
- Slow-Start ramps up exponentially
- Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
- Introduce new variable threshold
- threshold initially very large
- Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
- Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

- The reason for treating 3 dup ACKs differently than a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"
- Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event
- TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery

Summary: TCP Congestion Control

- When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
- When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                    The Big Picture

TCP sender congestion control

Each entry lists Event, State, TCP Sender Action, and Commentary.

1. Event: ACK receipt for previously unacked data
   State: Slow Start (SS)
   Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
   Commentary: resulting in a doubling of CongWin every RTT

2. Event: ACK receipt for previously unacked data
   State: Congestion Avoidance (CA)
   Action: CongWin = CongWin + MSS * (MSS/CongWin)
   Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

3. Event: loss event detected by triple duplicate ACK
   State: SS or CA
   Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
   Commentary: fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

4. Event: timeout
   State: SS or CA
   Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
   Commentary: enter slow start

5. Event: duplicate ACK
   State: SS or CA
   Action: increment duplicate ACK count for segment being acked
   Commentary: CongWin and Threshold not changed
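The event table maps directly onto a small state machine. The following sketch follows the table's actions as written; the class name `RenoSender`, the `"SS"`/`"CA"` state strings, and the choice to reset the duplicate-ACK counter on each new ACK are assumptions for illustration.

```python
class RenoSender:
    """Toy model of the sender-side congestion-control table (illustrative names)."""

    def __init__(self, mss: float, threshold: float):
        self.mss = mss
        self.cong_win = mss          # start in slow start with 1 MSS
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self) -> None:
        self.dup_acks = 0            # assumed: a new ACK clears the dup-ACK count
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self) -> None:
        self.dup_acks += 1           # CongWin and Threshold not changed
        if self.dup_acks == 3:
            # Loss event detected by triple duplicate ACK (fast recovery).
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)
            self.state = "CA"

    def on_timeout(self) -> None:
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
```

With MSS = 1000 and Threshold = 4000, four new ACKs take CongWin to 5000 and switch the state to CA; three duplicate ACKs then halve it to 2500, and a later timeout resets it to 1000 in slow start.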

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
- Ignore slow start
- Let W be the window size when loss occurs
- When window is W, throughput is W/RTT
- Just after loss, window drops to W/2, throughput to W/(2 RTT)
- Average throughput: 0.75 W/RTT
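The 0.75 factor follows from one extra assumption: between loss events the window climbs linearly from W/2 to W (additive increase), so the time-averaged throughput is the midpoint of the two extremes. As a worked step:

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,\mathit{RTT}} + \frac{W}{\mathit{RTT}}\right)
  = \frac{3}{4}\cdot\frac{W}{\mathit{RTT}}
  = 0.75\,\frac{W}{\mathit{RTT}}
\]
```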


                                                    TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

→ L = 2·10^-10. Wow.
New versions of TCP for high-speed needed.
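The two numbers in this example can be checked with a few lines of arithmetic. The sketch below assumes throughput = W·MSS/RTT for the window and inverts the loss-rate formula for L; the function names are hypothetical.

```python
def required_window_segments(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to solve for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


w = required_window_segments(10e9, 0.100, 1500)
loss = required_loss_rate(10e9, 0.100, 1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

With the slide's inputs this gives a window of about 83,333 segments and a tolerable loss rate of roughly 2·10^-10, matching the figures quoted above.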


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
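The convergence argument can be tried numerically. The sketch below is a simplification, not the slide's model in full: it assumes both sessions have the same RTT, both detect loss in the same round whenever their combined rate exceeds the capacity, and each adds one unit per round otherwise. The function name and parameters are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds, increase=1.0):
    """Simulate two synchronized AIMD flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve their rate (multiplicative decrease),
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both flows add the same amount (additive increase),
            # which leaves the gap unchanged.
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


x1, x2 = aimd_two_flows(80.0, 10.0, capacity=100.0, rounds=1000)
print(f"flow 1 = {x1:.3f}, flow 2 = {x2:.3f}")
```

Because each loss halves the difference between the two rates while additive increase preserves it, the two rates end up essentially equal regardless of where they start.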


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
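The arithmetic behind this example, assuming the link is split equally per connection (the idealized fairness goal), can be written out as follows. The function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming every TCP connection gets an equal share of the link."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # share of R with one connection
print(app_share(9, 11))   # share of R with eleven parallel connections
```

With 9 existing connections, one new connection gets 1/10 of R, while 11 parallel connections get 11/20 of R, slightly more than half.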


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                    Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                    Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
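The two fixed-window cases can be folded into one function. This is a sketch under an assumption not stated on these slides: K is taken to be the number of windows of W segments needed to cover the object, K = ceil(O/(W·S)). The function name is hypothetical.

```python
import math


def fixed_window_latency(O, S, R, rtt, W):
    """Latency for an object of O bits with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), rtt: seconds.
    """
    # Time the server would wait for an ACK after sending a full window.
    stall = S / R + rtt - W * S / R
    if stall <= 0:
        # Case 1: the ACK returns before the window is exhausted; no stalls.
        return 2 * rtt + O / R
    # Case 2: the server stalls after each window except the last one.
    K = math.ceil(O / (W * S))  # assumed definition of K (see lead-in)
    return 2 * rtt + O / R + (K - 1) * stall


print(fixed_window_latency(O=8000, S=1000, R=1000, rtt=3.0, W=2))  # case 2
print(fixed_window_latency(O=8000, S=1000, R=1000, rtt=3.0, W=8))  # case 1
```

In the first call the window is too small to fill the pipe, so the stall term is added K-1 times; in the second the latency reduces to 2RTT + O/R.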


                                                    TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at the server:

\[ P = \min\{Q,\; K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


                                                    TCP Latency Modeling Slow Start (2)

[Figure: timeline with "time at client" and "time at server" axes showing: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times


                                                    TCP Latency Modeling (3)

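\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the kth window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the kth window} \]

\[
\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: slow-start timeline, time at client versus time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]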


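TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, the number of idles for an infinite-size object, is similar.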
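The slow-start latency formula, together with K, Q, and P, can be evaluated as in the sketch below. Two points are assumptions of this sketch rather than statements from the slides: K is found by searching for the smallest k with (2^k - 1)·S ≥ O instead of using the logarithm, and Q is computed by counting the windows k for which the idle time S/R + RTT - 2^(k-1)·S/R is positive. All function names are hypothetical.

```python
def windows_to_cover(O, S):
    """K: smallest k such that (2**k - 1) * S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def idles_if_infinite(S, R, rtt):
    """Q: number of windows after which the server would idle if the
    object were infinite, i.e. those k with S/R + RTT - 2**(k-1) * S/R > 0."""
    q = 0
    while S / R + rtt - (2 ** q) * S / R > 0:
        q += 1
    return q


def slow_start_latency(O, S, R, rtt):
    """Latency = 2 RTT + O/R + P [RTT + S/R] - (2**P - 1) S/R."""
    K = windows_to_cover(O, S)
    Q = idles_if_infinite(S, R, rtt)
    P = min(Q, K - 1)
    return 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R


# Inputs chosen so that O/S = 15, K = 4, Q = 2, P = 2, as in the slide example.
print(windows_to_cover(15000, 1000), idles_if_infinite(1000, 1000, 2.0))
print(slow_start_latency(O=15000, S=1000, R=1000, rtt=2.0))
```

With these illustrative inputs (S/R = 1 s, RTT = 2 s) the idle times after the first two windows are 2 s and 1 s, so the latency is 2·2 + 15 + 3 = 22 s, which agrees with the closed form.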


HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X is an integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
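The three response-time expressions can be compared side by side. The sketch below deliberately omits the "sum of idle times" term (a simplifying assumption, so the results are lower bounds), and it assumes 1 Kbyte = 1000 bytes when converting the object size to bits. The function name is hypothetical.

```python
def http_response_times(O, R, rtt, M, X):
    """Response times in seconds, ignoring the slow-start idle-time term.

    O: object size (bits), R: link rate (bits/s), rtt: seconds,
    M: number of images, X: number of parallel connections (M/X an integer).
    """
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * rtt,
    }


O_BITS = 5 * 1000 * 8  # 5 Kbytes, assuming 1 Kbyte = 1000 bytes
for rate in (28e3, 100e3, 1e6, 10e6):
    times = http_response_times(O_BITS, rate, rtt=0.100, M=10, X=5)
    print(rate, {name: round(t, 2) for name, t in times.items()})
```

At low link rates the shared (M+1)O/R term dominates all three variants, while at high rates the RTT multipliers (22, 3, and 6 for M = 10, X = 5) decide the ordering.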


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (axis 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (axis 0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"


rdt2.0: operation with no errors

Sender FSM
State "Wait for call from above":
  rdt_send(data)
    sndpkt = make_pkt(data, checksum)
    udt_send(sndpkt)
    → "Wait for ACK or NAK"
State "Wait for ACK or NAK":
  rdt_rcv(rcvpkt) && isNAK(rcvpkt)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && isACK(rcvpkt)
    Λ
    → "Wait for call from above"

Receiver FSM
State "Wait for call from below":
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    udt_send(ACK)


rdt2.0: error scenario

[Figure: the rdt2.0 sender and receiver FSMs shown again. In the error scenario the receiver sees rdt_rcv(rcvpkt) && corrupt(rcvpkt) and replies with udt_send(NAK); the sender, on rdt_rcv(rcvpkt) && isNAK(rcvpkt), retransmits with udt_send(sndpkt).]


rdt2.0 has a fatal flaw

What happens if ACK/NAK is corrupted?
sender doesn't know what happened at receiver
can't just retransmit: possible duplicate. But receiver is waiting.

What to do?
sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK is corrupted?
retransmit, but this might cause retransmission of a correctly received pkt. Receiver won't know about duplication.

Handling duplicates:
sender adds sequence number (0, 1) to each pkt
sender retransmits current pkt if ACK/NAK garbled
receiver discards (doesn't deliver up) duplicate pkt
Duplicate packet is one with same sequence number as previous packet

stop and wait:
Sender sends one packet, then waits for receiver response


Sender: whenever sender receives a control message it sends a packet to receiver
A valid ACK: sends next packet (if one exists) with new sequence number
A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
If received packet is corrupt: send NAK
If received packet is valid and has a different sequence number from the previous packet: send ACK and deliver new data up
If received packet is valid and has the same sequence number as the previous packet, i.e., is a retransmitted duplicate: send ACK

Note: ACK/NAK do not contain sequence number
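The sender and receiver rules above can be exercised with a small, deterministic simulation. This is only a sketch: the function name is hypothetical, and the channel is modeled by two sets of transmission indices to garble (one for data packets, one for the ACK/NAK sent in reply), which is an assumption made so that the run is reproducible.

```python
def run_stop_and_wait(messages, corrupt_data=(), corrupt_ctrl=()):
    """Simulate stop-and-wait with a 1-bit sequence number.

    corrupt_data: indices (0-based, counting every data transmission) whose
                  data packet arrives corrupted.
    corrupt_ctrl: indices of transmissions whose ACK/NAK reply is garbled.
    Returns (delivered messages, total number of data transmissions).
    """
    delivered = []
    last_seq = None   # sequence bit of the last packet the receiver accepted
    seq = 0           # sender's current sequence bit
    sent = 0          # data transmissions so far
    for data in messages:
        while True:
            data_ok = sent not in corrupt_data
            ctrl_ok = sent not in corrupt_ctrl
            sent += 1
            # Receiver: NAK corrupt packets, ACK valid ones, deliver only new data.
            if not data_ok:
                reply = "NAK"
            else:
                if seq != last_seq:
                    delivered.append(data)
                    last_seq = seq
                reply = "ACK"  # a duplicate is ACKed but not delivered again
            # Sender: move on only after a valid ACK; otherwise resend.
            if ctrl_ok and reply == "ACK":
                break
        seq ^= 1  # next packet uses the other sequence bit
    return delivered, sent


delivered, sent = run_stop_and_wait(["a", "b", "c"], corrupt_data={1}, corrupt_ctrl={3})
print(delivered, sent)
```

In this run the second transmission (packet "b") is corrupted and NAKed, and the ACK for "c" is garbled, so "c" is sent twice; the sequence bit lets the receiver recognize the second copy as a duplicate and deliver each message exactly once.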


rdt2.1: sender, handles garbled ACK/NAKs

State "Wait for call 0 from above":
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    → "Wait for ACK or NAK 0"
State "Wait for ACK or NAK 0":
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    → "Wait for call 1 from above"
State "Wait for call 1 from above":
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    → "Wait for ACK or NAK 1"
State "Wait for ACK or NAK 1":
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    → "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

State "Wait for 0 from below":
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    go to "Wait for 1 from below"
  on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
    stay in "Wait for 0 from below"
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    stay in "Wait for 0 from below"

State "Wait for 1 from below":
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    go to "Wait for 0 from below"
  on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
    stay in "Wait for 1 from below"
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    stay in "Wait for 1 from below"

rdt2.1: discussion

Sender:
seq # added to pkt
two seq #'s (0, 1) will suffice. Why?
must check if received ACK/NAK is corrupted
twice as many states: state must "remember" whether "current" pkt has seq # 0 or 1

Receiver:
must check if received packet is a duplicate: state indicates whether 0 or 1 is the expected pkt seq #
note: receiver cannot know if its last ACK/NAK was received OK at the sender

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #'s are included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

Sender FSM fragment

State "Wait for call 0 from above":
  on rdt_send(data):
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    go to "Wait for ACK 0"

State "Wait for ACK 0":
  on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)):
    udt_send(sndpkt)
    stay in "Wait for ACK 0"
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0):
    Λ (no action)
    go to the next state (not shown in the fragment)

Receiver FSM fragment

Transition into "Wait for 0 from below" (from the state not shown in the fragment):
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK1, chksum)
    udt_send(sndpkt)

State "Wait for 0 from below":
  on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)):
    udt_send(sndpkt)
    stay in "Wait for 0 from below"

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
sender waits until it is certain that data or ACK was lost, then retransmits
yuck: drawbacks?

Approach: sender waits a "reasonable" amount of time for ACK
retransmits if no ACK received in this time (retransmissions are only triggered by timeouts)
if pkt (or ACK) just delayed (not lost): retransmission will be a duplicate, but use of seq #'s already handles this
receiver must specify seq # of pkt being ACKed
requires countdown timer
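The timeout-based approach can be sketched as a simulation. This is a minimal sketch under assumptions: rdt30_transfer, loss_prob and the random channel are hypothetical names and models, a "timeout" is modelled simply as "no ACK came back", and delayed packets and premature timeouts are not modelled.

```python
import random

def rdt30_transfer(messages, loss_prob=0.3, seed=7):
    """Stop-and-wait with a retransmission timer over a channel that loses packets."""
    rng = random.Random(seed)       # hypothetical lossy channel
    delivered = []
    expected_seq = 0                # receiver state
    timeouts = 0

    for i, data in enumerate(messages):
        seq = i % 2
        acked = False
        while not acked:
            # Sender: udt_send(sndpkt); start_timer
            ack = None
            if rng.random() >= loss_prob:            # data packet survives
                if seq == expected_seq:
                    delivered.append(data)           # new packet: deliver up
                    expected_seq = 1 - expected_seq
                ack = seq                            # ACK carries the seq # received
                if rng.random() < loss_prob:
                    ack = None                       # ACK lost on the way back
            if ack == seq:
                acked = True                         # stop_timer, next packet
            else:
                timeouts += 1                        # timer expired: retransmit
    return delivered, timeouts
```

A lost ACK makes the sender retransmit a packet the receiver already has; the sequence number lets the receiver recognise the duplicate and re-ACK it without delivering it twice.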

rdt3.0 sender

State "Wait for call 0 from above":
  on rdt_send(data):
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    go to "Wait for ACK0"
  on rdt_rcv(rcvpkt):
    Λ (no action)

State "Wait for ACK0":
  on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1)):
    Λ (no action)
  on timeout:
    udt_send(sndpkt)
    start_timer
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0):
    stop_timer
    go to "Wait for call 1 from above"

State "Wait for call 1 from above":
  on rdt_send(data):
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    go to "Wait for ACK1"
  on rdt_rcv(rcvpkt):
    Λ (no action)

State "Wait for ACK1":
  on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0)):
    Λ (no action)
  on timeout:
    udt_send(sndpkt)
    start_timer
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1):
    stop_timer
    go to "Wait for call 0 from above"

rdt3.0 in action (timing diagrams not reproduced)

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

With L = packet length in bits and R = transmission rate in bps:

$$T_{transmit} = \frac{L}{R} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \mu\text{sec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U_sender: utilization, the fraction of time the sender is busy sending
1KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
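The arithmetic can be checked with a few lines of code. This is a sketch only; the function names are hypothetical, and the 30 ms RTT is taken as twice the 15 ms one-way propagation delay.

```python
def transmit_time(packet_bits, rate_bps):
    """T_transmit = L / R, in seconds."""
    return packet_bits / rate_bps

def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """U_sender = (L/R) / (RTT + L/R)."""
    t = transmit_time(packet_bits, rate_bps)
    return t / (rtt_s + t)

def stop_and_wait_throughput(packet_bits, rate_bps, rtt_s):
    """Bytes per second when one packet is sent every RTT + L/R."""
    t = transmit_time(packet_bits, rate_bps)
    return (packet_bits / 8) / (rtt_s + t)

L_BITS = 8000          # 1 KB packet, counted as 8000 bits as on the slide
R_BPS = 1e9            # 1 Gbps link
RTT_S = 0.030          # 2 x 15 ms propagation delay

print(f"{stop_and_wait_utilization(L_BITS, R_BPS, RTT_S):.5f}")       # 0.00027
print(f"{stop_and_wait_throughput(L_BITS, R_BPS, RTT_S):.0f} B/s")
```

The throughput comes out at roughly 33 kB/sec, matching the figure quoted above.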

rdt3.0: stop-and-wait operation
(timing diagram, sender and receiver: first packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet at t = RTT + L/R)

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
range of sequence numbers must be increased
buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g. corrupted, lost, out-of-order data

Two generic approaches to solving this:
go-Back-N protocols
selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization
(timing diagram, sender and receiver: first packet bit transmitted at t = 0; last bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L/R)

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Sending three packets per round trip increases utilization by a factor of 3

Go-Back-N

Sender:
k-bit seq # in pkt header
"window" of up to N consecutive unack'ed pkts allowed
ACK(n): ACKs all pkts up to and including seq # n - "cumulative ACK"
may receive duplicate ACKs (see receiver)
Only one timer, for the oldest unacknowledged pkt
timeout(n): retransmit pkt n and all higher seq # pkts in window
Called a sliding-window protocol

GBN Sender

rdt_send() called: checks to see if window is full
No: send out packet
Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

Timeout: resends ALL packets that have been sent but not yet acknowledged
This is the only event that triggers a resend

GBN: sender extended FSM

Initially:
  base = 1
  nextseqnum = 1

State "Wait":
  on rdt_send(data):
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)

  on timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer

  on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ (no action)
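The sender FSM can be sketched as a small class. This is an illustration under assumptions, not a complete implementation: the class name GBNSender and the send callback (standing in for udt_send) are hypothetical, sequence numbers are unbounded integers rather than k-bit values, the timer is reduced to a boolean flag, and ACKs below base are ignored so that a stale ACK cannot move the window backwards.

```python
class GBNSender:
    """Go-Back-N sender: one window, one timer for the oldest unacked packet."""

    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send                # hypothetical stand-in for udt_send
        self.base = 1                   # oldest unacknowledged seq #
        self.nextseqnum = 1             # next seq # to use
        self.sndpkt = {}                # copies of sent, not yet ACKed packets
        self.timer_running = False

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False                # window full: refuse_data
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True   # start_timer for the first packet in flight
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        if acknum < self.base:
            return                      # duplicate or stale ACK: nothing new
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)  # cumulative ACK frees everything up to acknum
        self.base = acknum + 1
        # stop_timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True       # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.send(self.sndpkt[seq]) # resend every sent-but-unacked packet
```

For example, with N = 3 a fourth call to rdt_send is refused until an ACK slides the window forward.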

GBN: receiver extended FSM

Initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

State "Wait":
  on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

  default:
    udt_send(sndpkt)

If expected packet received: send ACK and deliver packet upstairs

If out-of-order packet received: discard (don't buffer) -> no receiver buffering!
Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends ACK for the last correctly received packet with the highest in-order seq #
Receiver only sends ACKs (no NAKs)
Can generate duplicate ACKs
need only remember expectedseqnum
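A sketch of this receiver behaviour, assuming unbounded integer sequence numbers starting at 1 (the class name GBNReceiver and the corrupt flag are hypothetical, used here in place of a real checksum test):

```python
class GBNReceiver:
    """Go-Back-N receiver: the only state it keeps is expectedseqnum."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []             # data handed to the upper layer

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the seq # carried in the ACK sent back."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data) # in-order packet: deliver it
            self.expectedseqnum += 1
        # Otherwise the packet is discarded (no buffering).
        # Either way, ACK the highest in-order seq # received so far.
        return self.expectedseqnum - 1
```

An out-of-order packet therefore triggers a duplicate ACK rather than a NAK.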

GBN in action

                                                      GBN is easy to code but might have performance problems

                                                      In particular if many packets are in pipeline at one time (bandwidth-delay product large) then one error can force retransmission of huge amounts of data

                                                      Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets


                                                      Selective Repeat

                                                      receiver individually acknowledges all correctly received pkts

                                                      buffers pkts as needed for eventual in-order delivery to upper layer

                                                      sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
Compare to GBN, which only had a timer for the base packet

sender window
N consecutive seq #'s
again limits seq #'s of sent, unACKed pkts
Important: window size < seq # range

Selective repeat: sender, receiver windows (figure not reproduced)

Selective repeat

sender

data from above:
  if next available seq # in window, send pkt
timeout(n):
  resend pkt n, restart timer
ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a reACK)
otherwise:
  ignore
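The receiver rules can be sketched as follows. This is a minimal sketch assuming unbounded integer sequence numbers starting at 0; the class name SRReceiver and the method on_packet are hypothetical.

```python
class SRReceiver:
    """Selective repeat receiver: ACKs each packet individually and buffers out-of-order ones."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0                # lowest seq # not yet received
        self.buffer = {}                # out-of-order packets waiting for delivery
        self.delivered = []             # data handed to the upper layer, in order

    def on_packet(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase and advance the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n                    # send ACK(n)
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n                    # already delivered: reACK
        return None                     # otherwise: ignore
```

The reACK case matters: if the original ACK was lost, the sender's window cannot advance until the retransmitted packet is acknowledged again.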

Selective repeat in action (figure not reproduced)

Selective repeat: dilemma

Example:
seq #'s: 0, 1, 2, 3
window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
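One way to explore the question is to check when the dilemma can arise. The sketch below assumes the worst case from the example: the receiver has accepted a full window and advanced, but every ACK was lost, so the sender retransmits the old window. The function name sr_window_is_safe is hypothetical. The check reproduces the standard result that the window size must be at most half the sequence-number space.

```python
def sr_window_is_safe(seq_space, window):
    """True if a retransmitted old window can never be mistaken for new data."""
    # Seq #'s the sender may retransmit if all ACKs for the first window were lost.
    old_window = {n % seq_space for n in range(0, window)}
    # Seq #'s the receiver now accepts as new after advancing by one full window.
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return old_window.isdisjoint(new_window)

print(sr_window_is_safe(4, 3))   # False: the example above is ambiguous
print(sr_window_is_safe(4, 2))   # True
```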


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

point-to-point: one sender, one receiver
reliable, in-order byte stream: no "message boundaries"
pipelined: TCP congestion and flow control set window size
send & receive buffers
full duplex data: bi-directional data flow in same connection; MSS: maximum segment size
connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled: sender will not overwhelm receiver

(figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer on the other host, where the application reads data through its socket door)

More TCP Details

Maximum Segment Size (MSS)
Depends upon implementation (can often be set)
The max amount of application-layer data in a segment
Application data + TCP header = TCP segment

Three-way handshake
Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
Server responds with second special TCP segment (again, no payload)
Client responds with third special segment. This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
(i) buffers
(ii) variables, and
(iii) a socket connection to the process

TCP only exists in the two end machines
No buffers or variables are allocated to the connection in any of the network elements between the client and server

TCP segment structure

Each row is 32 bits wide:

source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | receive window
checksum | urg data pointer
options (variable length)
application data (variable length)

Field notes:
sequence number, acknowledgement number: counting by bytes of data (not segments!)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
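The fixed 20-byte part of the header can be unpacked with Python's struct module. This is a sketch for illustration: parse_tcp_header and the dictionary keys are hypothetical names, only the six flag bits shown on the slide are decoded, and options are skipped rather than interpreted.

```python
import struct

# Bit positions of the six flags within the 16-bit "head len / flags" word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

def parse_tcp_header(segment):
    """Decode the fixed TCP header fields from raw segment bytes (network byte order)."""
    (src_port, dst_port, seq, ack,
     off_flags, window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4      # head len is counted in 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if off_flags & bit}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "payload": segment[header_len:],    # application data follows header and options
    }

# Usage: a hand-built segment with made-up port numbers and a one-byte payload.
raw = struct.pack("!HHIIHHHH", 1234, 23, 42, 79, (5 << 12) | 0x18, 65535, 0, 0) + b"C"
hdr = parse_tcp_header(raw)
print(hdr["seq"], hdr["ack"], sorted(hdr["flags"]), hdr["payload"])
```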

TCP seq. #'s and ACKs

Seq. #'s:
byte stream "number" of first byte in segment's data

ACKs:
seq # of next byte expected from other side
cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say - up to implementer

Simple telnet scenario (time runs downward):
Host A -> Host B: Seq=42, ACK=79, data = 'C' (user types 'C')
Host B -> Host A: Seq=79, ACK=43, data = 'C' (host ACKs receipt of 'C', echoes back 'C')
Host A -> Host B: Seq=43, ACK=80 (host ACKs receipt of echoed 'C')
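The numbers in the telnet scenario follow from two rules: a segment's seq # is the number of its first data byte, and its ACK # is the next byte expected from the other side. A small sketch (the helper names make_segment and reply_to are hypothetical) reproduces them:

```python
def make_segment(seq, ack, data=b""):
    """A segment reduced to the three fields that matter here."""
    return {"seq": seq, "ack": ack, "data": data}

def reply_to(segment, my_seq, data=b""):
    """Build a reply whose ACK # is the next byte expected from the other side."""
    next_expected = segment["seq"] + len(segment["data"])
    return make_segment(my_seq, next_expected, data)

a1 = make_segment(42, 79, b"C")                   # A -> B: user types 'C'
b1 = reply_to(a1, my_seq=a1["ack"], data=b"C")    # B -> A: ACKs 'C', echoes it back
a2 = reply_to(b1, my_seq=b1["ack"])               # A -> B: ACKs the echoed 'C'

print(b1["seq"], b1["ack"])   # 79 43
print(a2["seq"], a2["ack"])   # 43 80
```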

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
SampleRTT: measured time from segment transmission until ACK receipt
ignore retransmissions
SampleRTT will vary, want estimated RTT "smoother"
average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
longer than RTT, but RTT varies
too short: premature timeout, unnecessary retransmissions
too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
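A sketch of the update, with made-up sample values in milliseconds (the function name update_estimated_rtt is hypothetical):

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """One EWMA step: EstimatedRTT = (1 - alpha)*EstimatedRTT + alpha*SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt

estimated = 100.0                        # assumed starting estimate, in ms
for sample in [180.0, 100.0, 100.0]:     # one spike followed by normal samples
    estimated = update_estimated_rtt(estimated, sample)
    print(round(estimated, 3))
```

After k further updates a sample's weight in the estimate is α(1 - α)^k, which is why a single spike fades quickly instead of dominating the average.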

Example RTT estimation
(figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds); series: SampleRTT and EstimatedRTT)

TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
large variation in EstimatedRTT -> larger safety margin
first estimate of how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4*DevRTT
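The two estimators and the timeout can be combined in one update step. This is a sketch under an assumption the slide leaves open: DevRTT is updated using the EstimatedRTT from before the new sample is folded in. The function name update_timeout and the sample values are hypothetical.

```python
def update_timeout(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Return the new (EstimatedRTT, DevRTT, TimeoutInterval) after one SampleRTT."""
    # Assumption: deviation is measured against the previous estimate.
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout_interval

# Made-up values in ms: estimate 100, deviation 10, then a 180 ms sample arrives.
print(update_timeout(100.0, 10.0, 180.0))   # (110.0, 27.5, 220.0)
```

A single large sample raises the timeout well above the new estimate, because the deviation term reacts faster (β > α) and is weighted by 4.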

TCP reliable data transfer

- TCP creates an rdt service on top of IP's unreliable service
- Pipelined segments
- Cumulative ACKs
- TCP uses a single retransmission timer
- Retransmissions are triggered by:
  - timeout events
  - duplicate ACKs
- Initially, consider a simplified TCP sender:
  - ignore duplicate ACKs
  - ignore flow control, congestion control

TCP sender events

Data rcvd from app:
- create segment with seq #
- seq # is the byte-stream number of the first data byte in the segment
- start timer if not already running (think of the timer as for the oldest unACKed segment)
- expiration interval: TimeOutInterval

Timeout:
- retransmit the segment that caused the timeout
- restart timer

ACK rcvd:
- if it acknowledges previously unACKed segments:
  - update what is known to be ACKed
  - start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
          start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

  event: ACK received, with ACK field value of y
      if (y > SendBase) {
          SendBase = y
          if (there are currently not-yet-acknowledged segments)
              start timer
      }
} /* end of loop forever */

Comment:
- SendBase-1: last cumulatively ACKed byte
Example:
- SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
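The three events of the simplified sender can be exercised with a small event-driven model. This is a sketch under stated assumptions, not a TCP implementation: the class `SimplifiedSender`, its `sent` log, the boolean `timer_running` that stands in for a real timer, and the guard for a timeout with nothing outstanding are all hypothetical, and sequence-number wraparound is ignored.

```python
class SimplifiedSender:
    """Toy model of the simplified TCP sender (no real timer, no network)."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq # -> data, sent but not yet ACKed
        self.timer_running = False  # stands in for a real retransmission timer
        self.sent = []              # log of (seq #, length) passed to IP

    def on_data(self, data):
        """Event: data received from the application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, len(data)))
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout."""
        if not self.unacked:        # guard added for this sketch
            self.timer_running = False
            return
        seq = min(self.unacked)     # smallest not-yet-acknowledged seq #
        self.sent.append((seq, len(self.unacked[seq])))
        self.timer_running = True   # restart the timer

    def on_ack(self, y):
        """Event: ACK received with ACK field value y."""
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: every segment ending at or before y is covered.
            for seq in [s for s in self.unacked if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sender = SimplifiedSender(92)
sender.on_data(b"a" * 8)    # Seq=92, 8 bytes
sender.on_data(b"b" * 20)   # Seq=100, 20 bytes
sender.on_timeout()         # resends only the oldest segment, Seq=92
sender.on_ack(120)          # cumulative ACK covers both segments
print(sender.sent, sender.send_base)
```

Only the oldest unACKed segment is resent on a timeout, and a single cumulative ACK can clear several outstanding segments at once.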

TCP retransmission scenarios

Lost ACK scenario:
1. Host A sends Seq=92, 8 bytes data.
2. Host B replies ACK=100, but the ACK is lost (X).
3. The timer for Seq=92 times out; Host A retransmits Seq=92, 8 bytes data.
4. Host B replies ACK=100 again; SendBase = 100.

Premature timeout scenario:
1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data.
2. Host B replies ACK=100 and ACK=120.
3. The timer for Seq=92 times out before the ACKs arrive; Host A retransmits Seq=92, 8 bytes data.
4. ACK=100 and ACK=120 arrive at Host A; SendBase = 100, then SendBase = 120.
5. Host B answers the retransmission with ACK=120; SendBase stays at 120.

TCP retransmission scenarios (more)

Cumulative ACK scenario:
1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data.
2. Host B replies ACK=100 and ACK=120; ACK=100 is lost (X).
3. ACK=120 reaches Host A before the timeout; SendBase = 120, so nothing is retransmitted.

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver | TCP receiver action
Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed | Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK
Arrival of in-order segment with expected seq #; one other segment has ACK pending | Immediately send single cumulative ACK, ACKing both in-order segments
Arrival of out-of-order segment with higher-than-expected seq #; gap detected | Immediately send duplicate ACK, indicating seq # of next expected byte
Arrival of segment that partially or completely fills gap | Immediately send ACK, provided that segment starts at lower end of gap
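The four receiver rules can be read as a decision function. The sketch below is illustrative only. The function `ack_action`, its parameters, and the returned strings are hypothetical names, and any case the rules above do not cover (for example, a segment below the expected seq #) raises an error instead of guessing a behavior.

```python
def ack_action(seg_seq, expected_seq, ack_pending, gap_buffered):
    """Pick the receiver action for an arriving segment.

    seg_seq:       seq # of the first byte of the arriving segment
    expected_seq:  next in-order byte the receiver is waiting for
    ack_pending:   True if an earlier in-order segment has a delayed ACK pending
    gap_buffered:  True if out-of-order data beyond a gap is already buffered
    """
    if seg_seq > expected_seq:
        # Out-of-order arrival: a gap is detected.
        return "send duplicate ACK immediately"
    if seg_seq == expected_seq and gap_buffered:
        # Segment starts at the lower end of the gap.
        return "send ACK immediately"
    if seg_seq == expected_seq and ack_pending:
        return "send single cumulative ACK immediately"
    if seg_seq == expected_seq:
        return "delay ACK up to 500 ms"
    raise ValueError("case not covered by the rules above")


print(ack_action(100, 100, ack_pending=False, gap_buffered=False))
print(ack_action(120, 100, ack_pending=False, gap_buffered=False))
```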

More on Sender Policies

Doubling the Timeout Interval
- Used by most TCP implementations
- If a timeout occurs, then after retransmission the Timeout Interval is doubled
- Intervals grow exponentially with each consecutive timeout
- When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
- Limited form of congestion control
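The doubling rule amounts to a small state update. This is a minimal sketch, assuming a hypothetical function `next_timeout_interval` and event names `"timeout"`, `"new_data"` and `"ack"`; the reset uses TimeoutInterval = EstimatedRTT + 4*DevRTT.

```python
def next_timeout_interval(current, event, estimated_rtt, dev_rtt):
    """Return the Timeout Interval to use when the timer is next started."""
    if event == "timeout":
        # Consecutive timeouts double the interval each time.
        return 2 * current
    if event in ("new_data", "ack"):
        # Back to the value derived from the RTT estimates.
        return estimated_rtt + 4 * dev_rtt
    raise ValueError(f"unknown event: {event}")


interval = 1.0
for event in ["timeout", "timeout", "timeout", "ack"]:
    interval = next_timeout_interval(interval, event, estimated_rtt=0.5, dev_rtt=0.125)
    print(event, interval)
```

Three consecutive timeouts take the interval from 1 s to 8 s, and the first ACK brings it back to the estimate-based value of 1 s.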

Fast Retransmit

- Time-out period is often relatively long:
  - long delay before resending lost packet
- Detect lost segments via duplicate ACKs:
  - sender often sends many segments back-to-back
  - if a segment is lost, there will likely be many duplicate ACKs
- If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  - fast retransmit: resend the segment before the timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {   /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y   /* fast retransmit */
        }
    }
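The duplicate-ACK branch can be checked with a short model. This sketch covers only the ACK event; the timer and the actual retransmission are left out, and the class `FastRetransmitSender` with its `retransmitted` log is a hypothetical name used for illustration.

```python
from collections import Counter


class FastRetransmitSender:
    """Tracks SendBase and duplicate-ACK counts for the ACK event only."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()   # ACK value y -> duplicate ACKs seen for y
        self.retransmitted = []     # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data is acknowledged.
            self.send_base = y
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3:
                self.retransmitted.append(y)   # fast retransmit


sender = FastRetransmitSender(send_base=92)
for y in [100, 100, 100, 100, 100]:
    sender.on_ack(y)
print(sender.send_base, sender.retransmitted)
```

The first ACK=100 advances SendBase; the third duplicate after it triggers exactly one resend of the segment starting at byte 100, and the fourth duplicate triggers nothing further.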

TCP: GBN or Selective Repeat?

                                                      Basic TCP looks a lot like GBN

                                                      Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                      This looks a lot like Selective Repeat

                                                      TCP is a hybrid

TCP Flow Control

- Sender should not overwhelm the receiver's capacity to receive data
- If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
- Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

- Receive side of a TCP connection has a receive buffer
- App process may be slow at reading from the buffer
- Flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast
- Speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout, 32 bits wide]
- source port #, dest port #
- sequence number, acknowledgement number: counting by bytes of data (not segments)
- head len, not used, flag bits U A P R S F, Receive window (# bytes rcvr willing to accept)
  - URG: urgent data (generally not used)
  - ACK: ACK # valid
  - PSH: push data now (generally not used)
  - RST, SYN, FIN: connection estab (setup, teardown commands)
- checksum (Internet checksum, as in UDP), Urg data pnter
- Options (variable length)
- application data (variable length)

TCP Flow control: how it works

(Suppose the TCP receiver discards out-of-order segments)

- spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including the value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
  - guarantees the receive buffer doesn't overflow
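Both sides of the rule fit in two small functions. This is a sketch with hypothetical function names (`rcv_window`, `can_send`); the sender-side quantities LastByteSent and LastByteAcked are assumed to be tracked by the sender, so that unACKed data = LastByteSent - LastByteAcked.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(last_byte_sent, last_byte_acked, n_bytes, advertised_window):
    """True if sending n_bytes more keeps unACKed data within RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_bytes <= advertised_window


# Receiver: 4096-byte buffer, 3000 bytes received so far, app has read 1000.
window = rcv_window(4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)
# Sender: 1000 bytes already in flight, wants to send another 1000.
print(can_send(5000, 4000, 1000, window))
```

With 2000 bytes still unread in a 4096-byte buffer the advertised window is 2096 bytes, so a sender with 1000 bytes in flight may send another 1000 but not another 1100.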

Technical Issue

- Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
- Sender does not transmit new packets until it hears RcvWindow > 0
- Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
- DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
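A toy simulation shows how the one-byte probes break the deadlock. Everything here is a simplifying assumption for illustration: the function `zero_window_probe`, the list `drains` (bytes the application reads before each probe arrives), and the rule that a probe byte is buffered only if there is room.

```python
def zero_window_probe(rcv_buffer, buffered, drains):
    """Send one-byte probes until the receiver advertises RcvWindow > 0.

    rcv_buffer: size of the receive buffer in bytes
    buffered:   bytes currently waiting in the buffer
    drains:     drains[i] = bytes the app reads before probe i+1 arrives
    Returns (probes_sent, advertised_window).
    """
    for probes, drained in enumerate(drains, start=1):
        buffered -= min(drained, buffered)   # app reads from the buffer
        if buffered < rcv_buffer:
            buffered += 1                    # probe byte fits and is buffered
        window = rcv_buffer - buffered       # carried back in the probe's ACK
        if window > 0:
            return probes, window
    return len(drains), 0


# Buffer is full; the app reads nothing for two probe intervals, then 4 bytes.
print(zero_window_probe(rcv_buffer=10, buffered=10, drains=[0, 0, 4]))
```

Each probe gives the receiver a reason to send an ACK, and that ACK is what carries the updated RcvWindow back to the sender.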

Note on UDP

- UDP has no flow control
- UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost

TCP Connection Management

Recall: TCP sender, receiver establish a "connection" before exchanging data segments
- initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
- server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
- specifies client_isn, the initial seq #
- no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
- ACKs received SYN
- allocates buffers
- replies with client_isn+1 in ACK field to signal synchronization
- specifies server_isn
- no application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1
- allocates buffers
- can include application data
- SYN=0 signals that the connection is established
- server_isn+1 signals that the client is synchronized

Message sequence:
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
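The numbering in the three handshake segments can be generated mechanically. This is a minimal sketch that models the segments as dictionaries; the function `three_way_handshake` and the example ISN values 1000 and 5000 are assumptions, and no real sockets are involved.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries of header fields."""
    # Step 1: client -> server, connection request.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server -> client, connection granted; ACKs the SYN.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client -> server, ACKs the SYNACK; SYN is now 0.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm they agree on where the byte stream starts.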

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies with ACK, then sends FIN (close); client replies with ACK and enters timed wait; connection closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
- enters "timed wait", during which it will respond with an ACK to received FINs (which might arrive if the ACK gets lost)
- closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, can handle simultaneous FINs.

[Figure: client (closing) sends FIN; server replies with ACK, then (closing) sends FIN; client replies with ACK and enters timed wait, then closed; server closed on receiving the ACK]

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                      A few special cases

                                                      Have not discussed what happens if both client and server decide to close down connection at same time

                                                      It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for the network to handle"
- different from flow control
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem

Causes/costs of congestion: scenario 1

- two senders, two receivers
- one router, infinite buffers
- no retransmission
- send rate: 0 to C/2

- large delays when congested
- maximum achievable throughput

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet

(a), (b) & (c) always: λ_in = λ_out (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'_in > λ_out
(c) retransmission of delayed (not lost) packet makes λ'_in larger (than perfect case) for same λ_out

(λ'_in: offered load, i.e. λ_in plus retransmitted data)

"costs" of congestion:
- (b) and (c): more work (retrans) for given "goodput"
- (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

- four senders
- multihop paths
- timeout/retransmit

Q: what happens as λ_in and λ'_in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
- when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
- "elastic service"
- if sender's path "underloaded":
  - sender should use available bandwidth
- if sender's path congested:
  - sender throttled to minimum guaranteed rate

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

Case study: ATM ABR congestion control (cont)

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
- EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
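The ER and EFCI rules can be combined into one small function. This is an illustrative sketch only: the function `return_rm_cell`, the list `switch_rates` (the rate each switch on the path can support, in arbitrary units) and the dictionary form of the RM cell are hypothetical and do not follow the real ATM cell format.

```python
def return_rm_cell(er, switch_rates, preceding_data_cell_efci):
    """Model one RM cell travelling source -> destination -> source.

    er:                        ER value written by the sender
    switch_rates:              supportable rate at each switch on the path
    preceding_data_cell_efci:  EFCI bit of the data cell just before the RM cell
    """
    for rate in switch_rates:
        # A switch may lower the ER value, never raise it.
        er = min(er, rate)
    # The destination sets CI=1 if the preceding data cell carried EFCI=1.
    ci = 1 if preceding_data_cell_efci == 1 else 0
    return {"ER": er, "CI": ci}


print(return_rm_cell(er=100, switch_rates=[80, 120, 60], preceding_data_cell_efci=1))
```

The returned ER is the minimum supportable rate along the path, which is the rate the sender should then use.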

TCP Congestion Control

- end-end control (no network assistance)
- transmission rate limited by congestion window size, CongWin, over segments
- CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

- w segments, each with MSS bytes, sent in one RTT:

    throughput = (w * MSS) / RTT  bytes/sec
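The throughput formula is easy to check numerically. A minimal sketch, assuming a hypothetical function name and example values (w = 10 segments, MSS = 1460 bytes, RTT = 100 ms) that are not taken from the slides:

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_sec):
    """Throughput when w segments of MSS bytes are sent in one RTT."""
    return w * mss_bytes / rtt_sec


rate = throughput_bytes_per_sec(w=10, mss_bytes=1460, rtt_sec=0.1)
print(f"{rate:.0f} bytes/sec = {rate * 8 / 1e6:.3f} Mbps")
```

Doubling the window or halving the RTT doubles the throughput, which is why the sender controls its rate by adjusting the window.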

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

    LastByteSent - LastByteAcked ≤ CongWin

How does the sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after a loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 MSS and then grows exponentially
- conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
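The AIMD rule can be traced round by round. The sketch below is a simplified illustration, not the slides' own code: it counts CongWin in whole MSS units, treats one loop iteration as one RTT, and takes the set of rounds in which a loss occurs (`loss_rounds`) as a made-up input.

```python
def aimd_trace(start_mss, loss_rounds, rounds):
    """Return CongWin (in MSS) at the start of each RTT round."""
    cong_win = start_mss
    trace = []
    for rnd in range(rounds):
        trace.append(cong_win)
        if rnd in loss_rounds:
            cong_win = max(1, cong_win // 2)   # multiplicative decrease: halve
        else:
            cong_win += 1                      # additive increase: +1 MSS per RTT
    return trace


# Hypothetical run: start at 8 MSS, losses in rounds 4 and 9.
print(aimd_trace(8, {4, 9}, 12))
# [8, 9, 10, 11, 12, 6, 7, 8, 9, 10, 5, 6]
```

The printed trace shows the sawtooth shape: linear climbs interrupted by halvings.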

TCP Slow Start

When connection begins, CongWin = 1 MSS
Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
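The 20 kbps figure in the example follows from sending one MSS per RTT. As a worked check, taking 1 byte = 8 bits:

$$\frac{MSS}{RTT} = \frac{500 \times 8\ \text{bits}}{0.2\ \text{s}} = 20{,}000\ \text{bps} = 20\ \text{kbps}$$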

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
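The link between "increment CongWin for every ACK" and "double CongWin every RTT" can be checked with a short simulation. This is a sketch under assumptions the slides do not spell out: every segment is ACKed individually (no delayed ACKs), nothing is lost, and CongWin is counted in MSS units.

```python
def slow_start_rounds(rounds):
    """CongWin (in MSS) at the start of each RTT, growing by 1 MSS per ACK."""
    cong_win = 1                     # connection begins with CongWin = 1 MSS
    windows = []
    for _ in range(rounds):
        windows.append(cong_win)
        acks = cong_win              # one ACK comes back per segment sent this RTT
        for _ in range(acks):
            cong_win += 1            # per-ACK increment of 1 MSS
    return windows


print(slow_start_rounds(4))
# [1, 2, 4, 8]
```

Each round returns as many ACKs as segments sent, so the per-ACK increment adds a full window's worth and the window doubles.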

So far:
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno):
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when it reaches threshold, and then switches to AIMD
Two different types of loss events:
• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
• Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

Reason for treating 3 dup ACKs differently from a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery
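The difference between Tahoe and Reno comes down to how CongWin is reset after each kind of loss event. The sketch below illustrates that difference only; the function name `on_loss` and the event labels `"timeout"` and `"3dupack"` are invented here, and windows are counted in MSS units.

```python
def on_loss(cong_win, event, variant="reno"):
    """Return (threshold, cong_win) in MSS units after a loss event."""
    if event not in ("timeout", "3dupack"):
        raise ValueError("event must be 'timeout' or '3dupack'")
    threshold = cong_win / 2                   # both variants halve the threshold
    if event == "3dupack" and variant == "reno":
        return threshold, max(threshold, 1)    # Reno: skip slow start (fast recovery)
    return threshold, 1                        # timeout, or any loss in Tahoe: back to 1 MSS


print(on_loss(16, "3dupack", "reno"))    # (8.0, 8.0)
print(on_loss(16, "3dupack", "tahoe"))   # (8.0, 1)
print(on_loss(16, "timeout", "reno"))    # (8.0, 1)
```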

Summary: TCP Congestion Control

When CongWin is below Threshold, sender in slow-start phase, window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly

When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold (only in TCP Reno)

When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                                                      The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
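The event table can be read as a small state machine. The class below is a minimal sketch of it, not an actual TCP implementation: the name `RenoSender`, the method names, and the default initial threshold of 64 MSS are assumptions made for the example (the slides only say the threshold starts "very large"), and windows are counted in units of one MSS.

```python
MSS = 1.0  # work in units of one MSS


class RenoSender:
    def __init__(self, threshold=64.0):       # assumed initial threshold
        self.cong_win = MSS
        self.threshold = threshold
        self.state = "SS"                      # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS               # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)   # about +1 MSS per RTT

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)       # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```

With a threshold of 4 MSS, the fourth new ACK pushes CongWin past the threshold and the sender switches from slow start to congestion avoidance.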

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
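The 0.75 factor can be checked with one line of arithmetic. Assuming the window climbs linearly from W/2 back to W between loss events (the AIMD sawtooth), the time-average throughput is the midpoint of the two extremes:

$$\text{average throughput} = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right) = \frac{3}{4}\cdot\frac{W}{RTT}$$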

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

→ L = 2·10^-10. Wow!
New versions of TCP for high-speed needed
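Both numbers in the example can be reproduced from the stated inputs. The sketch below assumes the throughput relation 1.22 · MSS / (RTT · sqrt(L)) with MSS expressed in bits, and solves it for L; the function names are invented for this illustration.

```python
def window_for_throughput(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target rate."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def loss_rate_for_throughput(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


W = window_for_throughput(10e9, 0.1, 1500)
L = loss_rate_for_throughput(10e9, 0.1, 1500)
print(int(W))          # 83333
print(f"{L:.1e}")      # 2.1e-10
```

A loss rate near 2·10^-10 means roughly one lost segment in five billion, which is why the slide calls for new TCP versions at these speeds.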

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal-bandwidth-share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
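The convergence toward the equal-share line can be simulated directly. The sketch below rests on simplifying assumptions that are not in the slides: both connections have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units with an additive step of 1 per round.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Return the two flows' rates after the given number of rounds."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2      # loss: both halve (multiplicative decrease)
        else:
            x1, x2 = x1 + 1, x2 + 1      # no loss: both add 1 (additive increase)
    return x1, x2


# Hypothetical start: one flow at rate 1, the other at 60, capacity 100.
x1, x2 = aimd_two_flows(1.0, 60.0, 100.0, 500)
print(abs(x1 - x2) < 0.01)   # True
```

Additive increase leaves the gap between the flows unchanged, while every halving cuts it in half, so the gap shrinks toward zero.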

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
do not want rate throttled by congestion control
Instead use UDP: pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets R/2
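The parallel-connection example is a per-connection fair-share calculation. A minimal sketch, assuming every connection on the link ends up with an equal share of R (the function name `app_share` is made up for this example):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

The second result, 11/20, is what the slide rounds to R/2.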

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
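The two fixed-window cases can be folded into one function. This is a sketch under one stated assumption: the slides do not define K for the fixed-window case, so here K is taken to be the number of W-segment windows needed to cover the object, ceil(O / (W·S)). The function name and the sample numbers are invented for the example.

```python
import math


def fixed_window_latency(O, S, R, rtt, W):
    """Latency in seconds; O and S in bits, R in bits/sec, W in segments."""
    base = 2 * rtt + O / R
    stall = S / R + rtt - W * S / R      # idle time per window; <= 0 in the first case
    if stall <= 0:
        return base                      # ACKs return before the window is exhausted
    K = math.ceil(O / (W * S))           # assumed: windows that cover the object
    return base + (K - 1) * stall


# Hypothetical link: 1 Mbps, RTT 100 msec, 8000-bit segments, 80000-bit object.
print(round(fixed_window_latency(80000, 8000, 1e6, 0.1, 20), 3))   # 0.28
print(round(fixed_window_latency(80000, 8000, 1e6, 0.1, 4), 3))    # 0.432
```

With W = 20 the sender never stalls, so only the 2 RTT setup and the transmission time remain; with W = 4 it stalls after each of the first K-1 = 2 windows.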

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times
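The slow-start latency model can be evaluated numerically. The sketch below is an illustration under stated assumptions: K is found by adding up windows of 1, 2, 4, ... segments until the object is covered; Q is found by counting the windows k for which the idle time S/R + RTT - 2^(k-1)·S/R is still positive; and the link parameters (800 kbps, RTT 20 msec, 8000-bit segments) are made up so that O/S = 15 gives K = 4 and Q = 2, as in the example.

```python
def slow_start_latency(O, S, R, rtt):
    """Return (K, Q, P, latency); O and S in bits, R in bits/sec, times in seconds."""
    if O <= 0:
        raise ValueError("object size must be positive")
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object
    K, covered = 0, 0
    while covered * S < O:
        covered += 2 ** K
        K += 1
    # Q: number of windows after which the server would idle for an infinite object
    Q = 0
    while S / R + rtt - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


K, Q, P, latency = slow_start_latency(O=15 * 8000, S=8000, R=800_000, rtt=0.02)
print(K, Q, P, round(latency, 3))   # 4 2 2 0.22
```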

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar
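The closed form for K can be compared against the definition it was derived from. This is only a sanity-check sketch with invented function names; floating-point logarithms can misround when O/S + 1 is extremely close to a power of two, so the summation version is the safer of the two in practice.

```python
import math


def windows_by_sum(O, S):
    """Smallest k with 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k, covered = 0, 0
    while covered * S < O:
        covered += 2 ** k
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


for segments in (15, 16, 100):
    O, S = segments * 8000, 8000
    print(segments, windows_by_sum(O, S), windows_closed_form(O, S))
```

For O/S = 15 both give K = 4; one extra segment (O/S = 16) already needs a fifth window.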

HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
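The three response-time formulas can be compared side by side. The sketch below deliberately leaves out the "sum of idle times" term, so it gives a lower bound on each response time rather than the full model; the function name is made up, and 5 Kbytes is taken as 40,000 bits, which is an assumption about the unit.

```python
def http_response_times(O, R, rtt, M, X):
    """Response times in seconds, excluding slow-start idle times."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R                 # base page plus M images
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * rtt,
    }


# Hypothetical inputs: O = 40000 bits, R = 1 Mbps, RTT = 100 msec, M = 10, X = 5.
for name, seconds in http_response_times(40000, 1e6, 0.1, 10, 5).items():
    print(name, round(seconds, 2))
```

At this rate the RTT terms dominate, so persistent connections come out clearly ahead of the serial non-persistent case.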

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time
Persistent connections only give minor improvement over parallel connections

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time dominated by TCP establishment & slow start delays
Persistent connections now give important improvement, particularly in high delay·bandwidth networks

Chapter 3: Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"


rdt2.0: error scenario

Sender FSM
  State "Wait for call from above"
    rdt_send(data)
      sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
      -> "Wait for ACK or NAK"
  State "Wait for ACK or NAK"
    rdt_rcv(rcvpkt) && isNAK(rcvpkt)
      udt_send(sndpkt)
      -> stay in "Wait for ACK or NAK"
    rdt_rcv(rcvpkt) && isACK(rcvpkt)
      Λ
      -> "Wait for call from above"

Receiver FSM
  State "Wait for call from below"
    rdt_rcv(rcvpkt) && corrupt(rcvpkt)
      udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
      extract(rcvpkt, data); deliver_data(data); udt_send(ACK)

rdt2.0 has a fatal flaw!

What happens if ACK/NAK is corrupted?
• sender doesn't know what happened at receiver!
• can't just retransmit: possible duplicate. But receiver is waiting!

What to do?
• sender ACKs/NAKs receiver's ACK/NAK? What if sender's ACK/NAK is corrupted?
• retransmit, but this might cause retransmission of a correctly received pkt. Receiver won't know about duplication!

Handling duplicates:
• sender adds sequence number (0, 1) to each pkt
• sender retransmits current pkt if ACK/NAK garbled
• receiver discards (doesn't deliver up) duplicate pkt. A duplicate packet is one with the same sequence # as the previous packet.

Stop and wait: sender sends one packet, then waits for receiver response

Sender: whenever sender receives a control message, it sends a packet to receiver
• A valid ACK: sends next packet (if one exists) with new sequence #
• A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
• If received packet is corrupt: send NAK
• If received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
• If received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmission or duplicate: send ACK

Note: ACK/NAK do not contain sequence #
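The receiver rules above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: corruption is passed in as a flag rather than detected by a checksum, and the class name `Rdt21Receiver` and its fields are hypothetical, not part of the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Rdt21Receiver:
    """Alternating-bit receiver: delivers each packet once, re-ACKs duplicates."""
    expected_seq: int = 0
    delivered: list = field(default_factory=list)

    def on_packet(self, seq: int, data: str, corrupt: bool) -> str:
        if corrupt:
            return "NAK"                  # garbled packet: ask for a resend
        if seq == self.expected_seq:
            self.delivered.append(data)   # new data: deliver up
            self.expected_seq ^= 1        # flip expected seq # between 0 and 1
        # a valid duplicate (old seq #) is re-ACKed but not delivered again
        return "ACK"


rx = Rdt21Receiver()
replies = [
    rx.on_packet(0, "a", corrupt=False),  # new packet
    rx.on_packet(0, "a", corrupt=False),  # retransmission after a garbled ACK
    rx.on_packet(1, "b", corrupt=True),   # corrupted packet
    rx.on_packet(1, "b", corrupt=False),  # clean resend
]
print(replies, rx.delivered)
```

The second call shows the point of the sequence number: the duplicate is acknowledged so the sender can move on, but its data is not delivered twice.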

rdt2.1: sender, handles garbled ACK/NAKs

State "Wait for call 0 from above"
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
    -> "Wait for ACK or NAK 0"
State "Wait for ACK or NAK 0"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
    -> stay in "Wait for ACK or NAK 0"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> "Wait for call 1 from above"
State "Wait for call 1 from above"
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt)
    -> "Wait for ACK or NAK 1"
State "Wait for ACK or NAK 1"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
    -> stay in "Wait for ACK or NAK 1"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> "Wait for call 0 from above"

rdt2.1: receiver, handles garbled ACK/NAKs

State "Wait for 0 from below"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    -> "Wait for 1 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
State "Wait for 1 from below"
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    -> "Wait for 0 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq #'s (0, 1) will suffice. Why?
• must check if received ACK/NAK is corrupted
• twice as many states: state must "remember" whether "current" pkt has 0 or 1 seq #

Receiver:
• must check if received packet is a duplicate: state indicates whether 0 or 1 is the expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK; receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s were included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt
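A small Python sketch of the sender side of this idea follows. It assumes a corruption flag instead of a real checksum, and the names `Rdt22Sender`, `sent`, and `on_ack` are hypothetical; it only illustrates that an ACK carrying the wrong seq # is handled exactly like a NAK.

```python
class Rdt22Sender:
    """NAK-free stop-and-wait sender: ACKs carry the seq # they acknowledge."""

    def __init__(self):
        self.seq = 0          # seq # of the packet currently in flight
        self.current = None   # (seq, data) awaiting acknowledgement
        self.sent = []        # log of every transmission, including resends

    def send(self, data):
        self.current = (self.seq, data)
        self.sent.append(self.current)

    def on_ack(self, ack_seq, corrupt=False):
        """Return True once the in-flight packet is acknowledged."""
        if corrupt or ack_seq != self.seq:
            # garbled or duplicate ACK: same action as a NAK, resend
            self.sent.append(self.current)
            return False
        self.seq ^= 1         # move on to the other seq #
        self.current = None
        return True


s = Rdt22Sender()
s.send("a")
first = s.on_ack(1)    # ACK for the previous packet: treated like a NAK
second = s.on_ack(0)   # ACK for the current packet
print(first, second, s.sent)
```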

rdt2.2: sender, receiver fragments

Sender FSM fragment
  State "Wait for call 0 from above"
    rdt_send(data)
      sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
      -> "Wait for ACK 0"
  State "Wait for ACK 0"
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
      udt_send(sndpkt)
      -> stay in "Wait for ACK 0"
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
      Λ
      -> next state

Receiver FSM fragment
  Transition into "Wait for 0 from below"
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
      extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
  State "Wait for 0 from below"
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
      udt_send(sndpkt)
      -> stay in "Wait for 0 from below"

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until it is certain that data or ACK was lost, then retransmits
• yuck: drawbacks?

Approach: sender waits a "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions are only triggered by timeouts)
• if pkt (or ACK) is just delayed (not lost): retransmission will be a duplicate, but use of seq #'s already handles this; receiver must specify seq # of pkt being ACKed
• requires countdown timer
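The timeout approach can be illustrated with a deterministic toy simulation in Python. Everything here is an assumption made for the sketch: the function name `stop_and_wait`, the scripted event strings ('ok', 'lose_data', 'lose_ack'), and the simplification that a timeout always fires when a packet or its ACK is lost.

```python
def stop_and_wait(messages, channel_events):
    """Toy rdt3.0 run over a scripted lossy channel.

    channel_events supplies one of 'ok', 'lose_data', 'lose_ack' per
    transmission; once exhausted, the channel behaves as 'ok'.
    Returns (data delivered to the application, number of transmissions).
    """
    events = iter(channel_events)
    delivered, transmissions = [], 0
    expected = 0                      # receiver: seq # it expects next
    for i, data in enumerate(messages):
        seq = i % 2                   # sender alternates seq # 0, 1
        acked = False
        while not acked:
            transmissions += 1
            event = next(events, "ok")
            if event == "lose_data":
                continue              # nothing arrives; sender times out, resends
            if seq == expected:       # receiver: new packet, deliver it
                delivered.append(data)
                expected ^= 1
            # otherwise it is a duplicate: receiver re-ACKs without delivering
            if event == "lose_ack":
                continue              # ACK lost; sender times out, resends
            acked = True
    return delivered, transmissions


result = stop_and_wait(["a", "b", "c"],
                       ["lose_data", "ok", "lose_ack", "ok", "ok"])
print(result)
```

In the run above, the lost ACK for "b" causes a duplicate transmission, and the receiver's seq # check keeps "b" from being delivered twice.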

rdt3.0 sender

State "Wait for call 0 from above"
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer
    -> "Wait for ACK0"
  rdt_rcv(rcvpkt)
    Λ
State "Wait for ACK0"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    Λ
  timeout
    udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    stop_timer
    -> "Wait for call 1 from above"
State "Wait for call 1 from above"
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer
    -> "Wait for ACK1"
  rdt_rcv(rcvpkt)
    Λ
State "Wait for ACK1"
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    Λ
  timeout
    udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    stop_timer
    -> "Wait for call 0 from above"

[Figures: rdt3.0 in action]

Performance of rdt3.0

rdt3.0 works, but performance stinks
Example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet:

$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \mu\text{s}$

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008\ \text{ms}}{30.008\ \text{ms}} = 0.00027$

• U_sender: utilization, the fraction of time the sender is busy sending
• 1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
• network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation
[Timing diagram, sender and receiver: first packet bit transmitted, t = 0; last packet bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R]

Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Advantage: much better bandwidth utilization than stop-and-wait
Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data
Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols
Note: TCP is not exactly either

Pipelining: increased utilization
[Timing diagram, sender and receiver: first packet bit transmitted, t = 0; last bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R]

$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024\ \text{ms}}{30.008\ \text{ms}} = 0.0008$

Increases utilization by a factor of 3!
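The utilization arithmetic for stop-and-wait and for a 3-packet pipeline can be checked with a short Python calculation. The function name `sender_utilization` and its parameters are illustrative; the cap at 1.0 is an added assumption so that very large windows do not report more than full utilization.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy: window * (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps            # L/R, in seconds
    # cap at 1.0: the link cannot be more than fully utilized
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


# Example from the slides: 1 KB (8000-bit) packets, 1 Gbps link, RTT = 30 ms
u_stop_and_wait = sender_utilization(8000, 1e9, 0.030)
u_pipelined = sender_utilization(8000, 1e9, 0.030, window=3)
print(f"{u_stop_and_wait:.5f} {u_pipelined:.5f}")
```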

Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK"); may receive duplicate ACKs (see receiver)
• only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• called a sliding-window protocol

GBN: Sender

rdt_send() called: checks to see if window is full
• No: send out packet
• Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer (or stops it if no packets remain unacknowledged).

Timeout: resends ALL packets that have been sent but not yet acknowledged
• This is the only event that triggers a resend
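The three sender events can be sketched in Python as below. The variable names `base`, `nextseqnum`, and `sndpkt` follow the slides' FSM; the `wire` log, the boolean `timer_running`, and the guard that ignores ACKs older than `base` are assumptions added for the sketch, and sequence numbers are left unbounded rather than k-bit.

```python
class GBNSender:
    """Go-Back-N sender sketch: one window, one timer, cumulative ACKs."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1               # oldest unacknowledged seq #
        self.nextseqnum = 1         # next seq # to assign
        self.sndpkt = {}            # seq # -> data, kept until ACKed
        self.timer_running = False
        self.wire = []              # hypothetical log of seq #s put on the channel

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse data
        self.sndpkt[self.nextseqnum] = data
        self.wire.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True   # first packet in flight starts the timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        if acknum < self.base:
            return                  # duplicate ACK: nothing new acknowledged
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)  # cumulative: everything up to acknum
        self.base = acknum + 1
        # stop the timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.wire.append(seq)   # resend every sent-but-unACKed packet


s = GBNSender(window_size=3)
accepted = [s.rdt_send(d) for d in "abcd"]   # the fourth call hits a full window
s.on_ack(1)                                  # window slides by one
s.rdt_send("d")
s.on_timeout()                               # go back N: resend 2, 3, 4
print(accepted, s.wire)
```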

GBN: sender extended FSM

Single state "Wait"; initially base = 1, nextseqnum = 1

rdt_send(data)
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  ...
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  Λ

GBN: receiver extended FSM

Single state "Wait"; initially expectedseqnum = 1, sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default
  udt_send(sndpkt)

If expected packet received:
• send ACK and deliver packet upstairs

If out-of-order packet received:
• discard (don't buffer) -> no receiver buffering!
• re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
• receiver only sends ACKs (no NAKs)
• can generate duplicate ACKs
• need only remember expectedseqnum
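A minimal Python sketch of this receiver follows. `expectedseqnum` comes from the slides; the class name, the `delivered` list, and returning the ACK number from `rdt_rcv` are assumptions made for illustration.

```python
class GBNReceiver:
    """Go-Back-N receiver sketch: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)     # in-order packet: deliver up
            self.expectedseqnum += 1
        # otherwise discard; either way ACK the highest in-order seq # so far
        # (0 means nothing has been received in order yet)
        return self.expectedseqnum - 1


r = GBNReceiver()
acks = [
    r.rdt_rcv(1, "a"),   # expected
    r.rdt_rcv(3, "c"),   # out of order: discarded, duplicate ACK 1
    r.rdt_rcv(2, "b"),   # expected
    r.rdt_rcv(3, "c"),   # retransmission now arrives in order
]
print(acks, r.delivered)
```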

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.

Selective Repeat

• receiver individually acknowledges all correctly received pkts; buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received; sender timer for each unACKed pkt. Compare to GBN, which only had a timer for the base packet.
• sender window: N consecutive seq #'s; again limits seq #s of sent, unACKed pkts
• Important: window size < seq # range

[Figure: Selective repeat: sender, receiver windows]

Selective repeat

Sender
• data from above: if next available seq # is in window, send pkt
• timeout(n): resend pkt n, restart timer
• ACK(n) in [sendbase, sendbase+N]: mark pkt n as received; if n is the smallest unACKed pkt, advance window base to next unACKed seq #

Receiver
• pkt n in [rcvbase, rcvbase+N-1]: send ACK(n); out-of-order: buffer; in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
• pkt n in [rcvbase-N, rcvbase-1]: ACK(n) (note: this is a re-ACK)
• otherwise: ignore
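The receiver's three cases can be sketched in Python as follows. This is an illustration under a simplifying assumption: sequence numbers are unbounded integers, so wraparound is not modeled. The class name `SRReceiver` and its fields are hypothetical; `rcvbase` and `N` follow the slides.

```python
class SRReceiver:
    """Selective-repeat receiver sketch with per-packet ACKs and buffering."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}        # seq # -> data received but not yet delivered
        self.delivered = []

    def on_packet(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # deliver the in-order run starting at rcvbase, sliding the window
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n            # already delivered: re-ACK so the sender can advance
        return None             # outside both ranges: ignore


rx = SRReceiver(window_size=4)
acks = [
    rx.on_packet(0, "a"),   # in order: delivered at once
    rx.on_packet(2, "c"),   # out of order: buffered
    rx.on_packet(1, "b"),   # fills the gap: "b" and "c" delivered
    rx.on_packet(0, "a"),   # old packet: re-ACKed
    rx.on_packet(9, "z"),   # far outside the window: ignored
]
print(acks, rx.delivered)
```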

[Figure: Selective repeat in action]

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3
• receiver sees no difference in the two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
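The slide leaves the question open. The standard answer is that the window size must be at most half the sequence-number space, and the short Python check below illustrates why. The function name `is_ambiguous` is hypothetical, and the check assumes the worst case in which the receiver has accepted a full window while every ACK was lost.

```python
def is_ambiguous(seq_space, window):
    """True if a retransmitted old packet can look like a new one.

    Worst case: the sender may still retransmit the old window, seq #s
    0..window-1, while the receiver already expects the next window,
    seq #s window..2*window-1 (mod seq_space). Any overlap between the two
    sets means the receiver cannot tell old from new.
    """
    old_window = {n % seq_space for n in range(0, window)}
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


example = is_ambiguous(seq_space=4, window=3)    # the slide's example
safe = [w for w in range(1, 5) if not is_ambiguous(4, w)]
print(example, safe)
```

With seq #'s 0 to 3, only windows of size 1 and 2 avoid the overlap, which matches the half-the-sequence-space rule.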

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point: one sender, one receiver
• reliable, in-order byte stream: no "message boundaries"
• pipelined: TCP congestion and flow control set window size
• send & receive buffers
• full duplex data: bi-directional data flow in same connection; MSS: maximum segment size
• connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled: sender will not overwhelm receiver

[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the other application reads data through its socket door]

More TCP Details

Maximum Segment Size (MSS)
• Depends upon implementation (can often be set)
• The max amount of application-layer data in a segment
• Application Data + TCP Header = TCP Segment

Three-way Handshake
• Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
• Server responds with second special TCP segment (again, no payload)
• Client responds with third special segment. This can contain payload.

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
(i) buffers,
(ii) variables, and
(iii) a socket connection to process

TCP only exists in the two end machines
• No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, one row per line]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | urg data pointer
  options (variable length)
  application data (variable length)

Annotations:
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  receive window: # bytes rcvr is willing to accept
  checksum: Internet checksum (as in UDP)
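To make the field layout concrete, here is a minimal sketch that packs and unpacks the fixed 20-byte part of a TCP header with Python's standard `struct` module. The helper names `pack_header` and `parse_header` are assumptions for illustration; options are omitted and the checksum is left at zero rather than computed.

```python
import struct

# Fixed header: src port, dst port, seq, ack, (header length + flags), window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def pack_header(src_port, dst_port, seq, ack, flags, window, checksum=0, urg_ptr=0):
    """Build a 20-byte header with no options (header length = 5 32-bit words)."""
    flag_value = sum(FLAG_BITS[name] for name in flags)
    offset_and_flags = (5 << 12) | flag_value
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, checksum, urg_ptr)


def parse_header(raw):
    """Decode the first 20 bytes of a TCP segment into a dict of fields."""
    src, dst, seq, ack, off_flags, window, checksum, urg_ptr = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len_bytes": (off_flags >> 12) * 4,
        "flags": {name for name, bit in FLAG_BITS.items() if off_flags & bit},
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
    }


raw = pack_header(1234, 80, seq=42, ack=79, flags=["ACK", "PSH"], window=4096)
header = parse_header(raw)
```

The `!` in the format string selects network (big-endian) byte order, which is how the 16-bit and 32-bit fields travel on the wire.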

TCP seq. #'s and ACKs

Seq. #'s:
  byte-stream "number" of the first byte in the segment's data
ACKs:
  seq # of the next byte expected from the other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; it is up to the implementer

[Figure: simple telnet scenario between Host A and Host B, time running downward]
  1. User at Host A types 'C'. A -> B: Seq=42, ACK=79, data = 'C'
  2. Host B ACKs receipt of 'C' and echoes back 'C'. B -> A: Seq=79, ACK=43, data = 'C'
  3. Host A ACKs receipt of the echoed 'C'. A -> B: Seq=43, ACK=80
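The numbers in the telnet scenario follow from one rule: the ACK field is the sequence number of the received segment plus the number of data bytes it carried. The sketch below replays the three segments; the helper `next_segment` and the tuple layout `(seq, ack, payload)` are assumptions for illustration only.

```python
def next_segment(received_seq, received_len, my_seq, payload=b""):
    """Build the (seq, ack, payload) reply to a segment that carried `received_len` data bytes."""
    ack = received_seq + received_len  # number of the next byte expected from the peer
    return my_seq, ack, payload


# Host A's first segment: Seq=42, ACK=79, one byte of data.
a1 = (42, 79, b"C")
# Host B ACKs the byte and echoes it back; its own seq is what A said it expects.
b1 = next_segment(a1[0], len(a1[2]), my_seq=a1[1], payload=a1[2])
# Host A ACKs the echo; its own seq has advanced past the byte it sent.
a2 = next_segment(b1[0], len(b1[2]), my_seq=a1[0] + len(a1[2]))
```

Running this gives `b1 == (79, 43, b"C")` and `a2 == (43, 80, b"")`, matching the figure.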

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set the TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

  Exponential weighted moving average
  influence of a past sample decreases exponentially fast
  typical value: α = 0.125
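A short sketch of the moving average, and of why older samples matter less. It assumes the estimate is seeded with the first sample (the slides do not say how it is initialised); the function names are illustrative.

```python
def ewma(samples, alpha=0.125):
    """Run EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT over a list of samples."""
    estimate = samples[0]  # assumption: seed the estimate with the first sample
    for sample in samples[1:]:
        estimate = (1 - alpha) * estimate + alpha * sample
    return estimate


def sample_weight(age, alpha=0.125):
    """Weight carried by a (non-initial) sample that is `age` updates old."""
    return alpha * (1 - alpha) ** age


estimate = ewma([100.0, 180.0, 100.0])          # RTT samples in milliseconds
weights = [sample_weight(age) for age in range(3)]
```

With α = 0.125 the newest sample has weight 0.125, the one before it 0.125 * 0.875, and so on, which is the exponential decay the slide refers to.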

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first, estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set the timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
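The two update rules and the timeout formula fit in a few lines. This is a sketch under stated assumptions: the class name `RttEstimator` is illustrative, DevRTT is seeded with half of the first sample and is updated before EstimatedRTT (both choices follow RFC 6298; the slides do not specify either).

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval from them."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumed seed value (as in RFC 6298)

    def update(self, sample_rtt):
        # Deviation is measured against the estimate held before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


rtt = RttEstimator(first_sample=100.0)   # milliseconds
initial_timeout = rtt.timeout_interval()
rtt.update(120.0)
later_timeout = rtt.timeout_interval()
```

A run of stable samples shrinks DevRTT and pulls the timeout toward EstimatedRTT; a jittery path keeps the safety margin wide.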


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
  Pipelined segments
  Cumulative acks
  TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider a simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum

  loop (forever) {
    switch(event)

      event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
          start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

      event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

      event: ACK received, with ACK field value of y
        if (y > SendBase) {
          SendBase = y
          if (there are currently not-yet-acknowledged segments)
            start timer
        }

  } /* end of loop forever */

Comment:
  • SendBase-1: last cumulatively ack'ed byte
Example:
  • SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
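The simplified sender can be sketched as a small event-driven class. This is an illustration under assumptions, not a real TCP: the `send` callback stands in for "pass segment to IP", the timer is reduced to a boolean flag, and sequence-number wraparound is ignored. All names are hypothetical.

```python
class SimplifiedTcpSender:
    """Sketch of the simplified sender: no duplicate ACKs, flow control or congestion control."""

    def __init__(self, initial_seq_num, send):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq -> data for segments sent but not yet acknowledged
        self.timer_running = False   # stand-in for a real retransmission timer
        self.send = send             # hypothetical callback that hands a segment to IP

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.send(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        oldest = min(self.unacked)   # not-yet-acknowledged segment with smallest seq #
        self.send(oldest, self.unacked[oldest])
        self.timer_running = True    # restart the timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: every segment that ends at or before y is acknowledged.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sent = []
sender = SimplifiedTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
sender.on_app_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)   # Seq=100, 20 bytes
sender.on_ack(120)              # one cumulative ACK covers both segments
```

After the cumulative ACK, `send_base` is 120, nothing is outstanding and the timer is stopped.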

TCP retransmission scenarios

[Figure: lost ACK scenario, Host A and Host B, time running downward]
  A sends Seq=92, 8 bytes data. B replies ACK=100, which is lost (X).
  A's timer for Seq=92 expires; A resends Seq=92, 8 bytes data. B replies ACK=100 again. SendBase = 100.

[Figure: premature timeout scenario, Host A and Host B, time running downward]
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B replies ACK=100 and ACK=120.
  A's timer for Seq=92 expires before ACK=100 arrives; A resends Seq=92, 8 bytes data.
  SendBase goes to 100 and then to 120 as the ACKs arrive; B answers the duplicate segment with ACK=120.

TCP retransmission scenarios (more)

[Figure: cumulative ACK scenario, Host A and Host B, time running downward]
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B replies ACK=100, which is lost (X), and then ACK=120.
  ACK=120 arrives before A's timeout, so nothing is retransmitted. SendBase = 120.

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Arrival of in-order segment with expected seq #. One other segment has ACK pending.
   -> Immediately send single cumulative ACK, ACKing both in-order segments.

3. Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   -> Immediately send duplicate ACK, indicating seq # of next expected byte.

4. Arrival of segment that partially or completely fills gap.
   -> Immediately send ACK, provided that segment starts at lower end of gap.
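The four rows of the table can be read as a decision function. The sketch below is one way to write it; the function name, the boolean parameters and the returned strings are assumptions for illustration, and the handling of a segment that lies entirely below the expected sequence number is not in the table (it is labelled as an assumption in the code).

```python
def receiver_action(segment_seq, expected_seq, ack_pending=False, gap_exists=False):
    """Map an arriving segment to one of the receiver actions in the ACK generation table."""
    if segment_seq > expected_seq:
        # Row 3: out-of-order arrival, a gap now exists below this segment.
        return "send duplicate ACK"
    if segment_seq == expected_seq:
        if gap_exists:
            # Row 4: the segment starts at the lower end of the gap.
            return "send ACK immediately"
        if ack_pending:
            # Row 2: another in-order segment is still waiting for its ACK.
            return "send cumulative ACK immediately"
        # Row 1: everything before this segment has already been ACKed.
        return "delay ACK up to 500 ms"
    # Assumption (not in the table): old data received again, so re-ACK it.
    return "send duplicate ACK"


actions = [
    receiver_action(100, 100),
    receiver_action(100, 100, ack_pending=True),
    receiver_action(140, 100),
    receiver_action(100, 100, gap_exists=True),
]
```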

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs then, after retransmission, the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
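A minimal sketch of the doubling policy, assuming times are kept in milliseconds and that the RTT estimates are supplied by the caller. The class name `RetransmissionTimer` and its methods are illustrative, not taken from any TCP implementation.

```python
class RetransmissionTimer:
    """Holds the current timeout interval and applies the doubling policy."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.reset(estimated_rtt, dev_rtt)

    def reset(self, estimated_rtt, dev_rtt):
        """New data from above or an ACK arrived: recompute from the RTT estimates."""
        self.interval = estimated_rtt + 4 * dev_rtt

    def on_timeout(self):
        """Timer expired: the segment is retransmitted and the interval doubles."""
        self.interval *= 2
        return self.interval


timer = RetransmissionTimer(estimated_rtt=100, dev_rtt=25)   # interval starts at 200 ms
backoff = [timer.on_timeout() for _ in range(3)]             # three consecutive timeouts
timer.reset(estimated_rtt=100, dev_rtt=25)                   # an ACK arrives
```

Three consecutive timeouts take the interval from 200 ms to 400, 800 and 1600 ms; the reset brings it back to 200 ms.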

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet
Detect lost segments via duplicate ACKs:
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs
If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
      increment count of dup ACKs received for y
      if (count of dup ACKs received for y == 3) {
        resend segment with sequence number y    /* fast retransmit */
      }
    }
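The ACK handler with fast retransmit might look like the sketch below. It assumes ACK values fall on segment boundaries, counts only ACKs equal to the current SendBase as duplicates, and uses a hypothetical `resend` callback in place of a real network layer; timers are left out.

```python
class FastRetransmitSender:
    """Sketch of the ACK event handler with duplicate-ACK counting."""

    def __init__(self, send_base, unacked, resend):
        self.send_base = send_base
        self.unacked = dict(unacked)   # seq -> data for outstanding segments
        self.dup_acks = 0
        self.resend = resend           # hypothetical callback: resend(seq, data)

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance SendBase and forget covered segments.
            self.send_base = y
            self.dup_acks = 0
            self.unacked = {s: d for s, d in self.unacked.items() if s >= y}
        elif y == self.send_base:
            # Duplicate ACK for data that was already acknowledged.
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.resend(y, self.unacked[y])   # fast retransmit


resent = []
outstanding = {100: b"a" * 20, 120: b"b" * 20, 140: b"c" * 20, 160: b"d" * 20}
sender = FastRetransmitSender(100, outstanding, lambda seq, data: resent.append(seq))
for _ in range(4):
    sender.on_ack(100)   # segment 100 was lost; later segments each trigger ACK=100
```

The third duplicate triggers exactly one retransmission of the segment starting at byte 100; the fourth duplicate does not trigger another.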

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN
Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  This looks a lot like Selective Repeat
TCP is a hybrid


TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data
If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow the receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
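Both sides of the mechanism reduce to simple arithmetic. In the sketch below, `rcv_window` is the receiver's formula from the slide; `sendable_bytes` is the sender's side, and its parameters `last_byte_sent` and `last_byte_acked` are assumed names for the sender's bookkeeping that do not appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sendable_bytes(last_byte_sent, last_byte_acked, advertised_window):
    """How much new data the sender may put in flight without exceeding RcvWindow."""
    in_flight = last_byte_sent - last_byte_acked   # unACKed data
    return max(0, advertised_window - in_flight)


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
allowed = sendable_bytes(last_byte_sent=5000, last_byte_acked=4000, advertised_window=window)
```

With 2000 bytes still unread in a 4096-byte buffer the receiver advertises 2096 bytes; a sender with 1000 bytes already in flight may send 1096 more.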

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
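A toy model of the probing idea, under loud assumptions: time is divided into rounds, the application reads some number of bytes in each round, and the sender's one-byte probe in that round elicits an ACK carrying the current window. The function name and the round-based timing are illustrative only.

```python
def probe_until_window_opens(rcv_buffer, buffered, app_reads):
    """Return (probes_sent, window) once an ACK to a one-byte probe advertises RcvWindow > 0.

    `app_reads` lists how many bytes the receiving application reads in each round.
    """
    for probes_sent, read in enumerate(app_reads, start=1):
        buffered = max(0, buffered - read)   # application drains the buffer
        window = rcv_buffer - buffered       # value carried in the ACK to this probe
        if window > 0:
            return probes_sent, window
    return len(app_reads), 0                 # window never opened in the rounds given


# Full 4096-byte buffer; the application reads nothing for two rounds, then 1024 bytes.
result = probe_until_window_opens(4096, 4096, [0, 0, 1024])
```

Without the probes the sender would never see the third-round ACK that reopens the window.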

Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket("hostname", "port number");
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  Allocates buffers
  Can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake between client and server]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
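The numbers exchanged in the handshake can be written out directly. The sketch below builds the three segments as plain dictionaries; the function name and the dictionary keys are assumptions for illustration, and the modulo reflects the fact that TCP sequence numbers are 32-bit values that wrap around.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments for the given initial sequence numbers."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": (syn["seq"] + 1) % SEQ_SPACE}
    ack = {"from": "client", "SYN": 0, "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (synack["seq"] + 1) % SEQ_SPACE}
    return [syn, synack, ack]


syn, synack, ack = three_way_handshake(client_isn=1000, server_isn=5000)
```

Each side acknowledges the other's initial sequence number plus one, so the SYN itself consumes one sequence number even though it carries no data.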

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close. client -> server: FIN; server -> client: ACK, then FIN; client -> server: ACK. Client: close, timed wait, closed. Server: close.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: connection close. client -> server: FIN; server -> client: ACK, then FIN; client -> server: ACK. Client: closing, timed wait, closed. Server: closing, closed.]
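The client's side of the four-step close can be sketched as a small state table. The state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) come from the standard TCP state machine in RFC 793 rather than from these slides, and the event and action strings are hypothetical labels chosen for this example.

```python
# (state, event) -> (next state, segment to send or None)
CLIENT_CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "send FIN"),          # Step 1
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),                  # server ACKed our FIN
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "send ACK"),             # Step 3
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "send ACK"),              # our ACK was lost; re-ACK
    ("TIME_WAIT", "timed_wait_expired"): ("CLOSED", None),
}


def run_client_close(events, state="ESTABLISHED"):
    """Feed events through the table; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, action = CLIENT_CLOSE_TRANSITIONS[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent


final_state, sent = run_client_close(
    ["app_close", "recv ACK", "recv FIN", "recv FIN", "timed_wait_expired"]
)
```

The repeated `recv FIN` event models the case from Step 3 where the client's ACK is lost and the server retransmits its FIN during the timed wait.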

TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                        A few special cases

                                                        Have not discussed what happens if both client and server decide to close down connection at same time

                                                        It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load, original data plus retransmissions.

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels, (a), (b) and (c)]

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
    • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
    • NI bit: no increase in rate (mild congestion)
    • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
    • small exception: see next page

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
    • congested switch may lower ER value in cell
    • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
    • signals congestion
    • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
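
The ER and EFCI rules above can be sketched in a few lines of Python. This is an illustrative model only, not ATM's real cell format: the `Switch` and `RMCell` classes, their field names, and the rates in the usage note are all assumptions made for the example, and every switch is modelled as capping ER at the rate it can support.

```python
from dataclasses import dataclass


@dataclass
class Switch:
    supportable_rate: float   # hypothetical: rate this switch can support
    congested: bool = False   # hypothetical: whether the switch is congested


@dataclass
class RMCell:
    er: float                 # explicit rate (ER) field
    ci: bool = False          # congestion indication (CI) bit


def forward_rm_cell(cell, path):
    """Carry an RM cell along the path; each switch may lower ER."""
    for sw in path:
        cell.er = min(cell.er, sw.supportable_rate)
    return cell


def efci_of_data_cell(path):
    """A data cell's EFCI bit ends up 1 if any switch on the path is congested."""
    return any(sw.congested for sw in path)


def destination_return(cell, preceding_data_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_efci:
        cell.ci = True
    return cell
```

With a path such as `[Switch(100.0), Switch(40.0, congested=True), Switch(80.0)]` and an RM cell starting at `er=155.0`, the returned cell carries the smallest supportable rate on the path and has its CI bit set.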


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
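
As a quick check of the window-based throughput expression, here is a minimal Python sketch. The function name and the sample numbers (10 segments, 1460-byte MSS, 100 ms RTT) are assumptions chosen for illustration, not values from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent in each RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical values: 10 segments of 1460 bytes per 100 ms round trip.
rate = window_throughput(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec")
```

Doubling the window doubles the rate, and doubling the RTT halves it, which is why CongWin is the knob TCP adjusts.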

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative after timeout events
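
The sending limit LastByteSent - LastByteAcked ≤ CongWin can be read as a budget of bytes still allowed in flight. A minimal sketch, assuming byte counters as plain integers (the function name is hypothetical):

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """Bytes the sender may still transmit without violating
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)
```

For example, with 2000 bytes unacknowledged and CongWin = 4000 bytes, the sender may transmit 2000 more bytes; once the unacknowledged data reaches CongWin it must wait for ACKs.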

TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
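
The sawtooth that AIMD produces can be reproduced with a small simulation. This is a sketch under stated assumptions: the window is counted in MSS units, time advances one RTT per step, and `loss_rounds` is a hypothetical set of rounds in which a loss event is assumed to occur.

```python
def aimd_trace(rounds, loss_rounds, mss=1, start=8):
    """Return CongWin (in MSS units) at the start of each RTT round."""
    cong_win = start
    trace = []
    for r in range(rounds):
        trace.append(cong_win)
        if r in loss_rounds:
            cong_win = max(mss, cong_win / 2)   # multiplicative decrease
        else:
            cong_win += mss                     # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(8, {3}))   # [8, 9, 10, 11, 5.5, 6.5, 7.5, 8.5]
```

The window climbs by one MSS per round, is halved at the loss in round 3, and then resumes its linear climb.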

TCP Slow Start

• When connection begins, CongWin = 1 MSS
    • Example: MSS = 500 bytes & RTT = 200 msec
    • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
    • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
    • double CongWin every RTT
    • done by incrementing CongWin by 1 MSS for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments in successive RTTs]
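
To see why one increment per ACK doubles the window every RTT, the following sketch counts ACKs round by round. It assumes no loss, one ACK per segment, and a window measured in whole MSS units; the function name is hypothetical.

```python
def slow_start_rounds(num_rounds, mss=1):
    """CongWin (in MSS units) at the start of each RTT during slow start."""
    cong_win = mss
    history = []
    for _ in range(num_rounds):
        history.append(cong_win)
        acks = cong_win // mss        # one ACK comes back per segment sent
        for _ in range(acks):
            cong_win += mss           # +1 MSS per ACK
    return history


print(slow_start_rounds(4))   # [1, 2, 4, 8]
```

Each round sends CongWin/MSS segments and therefore receives that many ACKs, so the window grows by its own size: one, two, four, eight segments.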

So far:
• Slow-Start ramps up exponentially
• followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (the new, halved value); now in standard AIMD
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
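
The event/state/action table translates almost directly into a small state machine. The class below is a toy model that follows the table row by row, not a real TCP implementation: the class name, the default MSS and initial threshold, and the choice to reset the duplicate-ACK counter on every new ACK are assumptions of this sketch.

```python
class RenoSender:
    """Toy model of the table above; CongWin and Threshold are in bytes."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.cong_win = mss
        self.threshold = threshold    # "initially very large"; value is an assumption
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0             # assumption: counter restarts on a new ACK
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        """Timeout: fall back to slow start from 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

Driving the object with a sequence of `on_new_ack()`, `on_dup_ack()` and `on_timeout()` calls shows the exponential phase, the switch to congestion avoidance, the halving on a triple duplicate ACK, and the reset to 1 MSS on timeout.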

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
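
The 0.75 factor follows from averaging over one sawtooth cycle. A short worked step, assuming the window grows linearly from W/2 back to W between loss events:

```latex
\text{average window} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3W}{4},
\qquad
\text{average throughput} = \frac{3W/4}{RTT} = 0.75\,\frac{W}{RTT}
```

Because the ramp is linear, the time average of the window is simply the midpoint of its lowest and highest values.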

TCP Futures

• Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This requires $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed
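
The two numbers in this example can be checked directly. The sketch below assumes MSS is converted to bits so that it is in the same units as the target rate; the function names are hypothetical.

```python
def required_window(rate_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def tolerable_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = tolerable_loss_rate(10e9, 0.1, 1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

The window comes out at roughly 83,333 segments and the loss rate at roughly 2 × 10^-10, that is, about one loss event per five billion segments.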

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
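
The convergence argument can be tried numerically. In this sketch two flows share one bottleneck; both add one unit per round and both halve whenever their combined rate exceeds the capacity. Synchronized loss, the unit step, and the starting rates used in the usage note are simplifying assumptions, not part of the slides.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Return the two throughputs after the given number of rounds."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2     # loss: both decrease by a factor of 2
        else:
            x1, x2 = x1 + 1, x2 + 1     # congestion avoidance: additive increase
    return x1, x2


a, b = aimd_two_flows(10, 80, capacity=100, rounds=500)
print(round(a, 3), round(b, 3))
```

Additive increase leaves the gap between the two flows unchanged, while each halving cuts the gap in half, so starting from a very unequal split (10 versus 80) the flows end up with nearly identical rates.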

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
    • do not want rate throttled by congestion control
• Instead use UDP:
    • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
    • new app asks for 1 TCP, gets rate R/10
    • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
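
The arithmetic behind the parallel-connection example, assuming every TCP connection on the bottleneck receives an equal share (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    new_connections alongside existing_connections others."""
    return Fraction(new_connections, existing_connections + new_connections)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

Fairness is per connection, not per application, so an app that opens 11 connections next to 9 existing ones takes slightly more than half of the link.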

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2 RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object
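
Both cases of the fixed-window model fit in one function. This is a sketch under stated assumptions: O and S are in bits, R in bits per second, and K is computed as ceil(O / (W·S)), which is the natural reading of "number of windows that cover the object" but is not spelled out on the slide.

```python
import math


def fixed_window_latency(O, S, R, W, rtt):
    """Latency for an O-bit object sent with a fixed window of W segments."""
    stall = S / R + rtt - W * S / R      # idle time after each window, if positive
    latency = 2 * rtt + O / R            # connection setup + request, then transmission
    if stall > 0:                        # case 2: server waits for an ACK after each window
        K = math.ceil(O / (W * S))       # assumption: windows needed to cover the object
        latency += (K - 1) * stall
    return latency
```

For example, with O = 8000 bits, S = 1000 bits, R = 1000 bits/sec and RTT = 3 sec, a window of W = 2 stalls for 2 seconds after each of the first three windows (latency 20 sec), whereas W = 10 never stalls (latency 14 sec).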

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2 RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.
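
The slow-start latency formula, together with K, Q and P, can be evaluated with a short function. This is a sketch under stated assumptions: O and S are in bits, R in bits per second, K is found by searching for the smallest k with 2^k - 1 ≥ O/S, and Q is obtained by counting the windows whose idle time S/R + RTT - 2^(k-1) S/R is still positive, rather than from a closed form.

```python
def slow_start_latency(O, S, R, rtt):
    """Return (latency, K, Q, P) for the slow-start model with no loss."""
    # K: number of windows (of 1, 2, 4, ... segments) that cover the object
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would idle for an infinite object;
    # iteration Q checks window k = Q + 1, whose size is 2^Q segments
    Q = 0
    while S / R + rtt - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


print(slow_start_latency(O=15000, S=1000, R=1000, rtt=2))   # (22.0, 4, 2, 2)
```

The sample values are chosen (as an assumption) so that O/S = 15 segments and the server idles twice, giving K = 4, Q = 2 and P = 2 as in the 15-segment example; the resulting latency is 2·2 + 15 + 2·3 - 3·1 = 22 seconds.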

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
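
The three response-time expressions can be compared side by side. In this sketch the "sum of idle times" term is passed in as a single `idle` parameter that defaults to 0; treating it as zero, and as equal across the three models, is a simplifying assumption, so the results only capture the transmission and RTT terms.

```python
def http_response_times(O, R, rtt, M, X, idle=0.0):
    """Response times in seconds for the three HTTP models.

    O is the object size in bits, R the link rate in bits/sec.
    """
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt + idle,
        "persistent": transmit + 3 * rtt + idle,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * rtt + idle,
    }


# Hypothetical run: O = 5 Kbytes = 40000 bits, M = 10, X = 5, RTT = 100 msec, R = 1 Mbps
for name, seconds in http_response_times(40000, 1e6, 0.1, 10, 5).items():
    print(f"{name}: {seconds:.2f} s")
```

The transmission term (M+1)O/R is identical in all three models, so the difference lies entirely in how many round trips each one pays: 2(M+1), 3, or 2(M/X + 1).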

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.

Chapter 3: Summary

• principles behind transport layer services:
    • multiplexing, demultiplexing
    • reliable data transfer
    • flow control
    • congestion control
• instantiation and implementation in the Internet
    • UDP
    • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                        • Chapter 3 Transport Layer last revised 160305
                                                        • Chapter 3 outline
                                                        • Transport services and protocols
                                                        • Transport vs network layer
                                                        • Transport-layer protocols
                                                        • Chapter 3 outline
                                                        • Multiplexing/demultiplexing
                                                        • Multiplexing/demultiplexing
                                                        • How demultiplexing works
                                                        • Connectionless demultiplexing
                                                        • Connectionless demux (cont)
                                                        • Connection-oriented demux
                                                        • Connection-oriented demux (cont)
                                                        • Connection-oriented demux: Threaded Web Server
                                                        • Chapter 3 outline
                                                        • UDP: User Datagram Protocol [RFC 768]
                                                        • UDP: more
                                                        • UDP checksum
                                                        • Chapter 3 outline
                                                        • Principles of Reliable data transfer
                                                        • Reliable data transfer getting started
                                                        • Reliable data transfer getting started
                                                        • Incremental Improvements
                                                        • Rdt1.0: reliable transfer over a reliable channel
                                                        • Rdt2.0: channel with bit errors
                                                        • rdt2.0: FSM specification
                                                        • rdt2.0: operation with no errors
                                                        • rdt2.0: error scenario
                                                        • rdt2.0 has a fatal flaw!
                                                        • rdt2.1: sender, handles garbled ACK/NAKs
                                                        • rdt2.1: receiver, handles garbled ACK/NAKs
                                                        • rdt2.1: discussion
                                                        • rdt2.2: a NAK-free protocol
                                                        • rdt2.2: sender, receiver fragments
                                                        • rdt3.0: channels with errors and loss
                                                        • rdt3.0 sender
                                                        • rdt3.0 in action
                                                        • rdt3.0 in action
                                                        • Performance of rdt3.0
                                                        • rdt3.0: stop-and-wait operation
                                                        • Pipelined protocols
                                                        • Pipelined protocols
                                                        • Pipelining increased utilization
                                                        • Go-Back-N
                                                        • GBN Sender
                                                        • GBN sender extended FSM
                                                        • GBN receiver extended FSM
                                                        • More on receiver
                                                        • GBN in action
                                                        • Selective Repeat
                                                        • Selective repeat sender receiver windows
                                                        • Selective repeat
                                                        • Selective repeat in action
                                                        • Selective repeat dilemma
                                                        • Chapter 3 outline
                                                        • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                        • More TCP Details
                                                        • Even More TCP Details
                                                        • TCP segment structure
                                                        • TCP seq. #'s and ACKs
                                                        • TCP Round Trip Time and Timeout
                                                        • TCP Round Trip Time and Timeout
                                                        • Example RTT estimation
                                                        • TCP Round Trip Time and Timeout
                                                        • Chapter 3 outline
                                                        • TCP reliable data transfer
                                                        • TCP sender events
                                                        • TCP sender (simplified)
                                                        • TCP retransmission scenarios
                                                        • TCP retransmission scenarios (more)
                                                        • TCP ACK generation [RFC 1122, RFC 2581]
                                                        • More on Sender Policies
                                                        • Fast Retransmit
                                                        • Fast retransmit algorithm
                                                        • TCP: GBN or Selective Repeat?
                                                        • Chapter 3 outline
                                                        • TCP Flow Control
                                                        • TCP Flow Control
                                                        • TCP segment structure
                                                        • TCP Flow control: how it works
                                                        • Technical Issue
                                                        • Chapter 3 outline
                                                        • TCP Connection Management
                                                        • TCP Connection Management (cont)
                                                        • TCP Connection Management (cont)
                                                        • TCP Connection Management (cont)
                                                        • TCP Connection Management (cont)
                                                        • A few special cases
                                                        • Chapter 3 outline
                                                        • Principles of Congestion Control
                                                        • Causes/costs of congestion: scenario 1
                                                        • Causes/costs of congestion: scenario 2
                                                        • Causes/costs of congestion: scenario 3
                                                        • Causes/costs of congestion: scenario 3
                                                        • Approaches towards congestion control
                                                        • Case study: ATM ABR congestion control
                                                        • Case study: ATM ABR congestion control
                                                        • Chapter 3 outline
                                                        • TCP Congestion Control
                                                        • TCP AIMD
                                                        • TCP Slow Start
                                                        • TCP Slow Start (more)
                                                        • Summary: TCP Congestion Control
                                                        • The Big Picture
                                                        • TCP sender congestion control
                                                        • TCP throughput
                                                        • TCP Futures
                                                        • TCP Fairness
                                                        • Why is TCP fair?
                                                        • Fairness (more)
                                                        • TCP Latency Modeling
                                                        • Fixed Congestion Window (W)
                                                        • Fixed congestion window (1)
                                                        • Fixed congestion window (2)
                                                        • TCP Latency Modeling: Slow Start (1)
                                                        • TCP Latency Modeling: Slow Start (2)
                                                        • TCP Latency Modeling (3)
                                                        • TCP Latency Modeling (4)
                                                        • HTTP Modeling
                                                        • Chapter 3 Summary

rdt2.0 has a fatal flaw!

What happens if ACK/NAK is corrupted?
  sender doesn't know what happened at receiver!
  can't just retransmit: possible duplicate. But receiver is waiting!

What to do?
  sender ACKs/NAKs receiver's ACK/NAK? What if sender's ACK/NAK is corrupted?
  retransmit, but this might cause retransmission of a correctly received pkt! Receiver won't know about duplication!

Handling duplicates:
  sender adds sequence number (0/1) to each pkt
  sender retransmits current pkt if ACK/NAK garbled
  receiver discards (doesn't deliver up) duplicate pkt
  Duplicate packet is one with the same sequence # as the previous packet

stop and wait:
  Sender sends one packet, then waits for receiver response

Sender: whenever the sender receives a control message, it sends a packet to the receiver
  A valid ACK: sends next packet (if one exists) with a new sequence #
  A NAK or corrupt response: resends old packet

Receiver: sends ACK/NAK to sender
  If received packet is corrupt: send NAK
  If received packet is valid and has a different sequence # from the previous packet: send ACK and deliver new data up
  If received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmission (a duplicate): send ACK

Note: ACK/NAK do not contain a sequence #
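The receiver rules above can be sketched in a few lines of Python. This is only an illustrative model, not code from the course: the class name `Rdt21Receiver`, the `corrupt` flag (standing in for a checksum test), and the string replies are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Rdt21Receiver:
    """Alternating-bit receiver: remembers which sequence bit it expects next."""
    expected_seq: int = 0
    delivered: list = field(default_factory=list)

    def on_packet(self, seq: int, data: str, corrupt: bool) -> str:
        """Return the control message ("ACK" or "NAK") sent back to the sender."""
        if corrupt:
            return "NAK"                 # garbled packet: ask for a resend
        if seq == self.expected_seq:
            self.delivered.append(data)  # new data: deliver it up
            self.expected_seq ^= 1       # flip 0 <-> 1
        # A valid packet with the old sequence bit is a duplicate: it is not
        # delivered again, but it is still ACKed, like a valid new packet.
        return "ACK"


rx = Rdt21Receiver()
replies = [
    rx.on_packet(0, "a", corrupt=False),  # new packet
    rx.on_packet(0, "a", corrupt=False),  # retransmission (sender saw a garbled ACK)
    rx.on_packet(1, "b", corrupt=True),   # corrupted in transit
    rx.on_packet(1, "b", corrupt=False),  # resent after the NAK
]
print(replies, rx.delivered)
```

The duplicate is ACKed rather than NAKed because the data itself arrived intact; only the earlier ACK was damaged on the way back.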

rdt2.1: sender, handles garbled ACK/NAKs

State: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 0

State: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 1 from above

State: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 1

State: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 0 from above

rdt2.1: receiver, handles garbled ACK/NAKs

State: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)

State: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)

rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq. #'s (0,1) will suffice. Why?
  must check if received ACK/NAK is corrupted
  twice as many states
    state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
  must check if received packet is a duplicate
    state indicates whether 0 or 1 is the expected pkt seq #
  note: receiver cannot know if its last ACK/NAK was received OK at the sender

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt
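A small Python sketch may make the NAK-free idea concrete. The function names and the tuple return value are hypothetical choices for this illustration; the only behaviour taken from the slide is that the receiver re-ACKs the last packet received OK and that the sender treats an ACK for the wrong sequence number like a NAK.

```python
def receiver_reply(expected_seq, pkt_seq, corrupt):
    """Return (ack_seq, deliver, next_expected_seq) for one arriving packet."""
    if not corrupt and pkt_seq == expected_seq:
        # New, intact packet: ACK it and expect the other sequence bit next.
        return pkt_seq, True, expected_seq ^ 1
    # Corrupt or duplicate: re-ACK the last packet received OK, which with
    # two sequence numbers is always the "other" one.
    return expected_seq ^ 1, False, expected_seq


def sender_should_retransmit(current_seq, ack_seq, ack_corrupt):
    """A garbled ACK, or an ACK for the other sequence number, acts like a NAK."""
    return ack_corrupt or ack_seq != current_seq


# Receiver expects 1 but gets a corrupted packet: it re-ACKs 0.
ack, deliver, expected = receiver_reply(expected_seq=1, pkt_seq=1, corrupt=True)
# The sender is waiting for ACK 1, so ACK 0 is a duplicate ACK: retransmit.
print(ack, deliver, sender_should_retransmit(1, ack, ack_corrupt=False))
```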

rdt2.2: sender, receiver fragments

sender FSM fragment

State: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK 0

State: Wait for ACK 0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    Λ

receiver FSM fragment

Transition into "Wait for 0 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK1, chksum)
    udt_send(sndpkt)

State: Wait for 0 from below
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
    udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain that data or ACK was lost, then retransmits
  yuck: drawbacks?

Approach: sender waits a "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions are only triggered by timeouts)
  if pkt (or ACK) is just delayed (not lost):
    retransmission will be a duplicate, but use of seq. #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer
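The timeout approach can be illustrated with a toy, fully scripted simulation in Python. Nothing here is from the slides beyond the idea itself: the function name, the `"ok"`/`"lost"` outcome script (which replaces a real channel and a real countdown timer), and the log format are assumptions made so the example stays deterministic.

```python
def send_with_timeout(packets, outcomes):
    """Stop-and-wait sender that retransmits the current packet on each timeout.

    `outcomes` scripts the channel, one entry per transmission attempt:
    "ok"   means the data and its ACK both arrive before the timer expires,
    "lost" means the data or the ACK is lost, so the timer expires.
    Returns a log of (seq, data, event) tuples.
    """
    log = []
    attempts = iter(outcomes)
    seq = 0
    for data in packets:
        while True:
            outcome = next(attempts)               # send pkt, start timer
            if outcome == "ok":
                log.append((seq, data, "acked"))   # ACK arrived: stop timer
                break
            log.append((seq, data, "timeout"))     # no ACK in time: resend
        seq ^= 1                                   # alternate 0/1 per packet
    return log


for entry in send_with_timeout(["a", "b"], ["lost", "ok", "ok"]):
    print(entry)
```

Note that the sender cannot tell a lost packet from a lost ACK; both look like a timeout, which is why the receiver still needs sequence numbers to discard duplicates.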

rdt3.0 sender

State: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK0
  rdt_rcv(rcvpkt)
    Λ

State: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    stop_timer
    -> Wait for call 1 from above

State: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK1
  rdt_rcv(rcvpkt)
    Λ

State: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    stop_timer
    -> Wait for call 0 from above

rdt3.0 in action

[Figure: rdt3.0 in action, shown over two slides]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U sender: utilization, the fraction of time the sender is busy sending
1KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
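The arithmetic on this slide is easy to check with a short script. The function names below are made up for the example; the inputs (8000-bit packet, 1 Gbps link, RTT of 30 ms, i.e. twice the 15 ms propagation delay) are the slide's own numbers.

```python
def transmit_time(packet_bits, rate_bps):
    """Time to push one packet onto the link, L / R, in seconds."""
    return packet_bits / rate_bps


def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t = transmit_time(packet_bits, rate_bps)
    return t / (rtt_s + t)


L_BITS = 8_000      # 1 KB packet
R_BPS = 1e9         # 1 Gbps link
RTT_S = 0.030       # 2 x 15 ms propagation delay

u = stop_and_wait_utilization(L_BITS, R_BPS, RTT_S)
# One packet (1000 bytes) is delivered per RTT + L/R seconds.
throughput_bytes_per_s = (L_BITS / 8) / (RTT_S + transmit_time(L_BITS, R_BPS))

print(f"U_sender   = {u:.5f}")
print(f"throughput = {throughput_bytes_per_s / 1000:.1f} kB/s")
```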

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram]
  first packet bit transmitted, t = 0
  last packet bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data
  Two generic approaches to solving this:
    • go-Back-N protocols
    • selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight]
  first packet bit transmitted, t = 0
  last bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increases utilization by a factor of 3!
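As a rough extension of the calculation above, the following Python sketch computes utilization for any number of in-flight packets and the window needed to keep the link fully busy. The function names are hypothetical, and `Fraction` is used only so the results are exact; the link parameters are the ones from the stop-and-wait example.

```python
import math
from fractions import Fraction


def pipelined_utilization(n_pkts, packet_bits, rate_bps, rtt):
    """Sender utilization with n_pkts in flight: n * (L/R) / (RTT + L/R), capped at 1."""
    t = Fraction(packet_bits, rate_bps)
    return min(Fraction(1), n_pkts * t / (rtt + t))


def window_for_full_utilization(packet_bits, rate_bps, rtt):
    """Smallest number of in-flight packets that keeps the sender always busy."""
    t = Fraction(packet_bits, rate_bps)
    return math.ceil((rtt + t) / t)


RTT = Fraction(30, 1000)                          # 30 ms
u1 = pipelined_utilization(1, 8_000, 10**9, RTT)  # stop-and-wait
u3 = pipelined_utilization(3, 8_000, 10**9, RTT)  # three packets in flight

print(float(u1), float(u3), u3 / u1)
print(window_for_full_utilization(8_000, 10**9, RTT))
```

With these numbers the window needed for full utilization is 3751 packets, which shows how far a window of 3 still is from filling a 1 Gbps pipe with a 30 ms RTT.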


Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to, and including, seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)
  Only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol

GBN Sender

rdt_send() called: checks to see if the window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend
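The three sender events can be sketched as a small Python class. This is a simplified model under stated assumptions, not the course's implementation: the class name `GbnSender`, the `sent` log (standing in for the channel), and the boolean `timer_running` (standing in for a real timer) are invented for the example, and the check that ignores ACKs below `base` is an added safeguard.

```python
class GbnSender:
    def __init__(self, window_size):
        self.N = window_size
        self.base = 1            # oldest unacknowledged seq #
        self.nextseqnum = 1      # next seq # to use
        self.sndpkt = {}         # seq # -> data, kept for retransmission
        self.sent = []           # log of seq #s handed to the channel
        self.timer_running = False

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum >= self.base + self.N:
            return False                     # window full
        self.sndpkt[self.nextseqnum] = data
        self.sent.append(self.nextseqnum)
        if self.base == self.nextseqnum:
            self.timer_running = True        # first in-flight pkt: start timer
        self.nextseqnum += 1
        return True

    def on_ack(self, n):
        """Cumulative ACK: everything up to and including n was received."""
        if n < self.base:
            return                           # duplicate ACK (added safeguard)
        self.base = n + 1
        # Stop the timer if nothing is in flight, otherwise restart it.
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.sent.append(seq)


s = GbnSender(window_size=3)
accepted = [s.rdt_send(d) for d in "abcd"]   # the fourth call hits a full window
s.on_ack(1)
s.rdt_send("d")
s.on_timeout()
print(accepted, s.sent)
```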

GBN: sender extended FSM

Initialization:
  base = 1
  nextseqnum = 1

State: Wait

  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)

  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer

  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ

GBN: receiver extended FSM

Initialization (Λ):
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

State: Wait

  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

  default
    udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #
  may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  need only remember expectedseqnum
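A minimal Python sketch of this receiver behaviour follows. The class name and the `corrupt` flag are assumptions for the illustration; the single piece of state, `expectedseqnum`, and its starting value of 1 follow the receiver description above.

```python
class GbnReceiver:
    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def on_packet(self, seq, data, corrupt=False):
        """Return the seq # carried by the ACK sent in response."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)      # in-order: deliver up
            self.expectedseqnum += 1
        # Otherwise the packet is discarded (no buffering). In every case the
        # ACK names the highest in-order seq # received so far (0 if none yet).
        return self.expectedseqnum - 1


r = GbnReceiver()
acks = [
    r.on_packet(1, "a"),
    r.on_packet(3, "c"),   # pkt 2 missing: discard, duplicate ACK 1
    r.on_packet(4, "d"),   # still out of order: duplicate ACK 1
    r.on_packet(2, "b"),   # retransmitted pkt 2 arrives
    r.on_packet(3, "c"),   # retransmitted pkt 3 arrives
]
print(acks, r.delivered)
```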

GBN in action

GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data!

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets

                                                          Selective Repeat

                                                          receiver individually acknowledges all correctly received pkts

                                                          buffers pkts as needed for eventual in-order delivery to upper layer

                                                          sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows (figure)

Selective repeat

sender

data from above:
  if next available seq # in window, send pkt

timeout(n):
  resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)

otherwise:
  ignore
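The receiver rules above can be sketched as follows. This is a minimal illustration, not the slides' own code: the class `SRReceiver` and its method names are hypothetical, and sequence numbers are treated as unbounded integers (no wrap-around) to keep the window arithmetic simple.

```python
class SRReceiver:
    """Selective Repeat receiver with window size N (no seq # wrap-around)."""

    def __init__(self, window_size, rcvbase=0):
        self.N = window_size
        self.rcvbase = rcvbase
        self.buffer = {}      # out-of-order packets, keyed by seq #
        self.delivered = []   # data passed to the upper layer, in order

    def on_packet(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n          # already delivered: re-ACK so the sender can advance
        return None           # outside both ranges: ignore


rcv = SRReceiver(window_size=4)
acks = [rcv.on_packet(n, f"pkt{n}") for n in (0, 2, 3, 1, 0, 9)]
print(acks)           # [0, 2, 3, 1, 0, None]
print(rcv.delivered)  # ['pkt0', 'pkt1', 'pkt2', 'pkt3']
```

Unlike GBN, packets 2 and 3 are buffered while packet 1 is missing, and all three are delivered as soon as packet 1 arrives.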

Selective repeat in action (figure)

Selective repeat: dilemma

Example: seq #s 0, 1, 2, 3; window size = 3

The receiver sees no difference between the two scenarios and incorrectly passes duplicate data as new in scenario (a)

Q: What is the relationship between seq # size and window size?
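One way to explore the question is to check when a retransmitted old packet can be mistaken for a new one. The sketch below assumes the worst case in which the receiver has accepted a full window and advanced, while every ACK was lost, so the sender may resend the whole old window. The function name `windows_overlap` is hypothetical. The standard answer it illustrates is that the window size must be at most half the sequence number space.

```python
def windows_overlap(seq_space, window):
    """True if a resent old packet could land inside the receiver's new window."""
    old = {n % seq_space for n in range(0, window)}           # sender may resend these
    new = {n % seq_space for n in range(window, 2 * window)}  # receiver now expects these
    return bool(old & new)


print(windows_overlap(4, 3))  # True: the slide's example is ambiguous
print(windows_overlap(4, 2))  # False: window <= seq_space / 2 is safe

# Largest unambiguous window for an 8-value sequence space.
print(max(w for w in range(1, 9) if not windows_overlap(8, w)))  # 4
```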


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) initializes sender, receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The maximum amount of application-layer data in a segment
  Application data + TCP header = TCP segment

Three-way handshake
  Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment
  Server responds with a second special TCP segment (again no payload)
  Client responds with a third special segment. This one can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP exists only in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the client and server

TCP segment structure

Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | urg data pointer
  options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)

TCP seq. #s and ACKs

Seq. #s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: How does the receiver handle out-of-order segments?
A: The TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (Host A and Host B, time increasing downward):
  1. User types 'C'. A -> B: Seq=42, ACK=79, data = 'C'
  2. Host B ACKs receipt of 'C' and echoes back 'C'. B -> A: Seq=79, ACK=43, data = 'C'
  3. Host A ACKs receipt of the echoed 'C'. A -> B: Seq=43, ACK=80
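The numbers in the telnet scenario follow from one rule: the ACK field is the sequence number of the segment being acknowledged plus its payload length. A small sketch, using dictionaries as a hypothetical stand-in for segments (the helper `next_ack` is not from the slides):

```python
def next_ack(seq, payload_len):
    """Cumulative ACK for a segment: sequence number of the next byte expected."""
    return seq + payload_len


a_seq, b_seq = 42, 79   # starting sequence numbers from the slide

# 1. A sends 'C' (1 byte); its ACK field names the next byte it expects from B.
seg1 = {"seq": a_seq, "ack": b_seq, "data": b"C"}
# 2. B echoes 'C' and acknowledges A's byte.
seg2 = {"seq": b_seq, "ack": next_ack(seg1["seq"], len(seg1["data"])), "data": b"C"}
# 3. A acknowledges the echo; this segment carries no data.
seg3 = {"seq": seg2["ack"], "ack": next_ack(seg2["seq"], len(seg2["data"])), "data": b""}

for seg in (seg1, seg2, seg3):
    print(seg["seq"], seg["ack"])
# 42 79
# 79 43
# 43 80
```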

TCP Round Trip Time and Timeout

Q: How to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: How to set the TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125

Example RTT estimation (figure): RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. Horizontal axis: time (seconds). Vertical axis: RTT (milliseconds), about 100 to 350. Curves: SampleRTT and EstimatedRTT.

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first, estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

  Then set the timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
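The three formulas can be combined into a small estimator. This is a sketch under assumptions the slides do not settle: the class name `RttEstimator` is hypothetical, the state is seeded from the first sample (EstimatedRTT = sample, DevRTT = sample / 2), and DevRTT is updated before EstimatedRTT so that it uses the previous estimate.

```python
class RttEstimator:
    """EWMA-based RTT estimator following the slide formulas."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed from the first measurement (the slides do not say how).
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        """Fold in one SampleRTT (taken only from non-retransmitted segments)."""
        # Assumption: DevRTT is updated using the previous EstimatedRTT.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
est.update(120.0)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval())  # 102.5 42.5 272.5
```

A single 120 ms sample moves the estimate only from 100 ms to 102.5 ms, which shows how the small α smooths out individual measurements.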


                                                          TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider a simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  If it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }

} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
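The simplified sender can also be written as an event-driven class. This is only a sketch: the class `SimpleTcpSender`, the `sent` log (standing in for "pass segment to IP"), and the boolean `timer_running` (standing in for a real timer) are hypothetical simplifications.

```python
class SimpleTcpSender:
    """Simplified TCP sender: one timer, cumulative ACKs, no dup-ACK handling."""

    def __init__(self, initial_seq=0):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}            # seq # -> data for not-yet-acknowledged segments
        self.timer_running = False   # stand-in for a real retransmission timer
        self.sent = []               # (seq, data) log standing in for "pass to IP"

    def on_app_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        self.timer_running = True    # start timer if not already running
        self.sent.append((seq, data))
        self.next_seq += len(data)

    def on_timeout(self):
        if not self.unacked:
            return
        seq = min(self.unacked)      # smallest not-yet-acknowledged seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True    # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Keep only segments not fully covered by the cumulative ACK.
            self.unacked = {s: d for s, d in self.unacked.items() if s + len(d) > y}
            self.timer_running = bool(self.unacked)


s = SimpleTcpSender(initial_seq=92)
s.on_app_data(b"a" * 8)     # Seq=92, 8 bytes
s.on_app_data(b"b" * 20)    # Seq=100, 20 bytes
s.on_ack(120)               # a single cumulative ACK covers both segments
print(s.send_base, s.next_seq, s.timer_running)  # 120 120 False
```

Because ACKs are cumulative, one ACK=120 acknowledges both segments even if an earlier ACK=100 never arrived, so nothing needs to be retransmitted.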


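TCP: retransmission scenarios

Lost ACK scenario (Host A sends to Host B, time increasing downward):
  A sends Seq=92, 8 bytes data
  B replies ACK=100; the ACK is lost (X)
  A's timer for Seq=92 times out; A resends Seq=92, 8 bytes data
  B replies ACK=100
  A: SendBase = 100

Premature timeout scenario (Host A sends to Host B, time increasing downward):
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100, then ACK=120
  A's timer for Seq=92 times out before the ACKs arrive; A resends Seq=92, 8 bytes data
  ACK=100 arrives at A: SendBase = 100
  ACK=120 arrives at A: SendBase = 120
  B answers the duplicate segment with ACK=120; A: SendBase = 120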

TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A sends to Host B, time increasing downward):
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100, which is lost (X), then ACK=120
  ACK=120 arrives at A before the timeout: SendBase = 120, so nothing is retransmitted

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
   -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

2. Arrival of in-order segment with expected seq #. One other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments

3. Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
   -> Immediately send duplicate ACK, indicating seq # of next expected byte

4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after the retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
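The doubling rule amounts to multiplying the base interval by a power of two. A minimal sketch, in which the function name `timeout_after_backoff` and the 1.0 second base value are hypothetical:

```python
def timeout_after_backoff(base_interval, consecutive_timeouts):
    """Timeout interval after a run of consecutive timeouts (doubles each time)."""
    return base_interval * 2 ** consecutive_timeouts


base = 1.0  # seconds; hypothetical value of EstimatedRTT + 4 * DevRTT
print([timeout_after_backoff(base, k) for k in range(4)])  # [1.0, 2.0, 4.0, 8.0]
```

Once new data arrives from above or an ACK is received, the count of consecutive timeouts goes back to zero and the interval returns to the base value.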

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs

If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
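The ACK handler above can be sketched with a per-value duplicate counter. The class `FastRetransmitSender` and the `retransmitted` list (which records what would be resent) are hypothetical; timer handling is left out to keep the example focused on duplicate ACK counting.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()    # ACK value y -> number of duplicates seen
        self.retransmitted = []      # seq #s that would be resent

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y       # new data acknowledged
        else:
            self.dup_acks[y] += 1    # duplicate ACK for already ACKed data
            if self.dup_acks[y] == 3:
                self.retransmitted.append(y)   # fast retransmit of segment y


sender = FastRetransmitSender(send_base=92)
for y in (100, 100, 100, 100, 120):
    sender.on_ack(y)
print(sender.retransmitted)  # [100]
print(sender.send_base)      # 120
```

The first ACK=100 advances SendBase; the next three are duplicates, and the third one triggers the retransmission of the segment starting at byte 100 without waiting for the timer.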

TCP: GBN or Selective Repeat?

                                                          Basic TCP looks a lot like GBN

                                                          Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                          This looks a lot like Selective Repeat

                                                          TCP is a hybrid


                                                          TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer

app process may be slow at reading from buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
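The window computation and the sender-side limit can be sketched as two small functions. The function names, and the sender-side variables `last_byte_sent` and `last_byte_acked`, are assumptions introduced for illustration; only `RcvWindow`, `RcvBuffer`, `LastByteRcvd`, and `LastByteRead` appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(last_byte_sent, last_byte_acked, nbytes, window):
    """Sender-side check: keep unACKed data within the advertised window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                               # 2096
print(can_send(5000, 4000, 1000, window))   # True: 2000 bytes in flight fits
print(can_send(5000, 4000, 1200, window))   # False: 2200 bytes would overflow
```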

Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed all packets in its buffer
The sender does not transmit new packets until it hears RcvWindow > 0
The receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender

Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)

client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);

server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data


                                                          TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
- allocates buffers
- can include application data
- SYN=0 signals that the connection is established
- server_isn+1 signals that it is synchronized

client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
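The numbering in the three handshake segments can be traced with a small sketch. The `Segment` class below is a made-up stand-in rather than a real TCP or `java.net` type, the initial sequence numbers are arbitrary, and sequence-number wraparound is ignored.

```java
class HandshakeSketch {
    // Minimal stand-in for a TCP control segment (illustrative, not a real API).
    static final class Segment {
        final boolean syn;
        final long seq;
        final long ack; // -1 means "no ACK field set"

        Segment(boolean syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " seq=" + seq + (ack >= 0 ? " ack=" + ack : "");
        }
    }

    // Step 1: client sends SYN carrying client_isn.
    static Segment connectionRequest(long clientIsn) {
        return new Segment(true, clientIsn, -1);
    }

    // Step 2: server replies with SYNACK: its own server_isn, ack = client_isn + 1.
    static Segment connectionGranted(Segment request, long serverIsn) {
        return new Segment(true, serverIsn, request.seq + 1);
    }

    // Step 3: client replies with SYN=0, seq = client_isn + 1, ack = server_isn + 1.
    static Segment finalAck(Segment granted) {
        return new Segment(false, granted.ack, granted.seq + 1);
    }

    public static void main(String[] args) {
        Segment s1 = connectionRequest(1000);     // client_isn = 1000 (arbitrary)
        Segment s2 = connectionGranted(s1, 5000); // server_isn = 5000 (arbitrary)
        Segment s3 = finalAck(s2);
        System.out.println(s1);
        System.out.println(s2);
        System.out.println(s3);
    }
}
```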


                                                          TCP Connection Management (cont)

                                                          Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline for closing a connection. Labels: FIN, ACK, FIN, ACK; close (client), close (server), timed wait, closed.]


                                                          TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
- enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
- closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline for closing a connection. Labels: FIN, ACK, FIN, ACK; closing (client), closing (server), timed wait, closed (client), closed (server).]


                                                          TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                          A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.


                                                          Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for network to handle"
- different from flow control!
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem!


Causes/costs of congestion: scenario 1
- two senders, two receivers
- one router, infinite buffers
- no retransmission
- send rate: 0 to C/2
- large delays when congested
- maximum achievable throughput


Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet


(a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda'_{in}$ denotes the offered load: original data plus retransmissions.

"costs" of congestion:
- (b) and (c): more work (retrans) for given "goodput"
- (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, labeled (a), (b) and (c)]


Causes/costs of congestion: scenario 3
- four senders
- multihop paths
- timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                          Approaches towards congestion control

                                                          Two broad approaches towards congestion control

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
- "elastic service"
- if sender's path "underloaded": sender should use available bandwidth
- if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next page


Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
- EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control
- end-end control (no network assistance)
- transmission rate limited by congestion window size, Congwin, over segments
- Congwin dynamically modified to reflect perceived congestion
- w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT bytes/sec


To simplify the presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after loss event

three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 and then grows exponentially
- conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
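The two AIMD rules can be sketched as a pair of update functions. This is a minimal illustration rather than real TCP code: the MSS value, the starting window and the loss pattern in `main` are invented, and the floor of 1 MSS follows the later remark that CongWin will not drop below 1 MSS.

```java
class AimdSketch {
    // One RTT without a loss event: additive increase by 1 MSS.
    static int onRttWithoutLoss(int congWin, int mss) {
        return congWin + mss;
    }

    // Loss event: multiplicative decrease, cut CongWin in half (floor of 1 MSS).
    static int onLossEvent(int congWin, int mss) {
        return Math.max(congWin / 2, mss);
    }

    public static void main(String[] args) {
        int mss = 1000;          // bytes, assumed for the example
        int congWin = 8 * mss;
        for (int rtt = 1; rtt <= 24; rtt++) {
            boolean loss = (rtt % 8 == 0); // made-up loss pattern to show the sawtooth
            congWin = loss ? onLossEvent(congWin, mss) : onRttWithoutLoss(congWin, mss);
            System.out.println("RTT " + rtt + ": CongWin = " + congWin + " bytes");
        }
    }
}
```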


                                                          TCP Slow Start

When connection begins, CongWin = 1 MSS
- Example: MSS = 500 bytes & RTT = 200 msec
- initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
- desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event.
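The 20 kbps figure follows from sending one window per RTT, that is, rate = w * MSS / RTT with w = 1. A small sketch of that arithmetic (the method name is made up; the numbers are the slide's):

```java
class WindowRateSketch {
    // Rate in bits per second when w segments of mssBytes each are sent per RTT.
    static double rateBitsPerSecond(int w, int mssBytes, double rttSeconds) {
        return w * mssBytes * 8.0 / rttSeconds;
    }

    public static void main(String[] args) {
        // MSS = 500 bytes, RTT = 200 msec, CongWin = 1 MSS
        double bps = rateBitsPerSecond(1, 500, 0.200);
        System.out.println("initial rate ~ " + bps / 1000.0 + " kbps");
    }
}
```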


                                                          TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
- double CongWin every RTT
- done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast.

[Figure: Host A / Host B timeline over successive RTTs: one segment, then two segments, then four segments]
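Why one increment per ACK doubles the window every RTT can be seen by counting. The sketch below measures CongWin in segments and assumes every segment sent in a round is ACKed within that round, with no loss; the names are illustrative only.

```java
import java.util.Arrays;

class SlowStartSketch {
    // Returns CongWin (in segments) at the start of each RTT round.
    static int[] windowPerRound(int rounds) {
        int[] trace = new int[rounds];
        int congWin = 1; // connection begins with CongWin = 1 MSS
        for (int r = 0; r < rounds; r++) {
            trace[r] = congWin;
            int acksThisRound = congWin; // one ACK per segment sent (assumed)
            for (int a = 0; a < acksThisRound; a++) {
                congWin += 1;            // increment CongWin for every ACK received
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(windowPerRound(4))); // [1, 2, 4, 8]
    }
}
```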


So far:
- Slow-Start ramps up exponentially
- followed by AIMD sawtooth pattern

Reality (TCP Reno):
- introduce new variable threshold
- threshold initially very large
- Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
- two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


                                                          Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS. (TCP Tahoe does this for 3 dup ACKs as well.)


                                                          The Big Picture


TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
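The event/action table maps naturally onto a small state machine. The following is a sketch of that mapping only, not an implementation of any real TCP stack: the class, field and method names are invented, windows are kept as doubles in bytes, and resetting the duplicate-ACK counter on a new ACK is an assumption the table does not spell out.

```java
class RenoSenderSketch {
    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    final double mss;
    double congWin;
    double threshold;
    State state = State.SLOW_START;
    int dupAckCount = 0;

    RenoSenderSketch(double mss, double initialThreshold) {
        this.mss = mss;
        this.congWin = mss;            // connection begins with CongWin = 1 MSS
        this.threshold = initialThreshold;
    }

    // ACK receipt for previously unacked data.
    void onNewAck() {
        dupAckCount = 0;               // assumption: a new ACK clears the duplicate count
        if (state == State.SLOW_START) {
            congWin += mss;
            if (congWin > threshold) state = State.CONGESTION_AVOIDANCE;
        } else {
            congWin += mss * (mss / congWin); // about +1 MSS per RTT
        }
    }

    // Duplicate ACK; the third one counts as a loss event.
    void onDuplicateAck() {
        dupAckCount++;
        if (dupAckCount == 3) {
            threshold = congWin / 2;
            congWin = Math.max(threshold, mss); // will not drop below 1 MSS
            state = State.CONGESTION_AVOIDANCE;
            dupAckCount = 0;
        }
    }

    void onTimeout() {
        threshold = congWin / 2;
        congWin = mss;
        state = State.SLOW_START;
        dupAckCount = 0;
    }

    public static void main(String[] args) {
        RenoSenderSketch s = new RenoSenderSketch(1000, 4000);
        for (int i = 0; i < 4; i++) s.onNewAck();
        System.out.println(s.state + " CongWin=" + s.congWin);
        for (int i = 0; i < 3; i++) s.onDuplicateAck();
        System.out.println(s.state + " CongWin=" + s.congWin + " Threshold=" + s.threshold);
        s.onTimeout();
        System.out.println(s.state + " CongWin=" + s.congWin + " Threshold=" + s.threshold);
    }
}
```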


                                                          TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
- Ignore slow start.
- Let W be the window size when loss occurs.
- When window is W, throughput is W/RTT.
- Just after loss, window drops to W/2, throughput to W/(2 RTT).
- Average throughput: 0.75 W/RTT
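The 0.75 factor is the mean of a window that climbs linearly from W/2 back to W. A quick numerical check, assuming the window (in segments) grows by one per RTT and W is even:

```java
class AverageThroughputSketch {
    // Average window over one sawtooth cycle, from W/2 up to W in steps of 1 per RTT.
    static double averageWindow(int w) {
        long sum = 0;
        int samples = 0;
        for (int win = w / 2; win <= w; win++) {
            sum += win;
            samples++;
        }
        return (double) sum / samples;
    }

    public static void main(String[] args) {
        int w = 100; // arbitrary peak window for the example
        System.out.println("average window / W = " + averageWindow(w) / w);
    }
}
```

Dividing the average window by RTT then gives the average throughput of 0.75 W/RTT.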


                                                          TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

  $$\frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

- This gives $L = 2 \cdot 10^{-10}$. Wow!
- New versions of TCP for high speed needed!
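Both numbers can be reproduced from the example's inputs. The sketch below computes W from throughput = W * MSS / RTT and solves throughput = 1.22 * MSS / (RTT * sqrt(L)) for L; the method names are illustrative.

```java
class HighSpeedTcpSketch {
    // Window (in segments) needed to sustain targetBps with the given RTT and MSS.
    static double requiredWindowSegments(double targetBps, double rttSeconds, int mssBytes) {
        return targetBps * rttSeconds / (mssBytes * 8.0);
    }

    // Loss rate L obtained by inverting throughput = 1.22 * MSS / (RTT * sqrt(L)).
    static double requiredLossRate(double targetBps, double rttSeconds, int mssBytes) {
        double sqrtL = 1.22 * mssBytes * 8.0 / (rttSeconds * targetBps);
        return sqrtL * sqrtL;
    }

    public static void main(String[] args) {
        double target = 10e9; // 10 Gbps
        double rtt = 0.100;   // 100 ms
        int mss = 1500;       // bytes
        System.out.println("W ~ " + requiredWindowSegments(target, rtt, mss) + " segments");
        System.out.println("L ~ " + requiredLossRate(target, rtt, mss));
    }
}
```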


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
- additive increase gives slope of 1, as throughput increases
- multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
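The convergence toward the equal-share line can be checked with a two-flow simulation. This is a sketch under strong assumptions: both flows have the same RTT, both add one unit per round, and both see a loss (and halve) in the same round whenever their combined rate exceeds R. The starting rates are arbitrary.

```java
class FairnessSketch {
    // Simulates two AIMD flows sharing a link of capacity r for the given number of rounds.
    static double[] simulate(double x1, double x2, double r, int rounds) {
        for (int i = 0; i < rounds; i++) {
            if (x1 + x2 > r) {
                x1 /= 2;  // loss: decrease window by factor of 2
                x2 /= 2;
            } else {
                x1 += 1;  // congestion avoidance: additive increase
                x2 += 1;
            }
        }
        return new double[] { x1, x2 };
    }

    public static void main(String[] args) {
        double[] rates = simulate(10, 80, 100, 2000); // unequal start, R = 100
        System.out.println("connection 1: " + rates[0] + ", connection 2: " + rates[1]);
    }
}
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts the gap in half, so the rates approach each other over repeated loss events.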


Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
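The arithmetic behind the example, assuming every TCP connection on the link ends up with an equal share of R (the method name is made up):

```java
class ParallelConnectionsSketch {
    // Fraction of the link rate R obtained by an app that opens `mine` connections
    // while `others` connections already share the link.
    static double shareOfR(int mine, int others) {
        return (double) mine / (mine + others);
    }

    public static void main(String[] args) {
        System.out.println("1 connection:   " + shareOfR(1, 9) + " R");
        System.out.println("11 connections: " + shareOfR(11, 9) + " R");
    }
}
```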


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                          Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                          Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
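The two fixed-window cases can be folded into one function. This is a sketch only: K is taken here to be the number of windows that cover the object, computed as ceil(O / (W * S)), which is an assumption about how K is meant in the formula, and the numbers in `main` are invented so that S/R = 1 s.

```java
class FixedWindowLatencySketch {
    // o: object size (bits), s: MSS (bits), r: link rate (bits/sec),
    // w: window size (segments), rtt: round-trip time (sec).
    static double latency(double o, double s, double r, int w, double rtt) {
        double base = 2 * rtt + o / r;
        double stall = s / r + rtt - w * s / r; // server idle time after each window
        if (stall <= 0) {
            return base;                        // case 1: ACK returns before window is sent
        }
        long k = (long) Math.ceil(o / (w * s)); // assumed: number of windows covering the object
        return base + (k - 1) * stall;          // case 2: server stalls between windows
    }

    public static void main(String[] args) {
        // S = 1000 bits, R = 1000 bits/sec, RTT = 2 sec, O = 8000 bits (all made up)
        System.out.println("W = 4: " + latency(8000, 1000, 1000, 4, 2.0) + " sec");
        System.out.println("W = 2: " + latency(8000, 1000, 1000, 2, 2.0) + " sec");
    }
}
```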


                                                          TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


                                                          TCP Latency Modeling Slow Start (2)

[Figure: timing diagram, time at client versus time at server: initiate TCP connection, request object (one RTT each); first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; complete transmission, object delivered]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times.
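The example can be reproduced numerically from the closed-form latency, Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R. The slide gives O/S = 15 but not S/R or RTT, so the values in `main` (S/R = 1 s, RTT = 2 s) are assumptions chosen to give Q = 2; the helper names are made up.

```java
class SlowStartLatencySketch {
    // K: number of slow-start windows (1, 2, 4, ... segments) needed to cover the object.
    static int windowsToCover(int segments) {
        int k = 0;
        long covered = 0;
        while (covered < segments) {
            covered += 1L << k; // the (k+1)-th window holds 2^k segments
            k++;
        }
        return k;
    }

    // Q: number of times the server would idle for an infinitely large object,
    // i.e. how many windows k have S/R + RTT - 2^(k-1) * S/R > 0.
    static int idleCountInfiniteObject(double s, double r, double rtt) {
        int q = 0;
        while (s / r + rtt - Math.pow(2, q) * s / r > 0) {
            q++;
        }
        return q;
    }

    // o: object size (bits), s: MSS (bits), r: link rate (bits/sec), rtt: seconds.
    static double latency(double o, double s, double r, double rtt) {
        int k = windowsToCover((int) Math.ceil(o / s));
        int q = idleCountInfiniteObject(s, r, rtt);
        int p = Math.min(q, k - 1);
        return 2 * rtt + o / r + p * (rtt + s / r) - (Math.pow(2, p) - 1) * s / r;
    }

    public static void main(String[] args) {
        // O/S = 15 segments as in the example; S/R = 1 sec and RTT = 2 sec are assumed.
        System.out.println("K = " + windowsToCover(15));
        System.out.println("Q = " + idleCountInfiniteObject(1000, 1000, 2.0));
        System.out.println("latency = " + latency(15000, 1000, 1000, 2.0) + " sec");
    }
}
```

With these inputs the server idles for 2 s after the first window and 1 s after the second, so the total is 2RTT + O/R + 3 s.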


                                                          TCP Latency Modeling (3)

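$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$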

                                                          RTT

                                                          initiate TCPconnection

                                                          requestobject

                                                          first window= SR

                                                          second window= 2SR

                                                          third window= 4SR

                                                          fourth window= 8SR

                                                          completetransmissionobject

                                                          delivered

                                                          time atclient

                                                          time atserver
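As a rough numerical check of the slow-start delay, the Python sketch below sums the idle time after each window, [S/R + RTT - 2^(k-1) S/R]^+, and compares the total with the closed form O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R. The function names and the sample values of S, R and RTT are illustrative assumptions, not taken from the slides; K is passed in as the number of windows that cover the object, and the server can only idle after windows 1 to K-1.

```python
def idle_times(S, R, RTT, K):
    """Idle time after each of the first K-1 windows: [S/R + RTT - 2^(k-1) S/R]^+."""
    return [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)]


def delay_by_summation(O, S, R, RTT, K):
    """delay = O/R + 2 RTT + sum of idle times."""
    return O / R + 2 * RTT + sum(idle_times(S, R, RTT, K))


def delay_closed_form(O, S, R, RTT, P):
    """delay = O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    # Hypothetical numbers, chosen so that O/S = 15 segments and K = 4 windows.
    S, R, RTT = 1000.0, 1000.0, 2.0      # bits, bits/sec, seconds
    O, K = 15 * S, 4
    idles = idle_times(S, R, RTT, K)
    P = sum(1 for t in idles if t > 0)   # number of times the server actually idles
    print("idle times:", idles, "P =", P)
    print("summation:  ", delay_by_summation(O, S, R, RTT, K))
    print("closed form:", delay_closed_form(O, S, R, RTT, P))
```

With these assumed values the server idles after the first two windows only, so P = 2, and both expressions give the same delay.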


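TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.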
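The calculation of K can be sketched in a few lines of Python. The loop looks for the smallest k with S (2^k - 1) >= O, and the second function evaluates ceil(log2(O/S + 1)) as a cross-check. Both function names are made up for this illustration, and the closed form uses floating point, so it is best treated as a sanity check rather than the primary method.

```python
import math


def windows_to_cover(O, S):
    """Smallest k such that S * (2^0 + ... + 2^(k-1)) = S * (2^k - 1) >= O."""
    k = 1
    while S * (2 ** k - 1) < O:
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)); floating point, so use as a cross-check only."""
    return math.ceil(math.log2(O / S + 1))


if __name__ == "__main__":
    # O/S = 15 segments, as in the example with K = 4 windows.
    print(windows_to_cover(15000, 1000), windows_closed_form(15000, 1000))
```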


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
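The three response-time expressions can be compared directly with a small Python sketch. The function names are hypothetical, and the sum of idle times is not modelled here: it is passed in as an `idle` argument that defaults to 0, which is a simplifying assumption. The sample call also assumes 5 Kbytes = 40,000 bits.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M+1) 2 RTT + sum of idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + 3 RTT + sum of idle times."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M/X + 1) 2 RTT + sum of idle times, with M/X an integer."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


if __name__ == "__main__":
    # RTT = 100 msec, O = 5 Kbytes (taken as 40,000 bits), M = 10, X = 5, R = 28 Kbps.
    M, X, O, R, RTT = 10, 5, 40_000, 28_000, 0.1
    print("non-persistent:         ", non_persistent(M, O, R, RTT))
    print("persistent:             ", persistent(M, O, R, RTT))
    print("parallel non-persistent:", parallel_non_persistent(M, X, O, R, RTT))
```

All three share the (M+1)O/R transmission term, so at low R that term dominates and the differences in the RTT terms matter little.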


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                          • Chapter 3 Transport Layer last revised 160305
                                                          • Chapter 3 outline
                                                          • Transport services and protocols
                                                          • Transport vs network layer
                                                          • Transport-layer protocols
                                                          • Chapter 3 outline
                                                          • Multiplexing/demultiplexing
                                                          • Multiplexing/demultiplexing
                                                          • How demultiplexing works
                                                          • Connectionless demultiplexing
                                                          • Connectionless demux (cont)
                                                          • Connection-oriented demux
                                                          • Connection-oriented demux (cont)
                                                          • Connection-oriented demux: Threaded Web Server
                                                          • Chapter 3 outline
                                                          • UDP: User Datagram Protocol [RFC 768]
                                                          • UDP: more
                                                          • UDP checksum
                                                          • Chapter 3 outline
                                                          • Principles of Reliable data transfer
                                                          • Reliable data transfer: getting started
                                                          • Reliable data transfer: getting started
                                                          • Incremental Improvements
                                                          • rdt1.0: reliable transfer over a reliable channel
                                                          • rdt2.0: channel with bit errors
                                                          • rdt2.0: FSM specification
                                                          • rdt2.0: operation with no errors
                                                          • rdt2.0: error scenario
                                                          • rdt2.0 has a fatal flaw
                                                          • rdt2.1: sender, handles garbled ACK/NAKs
                                                          • rdt2.1: receiver, handles garbled ACK/NAKs
                                                          • rdt2.1: discussion
                                                          • rdt2.2: a NAK-free protocol
                                                          • rdt2.2: sender, receiver fragments
                                                          • rdt3.0: channels with errors and loss
                                                          • rdt3.0 sender
                                                          • rdt3.0 in action
                                                          • rdt3.0 in action
                                                          • Performance of rdt3.0
                                                          • rdt3.0: stop-and-wait operation
                                                          • Pipelined protocols
                                                          • Pipelined protocols
                                                          • Pipelining: increased utilization
                                                          • Go-Back-N
                                                          • GBN Sender
                                                          • GBN: sender extended FSM
                                                          • GBN: receiver extended FSM
                                                          • More on receiver
                                                          • GBN in action
                                                          • Selective Repeat
                                                          • Selective repeat: sender, receiver windows
                                                          • Selective repeat
                                                          • Selective repeat in action
                                                          • Selective repeat: dilemma
                                                          • Chapter 3 outline
                                                          • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                          • More TCP Details
                                                          • Even More TCP Details
                                                          • TCP segment structure
                                                          • TCP seq. #'s and ACKs
                                                          • TCP Round Trip Time and Timeout
                                                          • TCP Round Trip Time and Timeout
                                                          • Example RTT estimation
                                                          • TCP Round Trip Time and Timeout
                                                          • Chapter 3 outline
                                                          • TCP reliable data transfer
                                                          • TCP sender events
                                                          • TCP sender (simplified)
                                                          • TCP retransmission scenarios
                                                          • TCP retransmission scenarios (more)
                                                          • TCP ACK generation [RFC 1122, RFC 2581]
                                                          • More on Sender Policies
                                                          • Fast Retransmit
                                                          • Fast retransmit algorithm
                                                          • TCP: GBN or Selective Repeat?
                                                          • Chapter 3 outline
                                                          • TCP Flow Control
                                                          • TCP Flow Control
                                                          • TCP segment structure
                                                          • TCP Flow control: how it works
                                                          • Technical Issue
                                                          • Chapter 3 outline
                                                          • TCP Connection Management
                                                          • TCP Connection Management (cont)
                                                          • TCP Connection Management (cont)
                                                          • TCP Connection Management (cont)
                                                          • TCP Connection Management (cont)
                                                          • A few special cases
                                                          • Chapter 3 outline
                                                          • Principles of Congestion Control
                                                          • Causes/costs of congestion: scenario 1
                                                          • Causes/costs of congestion: scenario 2
                                                          • Causes/costs of congestion: scenario 3
                                                          • Causes/costs of congestion: scenario 3
                                                          • Approaches towards congestion control
                                                          • Case study: ATM ABR congestion control
                                                          • Case study: ATM ABR congestion control
                                                          • Chapter 3 outline
                                                          • TCP Congestion Control
                                                          • TCP AIMD
                                                          • TCP Slow Start
                                                          • TCP Slow Start (more)
                                                          • Summary: TCP Congestion Control
                                                          • The Big Picture
                                                          • TCP sender congestion control
                                                          • TCP throughput
                                                          • TCP Futures
                                                          • TCP Fairness
                                                          • Why is TCP fair?
                                                          • Fairness (more)
                                                          • TCP Latency Modeling
                                                          • Fixed Congestion Window (W)
                                                          • Fixed congestion window (1)
                                                          • Fixed congestion window (2)
                                                          • TCP Latency Modeling: Slow Start (1)
                                                          • TCP Latency Modeling: Slow Start (2)
                                                          • TCP Latency Modeling (3)
                                                          • TCP Latency Modeling (4)
                                                          • HTTP Modeling
                                                          • Chapter 3 Summary


Sender: whenever the sender receives a control message, it sends a packet to the receiver.
  A valid ACK: sends the next packet (if one exists) with a new sequence #.
  A NAK or corrupt response: resends the old packet.

Receiver: sends ACK/NAK to the sender.
  If the received packet is corrupt: send NAK.
  If the received packet is valid and has a different sequence # from the previous packet: send ACK and deliver the new data up.
  If the received packet is valid and has the same sequence # as the previous packet, i.e., is a retransmission (a duplicate): send ACK.

Note: ACK/NAK do not contain a sequence #.
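The sender and receiver rules above can be walked through with a minimal Python sketch. This is only an illustration under stated assumptions: the class names, the `run` driver, and the idea of marking corruption by round number are all hypothetical, and corruption is modelled as a flag rather than detected with a checksum.

```python
class Receiver:
    def __init__(self):
        self.prev_seq = None      # sequence # of the last valid packet seen
        self.delivered = []       # data passed up to the application

    def on_packet(self, seq, data, corrupt=False):
        if corrupt:
            return "NAK"
        if seq != self.prev_seq:  # different sequence #: new data, deliver it
            self.prev_seq = seq
            self.delivered.append(data)
        return "ACK"              # new packet or duplicate: ACK either way


class Sender:
    def __init__(self, messages):
        self.pending = list(messages)
        self.seq = 0

    def current_packet(self):
        return (self.seq, self.pending[0]) if self.pending else None

    def on_response(self, response, corrupt=False):
        if response == "ACK" and not corrupt:
            self.pending.pop(0)   # valid ACK: move on to the next packet
            self.seq ^= 1         # sequence # alternates 0, 1, 0, ...
        return self.current_packet()  # NAK or corrupt response: old packet again


def run(messages, corrupt_data=(), corrupt_resp=()):
    """Exchange packets; rounds listed in corrupt_data / corrupt_resp are garbled."""
    sender, receiver = Sender(messages), Receiver()
    packet, rounds = sender.current_packet(), 0
    while packet is not None:
        seq, data = packet
        response = receiver.on_packet(seq, data, corrupt=rounds in corrupt_data)
        packet = sender.on_response(response, corrupt=rounds in corrupt_resp)
        rounds += 1
    return receiver.delivered, rounds


if __name__ == "__main__":
    # Data garbled in round 0, ACK garbled in round 2.
    print(run(["a", "b", "c"], corrupt_data={0}, corrupt_resp={2}))
```

In the sample run, the garbled ACK in round 2 makes the sender resend packet "b"; the receiver sees the same sequence # again, ACKs it, and does not deliver it twice.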


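rdt2.1: sender, handles garbled ACK/NAKs

State: Wait for call 0 from above
  event:  rdt_send(data)
  action: sndpkt = make_pkt(0, data, checksum)
          udt_send(sndpkt)
  next:   Wait for ACK or NAK 0

State: Wait for ACK or NAK 0
  event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
  action: udt_send(sndpkt)
  next:   Wait for ACK or NAK 0

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
  action: Λ
  next:   Wait for call 1 from above

State: Wait for call 1 from above
  event:  rdt_send(data)
  action: sndpkt = make_pkt(1, data, checksum)
          udt_send(sndpkt)
  next:   Wait for ACK or NAK 1

State: Wait for ACK or NAK 1
  event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
  action: udt_send(sndpkt)
  next:   Wait for ACK or NAK 1

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
  action: Λ
  next:   Wait for call 0 from above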


rdt2.1: receiver, handles garbled ACK/NAKs

State: Wait for 0 from below
  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
  action: extract(rcvpkt, data)
          deliver_data(data)
          sndpkt = make_pkt(ACK, chksum)
          udt_send(sndpkt)
  next:   Wait for 1 from below

  event:  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  action: sndpkt = make_pkt(NAK, chksum)
          udt_send(sndpkt)
  next:   Wait for 0 from below

  event:  rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq1(rcvpkt)
  action: sndpkt = make_pkt(ACK, chksum)
          udt_send(sndpkt)
  next:   Wait for 0 from below

State: Wait for 1 from below
  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
  action: extract(rcvpkt, data)
          deliver_data(data)
          sndpkt = make_pkt(ACK, chksum)
          udt_send(sndpkt)
  next:   Wait for 0 from below

  event:  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  action: sndpkt = make_pkt(NAK, chksum)
          udt_send(sndpkt)
  next:   Wait for 1 from below

  event:  rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq0(rcvpkt)
  action: sndpkt = make_pkt(ACK, chksum)
          udt_send(sndpkt)
  next:   Wait for 1 from below


rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq. #'s (0,1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver can not know if its last ACK/NAK received OK at sender


rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in 2.1, seq #s included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt


rdt2.2: sender, receiver fragments

Sender FSM fragment

State: Wait for call 0 from above
  event:  rdt_send(data)
  action: sndpkt = make_pkt(0, data, checksum)
          udt_send(sndpkt)
  next:   Wait for ACK 0

State: Wait for ACK 0
  event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
  action: udt_send(sndpkt)
  next:   Wait for ACK 0

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
  action: Λ
  next:   (leaves the fragment)

Receiver FSM fragment

Transition into state: Wait for 0 from below
  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
  action: extract(rcvpkt, data)
          deliver_data(data)
          sndpkt = make_pkt(ACK1, chksum)
          udt_send(sndpkt)

State: Wait for 0 from below
  event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
  action: udt_send(sndpkt)
  next:   Wait for 0 from below


rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq. #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer
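A minimal Python sketch of the timeout-and-retransmit approach follows. It is a simplification under stated assumptions: there is no real timer, and a lost data packet or lost ACK is simply treated as a timeout that triggers a retransmission. The function name and the `lost_data` / `lost_acks` sets (transmission attempts, counted from 0, whose packet or ACK is dropped) are hypothetical, and corruption is left out.

```python
def rdt30_transfer(messages, lost_data=(), lost_acks=()):
    """Stop-and-wait with retransmission on timeout over a lossy channel."""
    delivered = []   # data the receiver passes up
    expected = 0     # receiver: seq # it expects next
    seq = 0          # sender: seq # of the current packet
    attempt = 0      # total number of transmissions so far
    for data in messages:
        while True:
            n = attempt
            attempt += 1
            if n in lost_data:
                continue                 # data lost: timer expires, retransmit
            if seq == expected:          # receiver: new packet, deliver it
                delivered.append(data)
                expected ^= 1
            # receiver ACKs the seq # it just received (new or duplicate)
            if n in lost_acks:
                continue                 # ACK lost: timer expires, retransmit
            break                        # ACK for seq received: stop timer
        seq ^= 1                         # next packet uses the other seq #
    return delivered, attempt


if __name__ == "__main__":
    # First data packet lost (attempt 0); ACK for "b" lost (attempt 2).
    print(rdt30_transfer(["a", "b"], lost_data={0}, lost_acks={2}))
```

When the ACK for "b" is lost, the retransmitted packet arrives as a duplicate; the sequence # lets the receiver ACK it again without delivering it twice.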


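rdt3.0 sender

State: Wait for call 0 from above
  event:  rdt_send(data)
  action: sndpkt = make_pkt(0, data, checksum)
          udt_send(sndpkt)
          start_timer
  next:   Wait for ACK0

  event:  rdt_rcv(rcvpkt)
  action: Λ
  next:   Wait for call 0 from above

State: Wait for ACK0
  event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
  action: Λ
  next:   Wait for ACK0

  event:  timeout
  action: udt_send(sndpkt)
          start_timer
  next:   Wait for ACK0

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
  action: stop_timer
  next:   Wait for call 1 from above

State: Wait for call 1 from above
  event:  rdt_send(data)
  action: sndpkt = make_pkt(1, data, checksum)
          udt_send(sndpkt)
          start_timer
  next:   Wait for ACK1

  event:  rdt_rcv(rcvpkt)
  action: Λ
  next:   Wait for call 1 from above

State: Wait for ACK1
  event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,0))
  action: Λ
  next:   Wait for ACK1

  event:  timeout
  action: udt_send(sndpkt)
          start_timer
  next:   Wait for ACK1

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
  action: stop_timer
  next:   Wait for call 0 from above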


rdt3.0 in action

[Figures: two slides of rdt3.0 sender/receiver timelines.]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

$$T_{\text{transmit}} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{\text{sender}} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U sender: utilization – fraction of time sender busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
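The arithmetic on this slide can be reproduced with a short Python sketch. The function names are made up for this illustration; the inputs follow the example (8000-bit packet, 1 Gbps link, RTT of 30 msec from the 15 ms one-way propagation delay), and 1KB is taken as 1000 bytes.

```python
def transmit_time(L_bits, R_bps):
    """T_transmit = L / R, in seconds."""
    return L_bits / R_bps


def sender_utilization(L_bits, R_bps, rtt_s):
    """U_sender = (L/R) / (RTT + L/R): fraction of time the sender is busy."""
    t = transmit_time(L_bits, R_bps)
    return t / (rtt_s + t)


if __name__ == "__main__":
    L_bits, R_bps, rtt_s = 8000, 1e9, 0.030
    t = transmit_time(L_bits, R_bps)
    u = sender_utilization(L_bits, R_bps, rtt_s)
    throughput = (L_bits / 8) / (rtt_s + t)   # bytes per second, one packet per cycle
    print(f"T_transmit = {t * 1e6:.1f} microsec")
    print(f"U_sender   = {u:.5f}")
    print(f"throughput = {throughput / 1000:.1f} kB/sec")
```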

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram. First packet bit transmitted, t = 0; last packet bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R.]

$$U_{\text{sender}} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$


Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
range of sequence numbers must be increased
buffering at sender and/or receiver


Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• Selective Repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight]
t = 0: first packet bit transmitted
t = L/R: last bit of first packet transmitted
first packet bit arrives; last bit of first packet arrives, receiver sends ACK
last bit of 2nd packet arrives, receiver sends ACK
last bit of 3rd packet arrives, receiver sends ACK
t = RTT + L/R: ACK arrives, sender sends next packet

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increases utilization by a factor of 3!
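
The utilization arithmetic for stop-and-wait and for three pipelined packets can be checked with a short Python sketch. The function name `sender_utilization` and its `window` parameter are illustrative choices, not part of the slides; the link numbers are the ones from the example (1 KB taken as 8000 bits, RTT = 30 msec).

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy when `window` packets are sent per cycle."""
    t_transmit = packet_bits / rate_bps      # L / R, in seconds
    busy = window * t_transmit               # time spent transmitting per cycle
    # A sender cannot be more than 100% busy, so the ratio is capped at 1.
    return min(1.0, busy / (rtt_s + t_transmit))

L = 8000       # 1 KB packet, taken as 8000 bits as in the example
R = 1e9        # 1 Gbps link
RTT = 0.030    # 2 x 15 ms propagation delay

stop_and_wait = sender_utilization(L, R, RTT)
pipelined = sender_utilization(L, R, RTT, window=3)
print(f"{stop_and_wait:.5f} {pipelined:.5f}")
```

Larger windows raise utilization in proportion until the pipe is full, which is why the cap at 1.0 is there.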


Go-Back-N: Sender

k-bit seq # in pkt header
"window" of up to N consecutive unACKed pkts allowed
ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
may receive duplicate ACKs (see receiver)
Only one timer, for the oldest unacknowledged pkt
timeout(n): retransmit pkt n and all higher seq # pkts in window
Called a sliding-window protocol


GBN: Sender

rdt_send() called: checks to see if window is full
    No: send out packet
    Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer (or stops it if no packets remain unacknowledged)

Timeout: resends ALL packets that have been sent but not yet acknowledged

This is the only event that triggers a resend


GBN: sender extended FSM (single state: Wait)

initially:
    base = 1
    nextseqnum = 1

rdt_send(data):
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ (no action)
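
The sender FSM can be sketched in Python as a small event-driven class. This is a sketch under assumptions, not the course's reference code: the class name `GBNSender`, the `send` callback standing in for udt_send, and the boolean `timer_running` standing in for a real timer are all hypothetical. It also ignores ACKs outside the current window, a guard the FSM itself does not show.

```python
class GBNSender:
    """Go-Back-N sender; `send(seqnum, data)` is a hypothetical unreliable-channel callback."""

    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send
        self.base = 1                # oldest unACKed sequence number
        self.nextseqnum = 1          # next sequence number to assign
        self.sndpkt = {}             # copies of sent, not-yet-ACKed packets
        self.timer_running = False   # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False             # window full: refuse_data
        self.sndpkt[self.nextseqnum] = data
        self.send(self.nextseqnum, data)
        if self.base == self.nextseqnum:
            self.timer_running = True    # timer covers the oldest unACKed pkt
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Assumption: ACKs that acknowledge nothing in the window are ignored.
        if not (self.base <= acknum < self.nextseqnum):
            return
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)   # cumulative ACK frees all packets up to acknum
        self.base = acknum + 1
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(seq, self.sndpkt[seq])   # go back N: resend everything unACKed


sent = []
s = GBNSender(window_size=3, send=lambda seq, data: sent.append(seq))
accepted = [s.rdt_send(f"pkt{i}") for i in range(4)]   # fourth call hits a full window
s.on_ack(1)        # cumulative ACK slides the window forward by one
s.on_timeout()     # resends every packet still unACKed (2 and 3)
```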


GBN: receiver extended FSM (single state: Wait)

initially:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

default:
    udt_send(sndpkt)

If expected packet received: send ACK and deliver packet upstairs

If out-of-order packet received:
    discard (don't buffer) -> no receiver buffering!
    re-ACK pkt with highest in-order seq #; may generate duplicate ACKs


More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
Receiver only sends ACKs (no NAKs)
Can generate duplicate ACKs
Need only remember expectedseqnum
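
The receiver side is small enough to sketch in a few lines of Python. The class name `GBNReceiver` and the `corrupt` flag (standing in for a checksum test) are hypothetical. The only state kept is expectedseqnum, as noted above.

```python
class GBNReceiver:
    """Go-Back-N receiver: delivers in-order data, discards everything else."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []      # data passed up to the application

    def rdt_rcv(self, seqnum, data, corrupt=False):
        """Process one arriving packet and return the ACK number to send back."""
        if not corrupt and seqnum == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # In every case, ACK the highest in-order sequence number seen so far
        # (0 before anything has arrived, matching the initial make_pkt(0, ACK, chksum)).
        return self.expectedseqnum - 1


r = GBNReceiver()
acks = [r.rdt_rcv(seq, f"d{seq}") for seq in (1, 2, 4, 5, 3)]
```

Packets 4 and 5 arrive while 3 is still missing, so they are dropped and each one triggers a duplicate ACK for 2.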


GBN in action

                                                            GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data

                                                            Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets


                                                            Selective Repeat

                                                            receiver individually acknowledges all correctly received pkts

                                                            buffers pkts as needed for eventual in-order delivery to upper layer

                                                            sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
Compare to GBN, which only had a timer for the base packet

sender window
N consecutive seq #'s
again limits seq #s of sent, unACKed pkts
Important: window size < seq # range


Selective repeat: sender, receiver windows


Selective repeat

sender

data from above:
    if next available seq # in window, send pkt

timeout(n):
    resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)

otherwise:
    ignore
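
The three receiver cases can be sketched in Python. The class name `SRReceiver` is hypothetical, and the sketch assumes sequence numbers that never wrap around, which sidesteps the finite sequence-number space a real protocol has to handle.

```python
class SRReceiver:
    """Selective Repeat receiver with unbounded (non-wrapping) sequence numbers."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}         # out-of-order packets waiting for the gap to fill
        self.delivered = []      # data passed up in order

    def on_packet(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, advancing the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n             # already delivered: re-ACK so the sender can advance
        return None              # outside both ranges: ignore


r = SRReceiver(window_size=4)
acks = [r.on_packet(n, f"d{n}") for n in (0, 2, 3, 1, 0, 9)]
```

Packets 2 and 3 are buffered until 1 arrives, the second copy of 0 is re-ACKed, and 9 lies outside both ranges and is ignored.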


                                                            Selective repeat in action


Selective repeat: dilemma

Example: seq #'s 0, 1, 2, 3; window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
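
One way to explore the question is to check, for a given sequence-number space and window size, whether a retransmitted old packet can land inside the receiver's new window. The Python sketch below models only the worst case (receiver got a full window, every ACK was lost). The function name `sr_is_ambiguous` is illustrative. The usual textbook answer it reproduces is that the window may be at most half the sequence-number space.

```python
def sr_is_ambiguous(seq_space, window):
    """True if an old retransmission can be mistaken for new data."""
    # Worst case: the receiver accepted a full window and advanced, but all ACKs were lost.
    old_window = {n % seq_space for n in range(0, window)}            # sender may resend these
    new_window = {n % seq_space for n in range(window, 2 * window)}   # receiver accepts these as new
    return bool(old_window & new_window)

# Sequence numbers 0..3, as in the example above.
results = {w: sr_is_ambiguous(4, w) for w in (1, 2, 3)}
print(results)
```

With four sequence numbers, window sizes 1 and 2 are safe, while window size 3 reproduces the dilemma.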


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

point-to-point:
    one sender, one receiver
reliable, in-order byte stream:
    no "message boundaries"
pipelined:
    TCP congestion and flow control set window size
send & receive buffers
full duplex data:
    bi-directional data flow in same connection
    MSS: maximum segment size
connection-oriented:
    handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
flow controlled:
    sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]


More TCP Details

Maximum Segment Size (MSS)
    Depends upon implementation (can often be set)
    The max amount of application-layer data in a segment
    Application Data + TCP Header = TCP Segment

Three-way Handshake
    Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
    Server responds with second special TCP segment (again, no payload)
    Client responds with third special segment. This can contain payload


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
    (i) buffers
    (ii) variables, and
    (iii) a socket connection to the process

TCP only exists in the two end machines
    No buffers or variables are allocated to the connection in any of the network elements between client and server


TCP segment structure

Header layout (each row is 32 bits wide):
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments!)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
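
To make the layout concrete, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. This is a minimal sketch: the function name `parse_tcp_header` and the dictionary keys are illustrative, options are skipped rather than decoded, and the example segment is built by hand rather than captured from a network.

```python
import struct

def parse_tcp_header(segment: bytes):
    """Unpack the fixed 20-byte TCP header; options are skipped, not decoded."""
    (src, dst, seq, ack, off_reserved, flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (off_reserved >> 4) * 4          # head len field counts 32-bit words
    names = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # flag bits, lowest bit first
    set_flags = [name for i, name in enumerate(names) if flags & (1 << i)]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": set_flags,
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],             # application data follows the header
    }

# Hand-built example: SYN+ACK from port 80 to port 12345, no options, no data.
raw = struct.pack("!HHIIBBHHH", 80, 12345, 79, 43, 5 << 4, 0x12, 65535, 0, 0)
hdr = parse_tcp_header(raw)
```

The `!` prefix selects network (big-endian) byte order, which is how the fields travel on the wire.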


TCP seq. #'s and ACKs

Seq. #'s:
    byte stream "number" of first byte in segment's data
ACKs:
    seq # of next byte expected from other side
    cumulative ACK
Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say - up to implementer

Simple telnet scenario (Host A and Host B, time runs downward):
    1. User types 'C'. A -> B: Seq=42, ACK=79, data = 'C'
    2. Host B ACKs receipt of 'C', echoes back 'C'. B -> A: Seq=79, ACK=43, data = 'C'
    3. Host A ACKs receipt of echoed 'C'. A -> B: Seq=43, ACK=80


TCP Round Trip Time and Timeout

Q: how to estimate RTT?
SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
SampleRTT will vary; want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
longer than RTT
    but RTT varies
too short: premature timeout
    unnecessary retransmissions
too long: slow reaction to segment loss


TCP Round Trip Time and Timeout

$$EstimatedRTT = (1 - \alpha) \cdot EstimatedRTT + \alpha \cdot SampleRTT$$

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
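
To see why the influence of past samples decays exponentially, the recurrence can be unrolled. Writing $E_n$ for EstimatedRTT after the $n$-th sample $S_n$ (the subscripts are notation introduced here, not taken from the slides):

$$E_n = (1-\alpha)E_{n-1} + \alpha S_n = \alpha \sum_{k=0}^{n-1} (1-\alpha)^{k} S_{n-k} + (1-\alpha)^{n} E_0$$

A sample that is $k$ updates old therefore carries weight $\alpha(1-\alpha)^{k}$. With $\alpha = 0.125$ the newest sample has weight $0.125$, the one before it $0.125 \cdot 0.875 \approx 0.109$, and so on down geometrically.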


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT]


TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
first estimate how much SampleRTT deviates from EstimatedRTT:

$$DevRTT = (1 - \beta) \cdot DevRTT + \beta \cdot |SampleRTT - EstimatedRTT|$$

(typically, β = 0.25)

Then set timeout interval:

$$TimeoutInterval = EstimatedRTT + 4 \cdot DevRTT$$
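
The EstimatedRTT, DevRTT and TimeoutInterval formulas fit together as in the Python sketch below. Two details are assumptions, since the slides do not specify them: the estimate is seeded with the first sample and a zero deviation, and DevRTT is computed from the already-updated EstimatedRTT. The class name `RttEstimator` is hypothetical.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval (all in msec)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha, self.beta = alpha, beta
        # Assumption: seed with the first sample and no measured deviation yet.
        self.estimated_rtt = first_sample
        self.dev_rtt = 0.0

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        # Assumption: the deviation uses the EstimatedRTT that was just updated.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)
timeout = est.update(180.0)   # one slow sample widens the safety margin
```

After the 180 msec sample, EstimatedRTT moves only from 100 to 110, but the deviation term pushes the timeout well above it.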


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate acks

Initially consider simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control


TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unACKed segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

ACK rcvd:
    If it acknowledges previously unACKed segments
        update what is known to be ACKed
        start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ACKed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
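
The simplified sender can be sketched as a runnable Python class. This is a sketch under assumptions: the class name `SimpleTcpSender`, the `send(seq, data)` callback standing in for "pass segment to IP", and the boolean `timer_running` in place of a real timer are hypothetical, and duplicate ACKs, flow control and congestion control are left out, as in the pseudocode.

```python
class SimpleTcpSender:
    """Simplified TCP sender: one timer, cumulative ACKs, no duplicate-ACK handling."""

    def __init__(self, initial_seq, send):
        self.send = send                 # hypothetical callback: send(seq, data)
        self.next_seq_num = initial_seq
        self.send_base = initial_seq
        self.unacked = {}                # seq -> data for segments not yet ACKed
        self.timer_running = False

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True    # timer covers the oldest unACKed segment
        self.send(seq, data)
        self.next_seq_num += len(data)   # sequence numbers count bytes, not segments

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)      # only the smallest unACKed sequence number
            self.send(seq, self.unacked[seq])
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Drop every segment that lies entirely below the cumulative ACK.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


log = []
tcp = SimpleTcpSender(92, lambda seq, data: log.append((seq, len(data))))
tcp.on_app_data(b"x" * 8)     # Seq=92, 8 bytes
tcp.on_app_data(b"y" * 20)    # Seq=100, 20 bytes
tcp.on_timeout()              # retransmits only the oldest segment (Seq=92)
tcp.on_ack(120)               # cumulative ACK covers both segments
```

Unlike Go-Back-N, the timeout resends a single segment rather than the whole window.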


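TCP: retransmission scenarios

[Figures: two Host A / Host B timelines, time runs downward]

Lost ACK scenario:
    A -> B: Seq=92, 8 bytes data
    B -> A: ACK=100 (lost, marked X)
    Seq=92 timeout at A
    A -> B: Seq=92, 8 bytes data (retransmission)
    B -> A: ACK=100
    SendBase = 100

Premature timeout scenario:
    A -> B: Seq=92, 8 bytes data
    A -> B: Seq=100, 20 bytes data
    B -> A: ACK=100
    B -> A: ACK=120
    Seq=92 timeout at A, before the ACKs arrive
    A -> B: Seq=92, 8 bytes data (retransmission)
    ACKs arrive at A: SendBase = 100, then SendBase = 120
    B -> A: ACK=120
    SendBase = 120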


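TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline, time runs downward]

Cumulative ACK scenario:
    A -> B: Seq=92, 8 bytes data
    A -> B: Seq=100, 20 bytes data
    B -> A: ACK=100 (lost, marked X)
    B -> A: ACK=120, arrives at A before the timeout
    SendBase = 120, so nothing is retransmitted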


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
   -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

2. Arrival of in-order segment with expected seq #; one other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments

3. Arrival of out-of-order segment with higher-than-expected seq #; gap detected
   -> Immediately send duplicate ACK, indicating seq # of next expected byte

4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap


More on Sender Policies

Doubling the Timeout Interval
Used by most TCP implementations
If a timeout occurs, then after retransmission the Timeout Interval is doubled
Intervals grow exponentially with each consecutive timeout
When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
Limited form of congestion control
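
The doubling rule can be sketched as a small Python function. The function name `next_timeout` and the event labels `"timeout"`, `"new_data"` and `"ack"` are illustrative, and the RTT values in the usage lines are made-up numbers in seconds.

```python
def next_timeout(current_interval, event, estimated_rtt, dev_rtt):
    """Return the Timeout Interval to use after `event`."""
    if event == "timeout":
        return 2 * current_interval              # back off: double on every timeout
    if event in ("new_data", "ack"):
        return estimated_rtt + 4 * dev_rtt       # reset from the RTT estimate
    raise ValueError(f"unknown event: {event}")

interval = 1.0
history = []
for event in ("timeout", "timeout", "timeout", "ack"):
    interval = next_timeout(interval, event, estimated_rtt=0.5, dev_rtt=0.1)
    history.append(interval)
```

Three consecutive timeouts grow the interval from 1 to 8 seconds, and the next ACK brings it back down to the value derived from EstimatedRTT and DevRTT.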


                                                            Fast Retransmit

                                                            Time-out period often relatively long

                                                            long delay before resending lost packet

                                                            Detect lost segments via duplicate ACKs

Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost

fast retransmit: resend segment before timer expires


                                                            Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
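As a rough, runnable counterpart to the pseudocode, the sketch below tracks SendBase and the duplicate-ACK count in a small class. The class name, the next_seq_num field and the retransmitted list are assumptions made for the example; a real sender would actually resend the segment and manage a real timer.

```python
class FastRetransmitSender:
    """Toy sender that only models ACK handling, not real transmission."""

    def __init__(self, send_base, next_seq_num):
        self.send_base = send_base        # oldest not-yet-acknowledged byte
        self.next_seq_num = next_seq_num  # next byte that would be sent
        self.dup_ack_count = 0
        self.timer_running = False
        self.retransmitted = []           # sequence numbers resent early

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance the window, reset the count.
            self.send_base = y
            self.dup_ack_count = 0
            # Timer runs only while some segment is still unacknowledged.
            self.timer_running = self.send_base < self.next_seq_num
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_ack_count += 1
            if self.dup_ack_count == 3:
                # Fast retransmit: resend before the timer expires.
                self.retransmitted.append(y)


sender = FastRetransmitSender(send_base=100, next_seq_num=500)
sender.on_ack(200)            # new ACK
for _ in range(3):
    sender.on_ack(200)        # three duplicates trigger the resend
print(sender.retransmitted)
```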


TCP: GBN or Selective Repeat?

                                                            Basic TCP looks a lot like GBN

                                                            Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                            This looks a lot like Selective Repeat

                                                            TCP is a hybrid


                                                            TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network
(But both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer

app process may be slow at reading from buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

Header rows, each 32 bits wide:
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Notes on the fields:
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
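To make the layout concrete, here is a small sketch that unpacks the fixed 20-byte part of a TCP header with Python's struct module. The function name and the returned dictionary keys are choices made for this example, and options are ignored; the bit positions follow the standard TCP header, where the low six bits of the fourth 16-bit word hold URG, ACK, PSH, RST, SYN and FIN.

```python
import struct

# Flag masks within the 16-bit "head len / not used / flags" field.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (options are not parsed)."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # Header length is stored in 32-bit words in the top 4 bits.
        "header_len_bytes": (offset_and_flags >> 12) * 4,
        "flags": {name for name, bit in FLAG_BITS.items()
                  if offset_and_flags & bit},
        "receive_window": window,
        "checksum": checksum,
        "urgent_pointer": urg_ptr,
    }


# Hand-built SYN segment header with made-up port and sequence values.
sample = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0,
                     (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
print(parse_tcp_header(sample))
```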


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
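The two rules on this slide translate almost directly into code. The sketch below is illustrative only; the function names and the sample byte counts are assumptions, while the variable names mirror the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, n_bytes, advertised_window):
    """True if sending n_bytes more keeps unACKed data within RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_bytes <= advertised_window


# Made-up numbers: 4096-byte buffer, 2000 bytes received but not yet read.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)
print(sender_may_send(5000, 4000, 1000, window))
print(sender_may_send(5000, 4000, 1200, window))
```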


                                                            Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
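A toy model can show why the one-byte probes break the deadlock. It follows the simplified picture on this slide, in which the receiver reports RcvWindow only inside ACKs; the function name, the tick-based clock and the 1024-byte window are all assumptions for illustration.

```python
def tick_sender_learns_window_open(app_reads_at_tick, sender_probes, max_ticks=20):
    """Return the tick at which the sender sees RcvWindow > 0, or None if it never does."""
    rcv_window = 0  # receive buffer starts full
    for tick in range(1, max_ticks + 1):
        if tick >= app_reads_at_tick:
            rcv_window = 1024  # the application drained the buffer
        if sender_probes:
            # Each one-byte probe is ACKed, and the ACK carries RcvWindow.
            advertised = rcv_window
            if advertised > 0:
                return tick
        # Without probes there is nothing to ACK, so no update reaches the sender.
    return None


print(tick_sender_learns_window_open(app_reads_at_tick=5, sender_probes=True))
print(tick_sender_learns_window_open(app_reads_at_tick=5, sender_probes=False))
```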


                                                            Note on UDP

                                                            UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


                                                            TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, port number);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data


                                                            TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that it is synchronized

[Figure: three-way handshake]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
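The three segments of the handshake can be written out as plain data to check the sequence and acknowledgement arithmetic. The dictionary layout and the function name are invented for this sketch, and the initial sequence numbers in the usage lines are arbitrary.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as simple dictionaries."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": syn["seq"] + 1}          # acknowledges the client's SYN
    ack = {"from": "client", "SYN": 0, "seq": client_isn + 1,
           "ack": synack["seq"] + 1}          # acknowledges the server's SYN
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```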


                                                            TCP Connection Management (cont)

                                                            Closing a connection

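client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection teardown timeline. Client sends FIN; server replies with ACK, then sends its own FIN; client replies with ACK and enters timed wait before the connection is closed.]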


                                                            TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: the same teardown timeline, with both sides marked closing, the client in timed wait, and both sides finally closed.]
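The four teardown steps can be viewed as two small state machines. The sketch below uses the state names from the TCP specification (FIN_WAIT_1, TIME_WAIT and so on), which these slides do not spell out; the event strings and the helper function are assumptions for the example, and simultaneous close is not modelled.

```python
# (current state, event) -> next state, for the side that closes first.
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",      # Step 1: send FIN
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",    # server ACKed our FIN
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",     # Step 3: send ACK, timed wait
    ("TIME_WAIT", "timer expires"): "CLOSED",
}

# The side that receives the first FIN.
SERVER_TRANSITIONS = {
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",   # Step 2: send ACK
    ("CLOSE_WAIT", "close"): "LAST_ACK",         # Step 2: send FIN
    ("LAST_ACK", "recv ACK"): "CLOSED",          # Step 4
}


def run(transitions, events, state="ESTABLISHED"):
    """Apply events in order and return the final state."""
    for event in events:
        state = transitions[(state, event)]
    return state


print(run(CLIENT_TRANSITIONS, ["close", "recv ACK", "recv FIN", "timer expires"]))
print(run(SERVER_TRANSITIONS, ["recv FIN", "close", "recv ACK"]))
```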


                                                            TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                            A few special cases

                                                            Have not discussed what happens if both client and server decide to close down connection at same time

                                                            It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                            Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queueing in router buffers)
  a top-10 problem!


Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers; sender retransmission of lost packet


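Here $\lambda_{in}$ is the rate of original data sent, $\lambda'_{in}$ the offered load including retransmissions, and $\lambda_{out}$ the goodput.

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three throughput plots, panels (a), (b) and (c)]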


Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                            Approaches towards congestion control

                                                            Two broad approaches towards congestion control

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


                                                            Case study ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


                                                            Case study ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, Congwin, over segments
  Congwin dynamically modified to reflect perceived congestion

[Figure: window of Congwin segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
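The sawtooth behaviour of AIMD is easy to reproduce with a few lines of simulation. In this sketch the window is counted in MSS units and loss events are simply supplied as a set of RTT indices; both simplifications, and the function name, are assumptions for illustration.

```python
def aimd_trace(initial_window, loss_rtts, total_rtts):
    """Return CongWin (in MSS) after each RTT under additive increase, multiplicative decrease."""
    cong_win = initial_window
    trace = []
    for rtt in range(total_rtts):
        if rtt in loss_rtts:
            cong_win = max(1, cong_win / 2)   # multiplicative decrease
        else:
            cong_win += 1                     # additive increase: +1 MSS per RTT
        trace.append(cong_win)
    return trace


# Losses injected at RTT 4 and RTT 9 produce two teeth of the sawtooth.
print(aimd_trace(initial_window=8, loss_rtts={4, 9}, total_rtts=10))
```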


                                                            TCP Slow Start

                                                            When connection begins CongWin = 1 MSS

Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT

                                                            desirable to quickly ramp up to respectable rate

                                                            When connection begins increase rate exponentially fast until first loss event


                                                            TCP Slow Start (more)

                                                            When connection begins increase rate exponentially until first loss event

double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A and Host B timeline; Host A sends one segment, then two segments, then four segments in successive RTTs]
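Two small calculations illustrate slow start: the initial rate of one MSS per RTT, and the doubling that results from adding one MSS per ACK. The function names are assumptions, the window is counted in segments, and every segment in a window is assumed to be ACKed within the same RTT.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Rate when exactly one MSS is sent per RTT, in bits per second."""
    return mss_bytes * 8 / rtt_seconds


def slow_start_windows(rtts):
    """CongWin (in segments) at the start of each RTT during slow start."""
    cong_win = 1
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks_this_rtt = cong_win          # one ACK per segment sent
        for _ in range(acks_this_rtt):
            cong_win += 1                 # +1 MSS for every ACK received
    return windows


print(initial_rate_bps(500, 0.200))   # MSS = 500 bytes, RTT = 200 msec
print(slow_start_windows(4))
```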


So far:
  Slow-Start ramps up exponentially
  Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
  Introduce new variable: threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


Reason for treating 3 dup ACKs differently than a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol TCP Tahoe treats both types of loss events the same and always goes to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


                                                            Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                                                            The Big Picture


TCP sender congestion control

Event | State | TCP Sender Action | Commentary
ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start
Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
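The event table maps naturally onto a small class with one method per event. This is a sketch of the table as given on the slide, not of a production TCP stack; the class name, the default MSS and threshold values, and the choice to fire the triple-duplicate handler from the duplicate-ACK counter are assumptions.

```python
class RenoSender:
    """Tracks CongWin, Threshold and state ("SS" or "CA") in bytes."""

    def __init__(self, mss=1000, threshold=64_000):
        self.mss = mss
        self.cong_win = mss          # slow start begins at 1 MSS
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: roughly 1 MSS per RTT in total.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Duplicate ACK: only the counter changes until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_duplicate_ack()

    def on_triple_duplicate_ack(self):
        """Fast recovery: halve the window, stay in congestion avoidance."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
        self.state = "CA"

    def on_timeout(self):
        """Timeout: back to slow start at 1 MSS."""
        self.dup_acks = 0
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```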


                                                            TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
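The 0.75 factor can be checked with a one-line average. The step below assumes that, between two losses, the window grows linearly from W/2 to W and that RTT stays constant, so the mean throughput is the midpoint of the two extremes.

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
\]
```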


                                                            TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$\text{throughput} = \dfrac{1.22 \cdot MSS}{RTT \sqrt{L}}$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
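The numbers in this example can be reproduced directly. The sketch assumes the relation throughput = 1.22 * MSS / (RTT * sqrt(L)), solved for L, and a window of throughput * RTT / MSS segments; the function names are made up for the illustration.

```python
def window_segments(throughput_bps, rtt_seconds, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_seconds / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_seconds, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_seconds * throughput_bps)) ** 2


# 1500-byte segments, 100 ms RTT, 10 Gbps target.
print(int(window_segments(10e9, 0.100, 1500)))
print(required_loss_rate(10e9, 0.100, 1500))
```

The loss rate comes out at roughly 2 in 10 billion segments, which is why the slide calls for new TCP versions on high-speed paths.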


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
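The convergence toward the equal-share line can be seen in a small simulation. The sketch below is a simplified model, not the slide's derivation: it assumes both sessions have the same RTT, both detect loss in the same round whenever their combined rate exceeds the capacity, and rates are measured in arbitrary units. The function name and starting values are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Return the two rates after `rounds` RTTs of synchronized AIMD."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both sessions halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both add 1, so the gap stays the same.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = aimd_two_flows(80.0, 10.0, capacity=100.0, rounds=1000)
print(f"gap after 1000 rounds: {abs(a - b):.2e}")
```

Additive increase never changes the difference between the two rates and every loss halves it, so the gap shrinks geometrically regardless of the starting point.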

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
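The parallel-connection example is a per-connection split of the link. A minimal sketch, assuming every TCP connection on the link receives an equal share; the function name is hypothetical.

```python
from fractions import Fraction


def app_share(new_conns, existing_conns):
    """Fraction of link rate R obtained by an app opening new_conns connections."""
    return Fraction(new_conns, new_conns + existing_conns)


print(app_share(1, 9))    # one connection next to 9 existing ones
print(app_share(11, 9))   # eleven connections next to 9 existing ones
```

With 11 connections the app holds 11/20 of the link, which the slide rounds to R/2.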

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Sender waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
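Both cases can be folded into one function. The sketch below assumes that K is the number of windows that cover the object and computes it as ceil(O / (W·S)); that expression is an assumption, since these slides do not define K for the fixed-window case. The function and variable names are hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency of one object with a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    base = 2 * RTT + O / R                  # connection setup + request, then transmission
    stall = S / R + RTT - W * S / R         # wait per window before the first ACK returns
    if stall <= 0:
        return base                         # case 1: ACK arrives before the window is sent
    K = math.ceil(O / (W * S))              # assumption: number of windows covering the object
    return base + (K - 1) * stall           # case 2: stall after every window but the last


# Hypothetical numbers: S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(8000, 1000, 1000, 2.0, W=2))
print(fixed_window_latency(8000, 1000, 1000, 2.0, W=4))
```

The boundary case WS/R = RTT + S/R gives a stall of zero, so both formulas agree there.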

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. One RTT to initiate TCP connection, one RTT to request object, then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

TCP Latency Modeling (3)

Time from when server starts to send segment until server receives acknowledgement:

$$\frac{S}{R} + RTT$$

Time to transmit the kth window:

$$2^{k-1}\frac{S}{R}$$

Idle time after the kth window:

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: same timing diagram as on the previous slide, with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]
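The derivation can be cross-checked by stepping through the windows one at a time and adding up the idle periods, then comparing against the closed form. This is a sketch under the slide's assumptions (no loss, no threshold, window doubling every round); the function names and the sample numbers are hypothetical.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (latency, P) by walking through the slow-start windows.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    segments_left = math.ceil(O / S)
    latency = 2 * RTT + O / R               # setup + request, plus transmission time
    idle_periods = 0
    k = 1
    while True:
        window = 2 ** (k - 1)               # segments in the kth window
        segments_left -= window
        if segments_left <= 0:              # last window: the server has nothing to wait for
            break
        idle = S / R + RTT - window * S / R
        if idle > 0:                        # the [.]^+ in the idle-time expression
            latency += idle
            idle_periods += 1
        k += 1
    return latency, idle_periods


def closed_form_latency(O, S, R, RTT, P):
    """Closed form from the derivation, for a given number of idle periods P."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical numbers chosen so that O/S = 15 and RTT = 2 S/R.
latency, P = slow_start_latency(15000, 1000, 1000, 2.0)
print(latency, P, closed_form_latency(15000, 1000, 1000, 2.0, P))
```

With these numbers the walk-through finds two idle periods, in line with the O/S = 15, K = 4, Q = 2 example, and both computations give the same latency.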

TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.
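The first and last lines of this derivation can be compared directly. A minimal sketch: one function searches for the smallest k whose windows cover the object, the other evaluates the closed form. The function names are hypothetical, and the closed form uses floating-point logarithms, so for very large O/S the direct search is the safer of the two.

```python
import math


def windows_by_search(O, S):
    """Smallest k with S * (2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += S * 2 ** k               # add the next window of 2^k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


for segments in (1, 15, 16):
    O = segments * 1000
    print(segments, windows_by_search(O, 1000), windows_closed_form(O, 1000))
```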

HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
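The three formulas differ only in their RTT term, which a short comparison makes visible. The sketch below deliberately omits the "sum of idle times" term, so its results are lower bounds rather than the full response times; the function name, the link rate, and the reading of 5 Kbytes as 40,000 bits are assumptions for illustration.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, without the slow-start idle-time term."""
    transmit = (M + 1) * O / R              # same transmission time in all three cases
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }


# Assumed inputs: O = 5 Kbytes taken as 40,000 bits, R = 1 Mbps, RTT = 100 ms.
for name, seconds in http_response_times(40_000, 1e6, 0.1, M=10, X=5).items():
    print(f"{name}: {seconds:.2f} s")
```

As R shrinks the shared transmission term dominates and the three variants converge; as RTT grows the RTT multipliers (22, 3, and 6 here) separate them.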

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time, 0 to 20 seconds, for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time, 0 to 70 seconds, for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.

Chapter 3: Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"


rdt2.1: sender, handles garbled ACK/NAKs

[FSM with four states]

State "Wait for call 0 from above":
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    next state: "Wait for ACK or NAK 0"

State "Wait for ACK or NAK 0":
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
    next state: unchanged
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    next state: "Wait for call 1 from above"

State "Wait for call 1 from above":
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    next state: "Wait for ACK or NAK 1"

State "Wait for ACK or NAK 1":
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt))
    udt_send(sndpkt)
    next state: unchanged
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    next state: "Wait for call 0 from above"
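The four-state sender can be sketched as a small class. This is an illustrative simplification rather than the slide's exact pseudocode: the packet is a plain (seq, data) tuple standing in for make_pkt(seq, data, checksum), corruption is signalled by a boolean argument instead of a checksum test, and the class and parameter names are hypothetical.

```python
class Rdt21Sender:
    """Sketch of the rdt2.1 sender FSM.

    The four states are encoded as (seq, waiting): "Wait for call <seq> from
    above" when waiting is False, "Wait for ACK or NAK <seq>" when it is True.
    """

    def __init__(self, udt_send):
        self.udt_send = udt_send            # hypothetical unreliable-channel callback
        self.seq = 0
        self.waiting = False
        self.sndpkt = None

    def rdt_send(self, data):
        """Call from above; only valid in a "Wait for call" state."""
        if self.waiting:
            raise RuntimeError("still waiting for ACK or NAK")
        self.sndpkt = (self.seq, data)      # stands in for make_pkt(seq, data, checksum)
        self.udt_send(self.sndpkt)
        self.waiting = True

    def rdt_rcv(self, kind, corrupt=False):
        """Reply from below; kind is "ACK" or "NAK"."""
        if not self.waiting:
            return                          # ignored in this sketch
        if corrupt or kind == "NAK":
            self.udt_send(self.sndpkt)      # retransmit and stay in the same state
        else:
            self.seq ^= 1                   # clean ACK: move on to the other sequence number
            self.waiting = False


sent = []
sender = Rdt21Sender(sent.append)
sender.rdt_send("hello")
sender.rdt_rcv("ACK", corrupt=True)         # garbled reply triggers a retransmission
sender.rdt_rcv("ACK")
sender.rdt_send("world")
print(sent)
```

A garbled reply is handled exactly like a NAK because the sender cannot tell which of the two it was.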

rdt2.1: receiver, handles garbled ACK/NAKs

[FSM with two states]

State "Wait for 0 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    next state: "Wait for 1 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
    next state: unchanged
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    next state: unchanged

State "Wait for 1 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    next state: "Wait for 0 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
    next state: unchanged
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    next state: unchanged
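The receiver side reduces to three branches per state. A minimal sketch, with the same simplifications as any toy model of this protocol: corruption is passed in as a flag rather than detected by a checksum, replies are the bare strings "ACK" and "NAK", and the class and callback names are hypothetical.

```python
class Rdt21Receiver:
    """Sketch of the rdt2.1 receiver FSM; `expected` is the awaited seq number."""

    def __init__(self, udt_send, deliver_data):
        self.udt_send = udt_send            # hypothetical channel back to the sender
        self.deliver_data = deliver_data    # hypothetical upcall to the application
        self.expected = 0

    def rdt_rcv(self, seq, data, corrupt=False):
        if corrupt:
            self.udt_send("NAK")            # garbled packet: ask for a retransmission
        elif seq == self.expected:
            self.deliver_data(data)         # new packet: deliver and acknowledge
            self.udt_send("ACK")
            self.expected ^= 1
        else:
            self.udt_send("ACK")            # duplicate: acknowledge again, do not deliver


replies, delivered = [], []
receiver = Rdt21Receiver(replies.append, delivered.append)
receiver.rdt_rcv(0, "hello")
receiver.rdt_rcv(0, "hello")                # retransmission after a garbled ACK
receiver.rdt_rcv(1, "world", corrupt=True)
receiver.rdt_rcv(1, "world")
print(replies, delivered)
```

The duplicate branch is what makes the sequence number necessary: without it, a retransmission caused by a garbled ACK would be delivered to the application twice.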

rdt2.1: discussion

Sender:
- seq # added to pkt
- two seq #'s (0, 1) will suffice. Why?
- must check if received ACK/NAK corrupted
- twice as many states: state must "remember" whether "current" pkt has seq # 0 or 1

Receiver:
- must check if received packet is duplicate: state indicates whether 0 or 1 is expected pkt seq #
- note: receiver cannot know if its last ACK/NAK was received OK at sender
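The receiver behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Packet` class, its `corrupt` flag (standing in for a failed checksum test), and the string return values "ACK"/"NAK" are hypothetical names, not part of the slides.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int               # alternating sequence number, 0 or 1
    data: bytes
    corrupt: bool = False  # hypothetical stand-in for a failed checksum test


class Rdt21Receiver:
    """Sketch of the rdt2.1 receiver: one state per expected sequence number."""

    def __init__(self):
        self.expected = 0    # "Wait for 0 from below"
        self.delivered = []  # data handed to the upper layer

    def rdt_rcv(self, pkt):
        if pkt.corrupt:
            return "NAK"                      # garbled packet: ask for a resend
        if pkt.seq == self.expected:
            self.delivered.append(pkt.data)   # new data: deliver it
            self.expected ^= 1                # switch to the other state
        # A duplicate is not delivered again, but it is still ACKed,
        # because the sender evidently missed the earlier ACK.
        return "ACK"


receiver = Rdt21Receiver()
replies = [
    receiver.rdt_rcv(Packet(0, b"a")),                # new packet
    receiver.rdt_rcv(Packet(0, b"a")),                # duplicate of packet 0
    receiver.rdt_rcv(Packet(1, b"b", corrupt=True)),  # garbled packet
    receiver.rdt_rcv(Packet(1, b"b")),                # retransmission, now intact
]
```

The single `expected` bit is the whole receiver state, which is why two sequence numbers are enough for a stop-and-wait protocol.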

rdt2.2: a NAK-free protocol

- same functionality as rdt2.1, using ACKs only
- instead of NAK, receiver sends ACK for last pkt received OK
- receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
- duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

Sender FSM fragment

State "Wait for call 0 from above":
- rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
    go to "Wait for ACK 0"

State "Wait for ACK 0":
- rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
    udt_send(sndpkt)
- rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    Λ (no action; leave this state)

Receiver FSM fragment

Transition into "Wait for 0 from below":
- rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

State "Wait for 0 from below":
- rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
    udt_send(sndpkt)

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
- checksum, seq #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
- sender waits until certain data or ACK lost, then retransmits
- yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
- retransmits if no ACK received in this time (retransmissions are only triggered by timeouts)
- if pkt (or ACK) just delayed (not lost): retransmission will be duplicate, but use of seq #'s already handles this
- receiver must specify seq # of pkt being ACKed
- requires countdown timer

rdt3.0 sender

State "Wait for call 0 from above":
- rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer
    go to "Wait for ACK0"
- rdt_rcv(rcvpkt)
    Λ

State "Wait for ACK0":
- rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1))
    Λ
- timeout
    udt_send(sndpkt); start_timer
- rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    go to "Wait for call 1 from above"

State "Wait for call 1 from above":
- rdt_send(data)
    sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer
    go to "Wait for ACK1"
- rdt_rcv(rcvpkt)
    Λ

State "Wait for ACK1":
- rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,0))
    Λ
- timeout
    udt_send(sndpkt); start_timer
- rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    go to "Wait for call 0 from above"

rdt3.0 in action

[Figures: sender/receiver timeline diagrams, two slides]

Performance of rdt3.0

- rdt3.0 works, but performance stinks
- example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

(times in the last fraction are in milliseconds)

- U_sender: utilization, the fraction of time the sender is busy sending
- 1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
- network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

Timeline (sender on the left, receiver on the right):
- first packet bit transmitted, t = 0
- last packet bit transmitted, t = L/R
- first packet bit arrives
- last packet bit arrives, send ACK
- ACK arrives, send next packet, t = RTT + L/R

Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
- range of sequence numbers must be increased
- buffering at sender and/or receiver

Pipelined protocols

- Advantage: much better bandwidth utilization than stop-and-wait
- Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data
- Two generic approaches to solving this:
  - Go-Back-N protocols
  - selective repeat protocols
- Note: TCP is not exactly either

Pipelining: increased utilization

Timeline (sender on the left, receiver on the right):
- first packet bit transmitted, t = 0
- last bit transmitted, t = L/R
- first packet bit arrives
- last packet bit arrives, send ACK
- last bit of 2nd packet arrives, send ACK
- last bit of 3rd packet arrives, send ACK
- ACK arrives, send next packet, t = RTT + L/R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
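The two utilization figures can be checked numerically. The sketch below reuses the example values from the slides (8000-bit packet, 1 Gbps link, 30 ms RTT); the function name `sender_utilization` and its `window` parameter are illustrative choices, not names from the slides.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy when `window` packets are sent per RTT."""
    t_transmit = packet_bits / rate_bps             # L / R, in seconds
    u = window * t_transmit / (rtt_s + t_transmit)
    return min(u, 1.0)                              # the link cannot be more than 100% busy


L_BITS = 8_000   # 1 KB packet, counted as 8 kb as on the slide
R_BPS = 1e9      # 1 Gbps link
RTT_S = 0.030    # 2 x 15 ms propagation delay

stop_and_wait = sender_utilization(L_BITS, R_BPS, RTT_S)
pipelined = sender_utilization(L_BITS, R_BPS, RTT_S, window=3)

print(f"{stop_and_wait:.5f}")  # 0.00027
print(f"{pipelined:.5f}")      # 0.00080
```

Utilization grows linearly with the window until the window is large enough to keep the link full, which is why the result is capped at 1.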

Go-Back-N

Sender:
- k-bit seq # in pkt header
- "window" of up to N consecutive unACKed pkts allowed
- ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK"); sender may receive duplicate ACKs (see receiver)
- only one timer, for the oldest unacknowledged pkt
- timeout(n): retransmit pkt n and all higher seq # pkts in window
- called a sliding-window protocol

GBN: Sender

- rdt_send() called: checks whether the window is full. If not, sends out the packet; if so, returns the data to the application level.
- Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates the window accordingly and restarts the timer.
- Timeout: resends ALL packets that have been sent but not yet acknowledged. This is the only event that triggers a resend.

GBN: sender extended FSM

Initialization:
    base = 1
    nextseqnum = 1

State "Wait":

rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ
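A minimal Python sketch of the sender FSM above follows. Several things are assumptions made for illustration: the class and method names, the `send` callable standing in for udt_send, and the boolean `timer_running` that only records whether the timer would be running (no real clock is involved). It also ignores ACKs older than `base` instead of recomputing `base` from them, a small guard the FSM does not show.

```python
class GbnSender:
    """Go-Back-N sender sketch with a window of N packets and a single timer."""

    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send            # hypothetical stand-in for udt_send
        self.base = 1               # oldest unACKed sequence number
        self.nextseqnum = 1         # next sequence number to use
        self.sndpkt = {}            # copies of sent-but-unACKed packets
        self.timer_running = False

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse_data
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True   # first packet in flight starts the timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        if acknum < self.base:
            return                  # assumption: stale ACK, nothing new acknowledged
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)  # cumulative ACK frees everything up to acknum
        self.base = acknum + 1
        # stop_timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(self.sndpkt[seq])  # go back N: resend every unACKed packet


sent = []
sender = GbnSender(window_size=3, send=sent.append)
accepted = [sender.rdt_send(d) for d in "abcd"]  # fourth call hits a full window
sender.on_timeout()                              # resends packets 1, 2 and 3
sender.on_ack(2)                                 # cumulative ACK slides base to 3
accepted_after_ack = sender.rdt_send("d")        # the window has room again
```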

GBN: receiver extended FSM

Initialization:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

State "Wait":

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

default
    udt_send(sndpkt)

- If expected packet received: send ACK and deliver packet upstairs
- If out-of-order packet received: discard (don't buffer) -> no receiver buffering! Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

- The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
- Receiver only sends ACKs (no NAKs)
- Can generate duplicate ACKs
- Need only remember expectedseqnum
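The receiver rules above fit in a few lines. In this sketch the class name, the `corrupt` argument (a stand-in for a failed checksum test), and the convention of returning the ACK number are illustrative assumptions.

```python
class GbnReceiver:
    """Go-Back-N receiver sketch: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0       # matches the initial make_pkt(0, ACK, chksum)
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)   # in-order packet: deliver upstairs
            self.last_ack = seq
            self.expectedseqnum += 1
        # Anything else is discarded; the last in-order ACK is sent again.
        return self.last_ack


receiver = GbnReceiver()
acks = [
    receiver.rdt_rcv(1, "a"),  # expected packet
    receiver.rdt_rcv(3, "c"),  # out of order: discarded, duplicate ACK 1
    receiver.rdt_rcv(2, "b"),  # expected packet
    receiver.rdt_rcv(3, "c"),  # retransmission, now in order
]
```

The only state carried between packets is `expectedseqnum` (plus the last ACK value), which is the simplicity the slide points out.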

GBN in action

- GBN is easy to code but might have performance problems
- In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of a huge amount of data
- The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets

Selective Repeat

- receiver individually acknowledges all correctly received pkts
  - buffers pkts, as needed, for eventual in-order delivery to upper layer
- sender only resends pkts for which ACK not received
  - sender timer for each unACKed pkt
  - compare to GBN, which only had a timer for the base packet
- sender window
  - N consecutive seq #'s
  - again limits seq #s of sent, unACKed pkts
  - Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: sender and receiver windows]

Selective repeat

Sender:
- data from above: if next available seq # is in window, send pkt
- timeout(n): resend pkt n, restart timer
- ACK(n) in [sendbase, sendbase+N]: mark pkt n as received; if n is the smallest unACKed pkt, advance window base to next unACKed seq #

Receiver:
- pkt n in [rcvbase, rcvbase+N-1]: send ACK(n); if out-of-order, buffer; if in-order, deliver (also deliver buffered, in-order pkts) and advance window to next not-yet-received pkt
- pkt n in [rcvbase-N, rcvbase-1]: ACK(n) (note: this is a re-ACK)
- otherwise: ignore
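The receiver side of these rules can be sketched as follows. To keep the example short it assumes sequence numbers that never wrap around; the class name, the `buffer` dictionary, and the convention of returning the ACKed number (or `None` when the packet is ignored) are illustrative choices.

```python
class SrReceiver:
    """Selective-repeat receiver sketch (assumes sequence numbers do not wrap)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}       # out-of-order packets waiting for a gap to fill
        self.delivered = []

    def rdt_rcv(self, n, data):
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n           # ACK(n)
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n           # already delivered: re-ACK so the sender can advance
        return None            # outside both ranges: ignore


receiver = SrReceiver(window_size=4)
acks = [
    receiver.rdt_rcv(1, "b"),  # out of order: buffered and ACKed
    receiver.rdt_rcv(0, "a"),  # fills the gap: "a" and "b" are delivered
    receiver.rdt_rcv(0, "a"),  # duplicate below the window: re-ACK
    receiver.rdt_rcv(9, "z"),  # far outside the window: ignored
]
```

The re-ACK branch matters because the sender's window cannot move until it hears an ACK for its base packet, even if the receiver delivered that packet long ago.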

Selective repeat in action

[Figure: selective repeat sender/receiver timeline]

Selective repeat: dilemma

Example:
- seq #'s: 0, 1, 2, 3
- window size = 3
- receiver sees no difference in two scenarios!
- incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
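One way to explore the question is to compare the sender's window of packets whose ACKs were all lost with the window the receiver has already advanced to. If the two sets of sequence numbers overlap, a retransmission can be mistaken for new data. The helper below is a hypothetical illustration of that check; the standard answer it supports is that the window may be at most half the sequence-number space.

```python
def receiver_can_confuse(seq_space, window):
    """True if a retransmitted old packet could fall inside the receiver's new window."""
    # Sequence numbers the sender may still retransmit (all of their ACKs were lost).
    old = {i % seq_space for i in range(window)}
    # Receiver got all of them and advanced: it now expects the next `window` numbers.
    new = {(window + i) % seq_space for i in range(window)}
    return bool(old & new)


slide_example = receiver_can_confuse(seq_space=4, window=3)  # {0,1,2} vs {3,0,1}
smaller_window = receiver_can_confuse(seq_space=4, window=2)  # {0,1} vs {2,3}
```

With four sequence numbers, a window of 3 is ambiguous while a window of 2 is not, matching the scenario on the slide.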

Chapter 3 outline

- 3.1 Transport-layer services
- 3.2 Multiplexing and demultiplexing
- 3.3 Connectionless transport: UDP
- 3.4 Principles of reliable data transfer
- 3.5 Connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
- 3.6 Principles of congestion control
- 3.7 TCP congestion control

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

- point-to-point: one sender, one receiver
- reliable, in-order byte stream: no "message boundaries"
- pipelined: TCP congestion and flow control set window size
- send & receive buffers
- full duplex data: bi-directional data flow in same connection; MSS: maximum segment size
- connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange
- flow controlled: sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
- depends upon implementation (can often be set)
- the max amount of application-layer data in a segment
- Application Data + TCP Header = TCP Segment

Three-way handshake
- client sends special TCP segment to server requesting connection; no payload (application data) in this segment
- server responds with second special TCP segment (again, no payload)
- client responds with third special segment; this can contain payload

Even More TCP Details

- A TCP connection between client and server creates, in both client and server: (i) buffers, (ii) variables, and (iii) a socket connection to the process
- TCP only exists in the two end machines: no buffers or variables are allocated to the connection in any of the network elements between the client and server

TCP segment structure

Header layout (each row is 32 bits wide):

    source port #            | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | receive window
    checksum                 | urg data pointer
    options (variable length)
    application data (variable length)

Notes:
- sequence and acknowledgement numbers: counting by bytes of data (not segments!)
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
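The fixed 20-byte part of this layout can be packed and parsed with Python's `struct` module. This is a sketch for illustration only: the helper names `pack_header` and `parse_header` are made up here, options are not handled, and the checksum field is simply left at zero rather than computed.

```python
import struct

# Network byte order: two 16-bit ports, 32-bit seq and ack numbers,
# one byte holding the header length, one byte of flags,
# then 16-bit receive window, checksum and urgent data pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def pack_header(src, dst, seq, ack, flags, window, checksum=0, urg_ptr=0):
    # Header length is counted in 32-bit words and sits in the upper 4 bits.
    offset = (TCP_HEADER.size // 4) << 4
    return TCP_HEADER.pack(src, dst, seq, ack, offset, flags,
                           window, checksum, urg_ptr)


def parse_header(raw):
    (src, dst, seq, ack, offset, flags,
     window, checksum, urg_ptr) = TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (offset >> 4) * 4,   # back to bytes
        "flags": sorted(n for n, bit in FLAG_BITS.items() if flags & bit),
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


raw = pack_header(12345, 80, seq=42, ack=79,
                  flags=FLAG_BITS["ACK"] | FLAG_BITS["PSH"], window=65535)
header = parse_header(raw)
```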

TCP seq. #'s and ACKs

Seq. #'s:
- byte stream "number" of first byte in segment's data

ACKs:
- seq # of next byte expected from other side
- cumulative ACK

Q: how does the receiver handle out-of-order segments?
- A: TCP spec doesn't say; up to implementer

Simple telnet scenario (time runs downward):
1. User types 'C'. Host A -> Host B: Seq=42, ACK=79, data = 'C'
2. Host B ACKs receipt of 'C', echoes back 'C'. Host B -> Host A: Seq=79, ACK=43, data = 'C'
3. Host A ACKs receipt of echoed 'C'. Host A -> Host B: Seq=43, ACK=80
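The numbers in the telnet scenario follow mechanically from byte counting, which the sketch below reproduces. The `Endpoint` class is a hypothetical simplification: segments are plain tuples of (seq, ack, data), only in-order data is accepted, and nothing is ever lost.

```python
class Endpoint:
    """Tracks the two counters that determine a segment's Seq and ACK fields."""

    def __init__(self, next_seq, next_expected):
        self.next_seq = next_seq            # number of the next byte we will send
        self.next_expected = next_expected  # number of the next byte we expect

    def send(self, data=b""):
        segment = (self.next_seq, self.next_expected, data)
        self.next_seq += len(data)          # sequence numbers count bytes
        return segment

    def receive(self, segment):
        seq, _ack, data = segment
        if seq == self.next_expected:       # in-order data only, for simplicity
            self.next_expected += len(data)


host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

seg1 = host_a.send(b"C")   # user types 'C'
host_b.receive(seg1)
seg2 = host_b.send(b"C")   # B ACKs the 'C' and echoes it back
host_a.receive(seg2)
seg3 = host_a.send()       # A ACKs the echo, carrying no data
```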

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
- longer than RTT, but RTT varies
- too short: premature timeout, unnecessary retransmissions
- too long: slow reaction to segment loss

Q: how to estimate RTT?
- SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions
- SampleRTT will vary; want estimated RTT "smoother": average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

$$\text{EstimatedRTT} = (1 - \alpha) \cdot \text{EstimatedRTT} + \alpha \cdot \text{SampleRTT}$$

- exponential weighted moving average
- influence of past sample decreases exponentially fast
- typical value: α = 0.125

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; SampleRTT and EstimatedRTT in milliseconds (axis 100 to 350) plotted against time in seconds (axis 1 to 106)]

TCP Round Trip Time and Timeout

Setting the timeout
- EstimatedRTT plus "safety margin"
  - large variation in EstimatedRTT -> larger safety margin
- first, estimate how much SampleRTT deviates from EstimatedRTT:

$$\text{DevRTT} = (1 - \beta) \cdot \text{DevRTT} + \beta \cdot |\text{SampleRTT} - \text{EstimatedRTT}|$$

(typically, β = 0.25)

Then set timeout interval:

$$\text{TimeoutInterval} = \text{EstimatedRTT} + 4 \cdot \text{DevRTT}$$
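The three formulas can be combined into a small estimator. The sketch below makes two assumptions the slides do not state: DevRTT starts at half the first sample, and DevRTT is updated before EstimatedRTT so that the deviation is measured against the previous estimate (both follow the convention in RFC 6298). The class name and the millisecond sample values are illustrative.

```python
class RttEstimator:
    """EWMA-based RTT estimator sketch; all times share one unit (ms here)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2   # assumption: RFC 6298-style start value

    def update(self, sample_rtt):
        # Deviation first, measured against the estimate before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)
timeout = estimator.update(120.0)
# DevRTT       = 0.75 * 50 + 0.25 * |120 - 100| = 42.5
# EstimatedRTT = 0.875 * 100 + 0.125 * 120      = 102.5
# Timeout      = 102.5 + 4 * 42.5               = 272.5
```

A single slow sample moves EstimatedRTT only slightly, while the 4 x DevRTT term supplies most of the safety margin.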


                                                              3 Transport Layer 68Comp 361 Spring 2005

TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative ACKs
TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate ACKs

Initially consider a simplified TCP sender:
  ignore duplicate ACKs
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unACKed segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unACKed segments:
    update what is known to be ACKed
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }

} /* end of loop forever */

Comment: SendBase-1 is the last cumulatively ACKed byte.
Example: SendBase-1 = 71 and y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed.
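
The simplified sender above can be mirrored almost line for line in Python. The sketch below is illustrative only: the class name, the `wire` list standing in for "pass segment to IP", and the boolean `timer_running` flag are all hypothetical, and stopping the timer once nothing is outstanding is an assumption the pseudocode leaves implicit.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified sender: no dup-ACK handling, no flow/congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq number -> payload bytes, not yet ACKed
        self.timer_running = False  # stands in for a real retransmission timer
        self.wire = []              # (seq, payload) pairs "passed to IP"

    def on_data(self, data):
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout. Retransmit the oldest unACKed segment."""
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)
        self.wire.append((seq, self.unacked[seq]))
        self.timer_running = True   # restart timer

    def on_ack(self, y):
        """Event: ACK received with ACK field value y (cumulative)."""
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte is below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Restart the timer only if something is still outstanding (assumption).
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_data(b"y" * 20)   # Seq=100, 20 bytes
sender.on_timeout()         # retransmits Seq=92
sender.on_ack(120)          # cumulative ACK covers both segments
print(sender.send_base, sender.wire[-1][0], sender.timer_running)
```

Because ACKs are cumulative, the single `on_ack(120)` call clears both segments even though no ACK=100 was processed.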


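TCP retransmission scenarios

[Figure: two Host A / Host B timelines]

Lost ACK scenario:
  Host A sends Seq=92, 8 bytes data
  Host B replies ACK=100; the ACK is lost (X)
  Seq=92 timer times out; Host A retransmits Seq=92, 8 bytes data
  Host B replies ACK=100
  SendBase = 100

Premature timeout:
  Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  Host B replies ACK=100, then ACK=120
  Seq=92 timer times out before the ACKs arrive; Host A retransmits Seq=92, 8 bytes data
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  Host B replies ACK=120 to the retransmission
  SendBase = 120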


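TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline]

Cumulative ACK scenario:
  Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  Host B replies ACK=100, which is lost (X), then ACK=120
  ACK=120 arrives before the timeout, so no retransmission is needed
  SendBase = 120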


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver, followed by TCP receiver action:

Event: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
Action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event: arrival of in-order segment with expected seq #; one other segment has ACK pending
Action: immediately send single cumulative ACK, ACKing both in-order segments

Event: arrival of out-of-order segment with higher-than-expected seq #; gap detected
Action: immediately send duplicate ACK, indicating seq # of next expected byte

Event: arrival of segment that partially or completely fills gap
Action: immediately send ACK, provided that segment starts at lower end of gap
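
The four event/action rows above can be sketched as a small receiver model. This is a hedged illustration, not a real TCP implementation: the class and method names are hypothetical, segments are reduced to (seq, length) pairs, the 500 ms delayed-ACK timer is represented by an explicit `on_delayed_ack_timer` call, and treating an already-received segment as a trigger for a duplicate ACK is an assumption not covered by the table.

```python
class AckingReceiver:
    """Toy receiver that decides which kind of ACK to send for each arriving segment."""

    def __init__(self, expected_seq):
        self.expected = expected_seq  # seq # of next expected byte
        self.ack_pending = False      # True while a delayed ACK is being held back
        self.buffered = {}            # out-of-order segments: seq -> length

    def on_segment(self, seq, length):
        if seq != self.expected:
            # Out-of-order (gap detected) or, by assumption, an old duplicate.
            if seq > self.expected:
                self.buffered[seq] = length
            self.ack_pending = False
            return ("duplicate_ack", self.expected)

        gap_existed = bool(self.buffered)
        self.expected += length
        # Absorb any buffered segments that are now contiguous.
        while self.expected in self.buffered:
            self.expected += self.buffered.pop(self.expected)

        if gap_existed:
            self.ack_pending = False
            return ("immediate_ack", self.expected)
        if self.ack_pending:
            self.ack_pending = False
            return ("immediate_cumulative_ack", self.expected)
        self.ack_pending = True
        return ("delayed_ack", self.expected)

    def on_delayed_ack_timer(self):
        """Called when the (up to 500 ms) wait ends with no further segment."""
        if self.ack_pending:
            self.ack_pending = False
            return ("ack", self.expected)
        return None


rcvr = AckingReceiver(expected_seq=92)
print(rcvr.on_segment(92, 8))    # first in-order segment: ACK is delayed
print(rcvr.on_segment(100, 20))  # second in-order segment: one cumulative ACK
print(rcvr.on_segment(140, 20))  # gap: duplicate ACK for byte 120
print(rcvr.on_segment(120, 20))  # fills the gap: immediate ACK
```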


More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  A limited form of congestion control
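
A minimal sketch of the doubling policy, assuming the interval is otherwise computed as EstimatedRTT + 4 * DevRTT. The class and method names are hypothetical, and the estimates are held fixed here for simplicity.

```python
class RetransmissionTimer:
    """Tracks the current Timeout Interval under the doubling policy."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self._computed_interval()

    def _computed_interval(self):
        # Interval derived from the RTT estimates.
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        # After the retransmission, the interval is doubled.
        self.interval *= 2

    def on_restart(self):
        # Timer restarted by new data from above or by a received ACK:
        # fall back to the value derived from the estimates.
        self.interval = self._computed_interval()


timer = RetransmissionTimer(estimated_rtt=100, dev_rtt=10)  # milliseconds
for _ in range(3):
    timer.on_timeout()
print(timer.interval)   # grown exponentially over three consecutive timeouts
timer.on_restart()
print(timer.interval)   # back to the estimate-based value
```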


Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs:
  sender often sends many segments back-to-back
  if a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires


Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {    /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3) {
      resend segment with sequence number y    /* fast retransmit */
    }
  }
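
The ACK handler above might look like this in Python. It is a sketch under stated assumptions: the class name and the `retransmitted` list are hypothetical, timers are left out, and only ACKs with y equal to SendBase are counted as duplicates (stale ACKs below SendBase are ignored, which the pseudocode does not spell out).

```python
class FastRetransmitSender:
    """Toy ACK handler showing fast retransmit on the third duplicate ACK."""

    def __init__(self, send_base, segments):
        self.send_base = send_base
        self.unacked = dict(segments)  # seq number -> payload bytes
        self.dup_acks = 0              # duplicate ACKs seen for the current SendBase
        self.retransmitted = []        # seq numbers resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance SendBase and reset the counter.
            self.send_base = y
            self.dup_acks = 0
            self.unacked = {s: d for s, d in self.unacked.items()
                            if s + len(d) > y}
        elif y == self.send_base:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.retransmitted.append(y)   # fast retransmit


sender = FastRetransmitSender(
    send_base=100,
    segments={100: b"a" * 20, 120: b"b" * 20, 140: b"c" * 20},
)
for _ in range(3):
    sender.on_ack(100)        # receiver keeps asking for byte 100
print(sender.retransmitted)   # segment 100 is resent before any timeout
```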


TCP: GBN or Selective Repeat?

                                                              Basic TCP looks a lot like GBN

                                                              Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                              This looks a lot like Selective Repeat

                                                              TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

[Figure: TCP segment layout, 32 bits wide]

  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flags U A P R S F, Receive window
  checksum, Urg data pointer
  Options (variable length)
  application data (variable length)

Notes on fields:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
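
As an illustration of the layout above, the fixed 20-byte part of a TCP header can be unpacked with Python's standard `struct` module. The function name and the returned dictionary keys are hypothetical; options and application data are ignored in this sketch.

```python
import struct

# Flag bits in the order U A P R S F, from most to least significant.
FLAG_BITS = (("URG", 0x20), ("ACK", 0x10), ("PSH", 0x08),
             ("RST", 0x04), ("SYN", 0x02), ("FIN", 0x01))


def parse_tcp_header(raw):
    """Parse the first 20 bytes of a TCP segment (network byte order)."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # Header length is stored in 32-bit words in the top 4 bits.
        "header_len_bytes": (offset_byte >> 4) * 4,
        "flags": {name: bool(flag_byte & bit) for name, bit in FLAG_BITS},
        "rcv_window": rcv_window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


# Build a SYN header by hand (made-up values) and parse it back.
raw = struct.pack("!HHIIBBHHH", 12345, 80, 92, 0, 5 << 4, 0x02, 4096, 0, 0)
header = parse_tcp_header(raw)
print(header["seq"], header["flags"]["SYN"], header["rcv_window"])
```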


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
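
The window arithmetic above, as a short Python sketch. The function names are hypothetical, and the sender-side variables LastByteSent and LastByteAcked are assumed to track the unACKed data mentioned on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window):
    """Bytes the sender may still transmit while keeping unACKed data <= RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, window - unacked)


# Receiver: 4096-byte buffer, 3000 bytes received, application has read 1000.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
# Sender: 1000 bytes in flight, so only part of the window is still usable.
print(window, sender_may_send(last_byte_sent=5000, last_byte_acked=4000, window=window))
```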


Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
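
A toy simulation of the probing idea, under simplifying assumptions: time advances in discrete ticks, the sender sends one 1-byte probe per tick, each probe's ACK carries the receiver's current RcvWindow, and the probe byte itself is not counted against the buffer. The function name and the `app_reads` schedule are hypothetical.

```python
def ticks_until_window_opens(rcv_buffer, buffered, app_reads):
    """Return (tick, RcvWindow) for the first probe whose ACK reports RcvWindow > 0.

    app_reads[i] is how many bytes the receiving application reads before
    probe i+1 arrives. Returns None if the window never opens.
    """
    for tick, read in enumerate(app_reads, start=1):
        buffered = max(0, buffered - read)   # application drains the buffer
        window = rcv_buffer - buffered       # value carried in the probe's ACK
        if window > 0:
            return tick, window
    return None


# Buffer starts full; the application only reads before the third probe.
print(ticks_until_window_opens(rcv_buffer=4096, buffered=4096,
                               app_reads=[0, 0, 1024]))
```

Without the probes there would be no segment for the receiver to acknowledge, so the sender would never see the updated window.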


                                                              Note on UDP

                                                              UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

                                                              3 Transport Layer 87Comp 361 Spring 2005

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake between client and server]
  client -> server: connection request (SYN=1, seq=client_isn)
  server -> client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
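
The sequence and acknowledgement numbers exchanged in the three steps can be generated with a short sketch. `Segment` and `three_way_handshake` are hypothetical names; only the SYN flag and the seq/ack fields are modelled, and arithmetic is taken modulo 2^32 because sequence numbers are 32-bit fields.

```python
from typing import NamedTuple, Optional

SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit values


class Segment(NamedTuple):
    syn: int
    seq: int
    ack: Optional[int] = None   # None when the ACK field carries nothing


def three_way_handshake(client_isn, server_isn):
    """Return the three control segments in the order they are sent."""
    # Step 1: client -> server, connection request.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server -> client, connection granted (SYNACK).
    synack = Segment(syn=1, seq=server_isn, ack=(syn.seq + 1) % SEQ_MOD)
    # Step 3: client -> server, ACK of the SYNACK.
    ack = Segment(syn=0, seq=(client_isn + 1) % SEQ_MOD,
                  ack=(synack.seq + 1) % SEQ_MOD)
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```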


TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client / server timeline: client close, FIN; server ACK; server close, FIN; client ACK, timed wait; closed]


TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client / server timeline: client closing, FIN; server ACK; server closing, FIN; client ACK, timed wait, closed; server closed]
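
The four teardown steps can be written as two small state tables. This is a sketch: the state names follow the TCP specification (RFC 793) rather than this slide, the event and action strings are made up for illustration, and simultaneous close is not handled.

```python
# (state, event) -> (next state, segment to send or None)
CLIENT = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "send FIN"),        # Step 1
    ("FIN_WAIT_1", "rcv_ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "rcv_FIN"): ("TIME_WAIT", "send ACK"),            # Step 3
    ("TIME_WAIT", "rcv_FIN"): ("TIME_WAIT", "send ACK"),             # ACK was lost: ACK again
    ("TIME_WAIT", "timed_wait_expired"): ("CLOSED", None),
}
SERVER = {
    ("ESTABLISHED", "rcv_FIN"): ("CLOSE_WAIT", "send ACK"),          # Step 2
    ("CLOSE_WAIT", "app_close"): ("LAST_ACK", "send FIN"),
    ("LAST_ACK", "rcv_ACK"): ("CLOSED", None),                       # Step 4
}


def run(table, state, events):
    """Feed events through a state table; return the final state and segments sent."""
    sent = []
    for event in events:
        state, action = table[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent


print(run(CLIENT, "ESTABLISHED",
          ["app_close", "rcv_ACK", "rcv_FIN", "timed_wait_expired"]))
print(run(SERVER, "ESTABLISHED", ["rcv_FIN", "app_close", "rcv_ACK"]))
```

The extra `("TIME_WAIT", "rcv_FIN")` entry models the reason for the timed wait: if the client's final ACK is lost, the server resends its FIN and the client must still be around to acknowledge it.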


TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                              A few special cases

                                                              Have not discussed what happens if both client and server decide to close down connection at same time

                                                              It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


Notation: λin is the original data rate, λ'in is the offered load (original data plus retransmissions), and λout is the throughput delivered to the receiver.

(a), (b) & (c): always λin = λout (goodput)
(a) magic transmission: only send when there's space in the buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded":
    sender should use available bandwidth
  if sender's path congested:
    sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin spanning a window of segments]

w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 and then grows exponentially
- conservative after timeout events

TCP AIMD

Additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance.

Multiplicative decrease: cut CongWin in half after loss event.

[Figure: congestion window (axis ticks at 8, 16, 24 Kbytes) vs. time for a long-lived TCP connection]
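The AIMD rule can be sketched as a short simulation. This is only an illustration under stated assumptions: the function name `aimd_trace`, the 1 Kbyte MSS, the starting window, and the rule "a loss happens whenever the window reaches `loss_at`" are all hypothetical choices, not part of TCP or of these notes.

```python
def aimd_trace(rounds, mss=1.0, start=12.0, loss_at=24.0):
    """Return CongWin (in Kbytes) at the start of each RTT under AIMD.

    Hypothetical loss model: a loss event occurs whenever the window
    reaches `loss_at`. Real TCP infers loss from timeouts or dup ACKs.
    """
    congwin = start
    trace = []
    for _ in range(rounds):
        trace.append(congwin)
        if congwin >= loss_at:
            congwin = congwin / 2      # multiplicative decrease
        else:
            congwin = congwin + mss    # additive increase: +1 MSS per RTT
    return trace

trace = aimd_trace(30)
print(trace)
```

Plotting `trace` against the round number would give a sawtooth: a linear climb followed by a drop to half the window.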

TCP Slow Start

When connection begins, CongWin = 1 MSS.
- Example: MSS = 500 bytes and RTT = 200 msec, so initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT.
- desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event.
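The 20 kbps figure in the example follows from sending one MSS per RTT:

```latex
\text{initial rate} = \frac{MSS}{RTT}
  = \frac{500 \text{ bytes} \times 8 \text{ bits/byte}}{0.2 \text{ s}}
  = 20{,}000 \text{ bits/s} = 20 \text{ kbps}
```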

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
- double CongWin every RTT
- done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast.

[Figure: Host A to Host B timeline over successive RTTs: one segment, then two segments, then four segments]
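A minimal sketch of why "increment CongWin for every ACK" doubles the window each RTT. It assumes no loss, no threshold, and one ACK per segment; the function name `slow_start_windows` is hypothetical.

```python
def slow_start_windows(num_rtts, mss=1):
    """CongWin (in MSS units) at the start of each RTT during slow start.

    Assumes no loss and no threshold, and that every segment sent in an
    RTT is ACKed individually before the next RTT begins.
    """
    congwin = mss
    windows = []
    for _ in range(num_rtts):
        windows.append(congwin)
        acks = congwin // mss          # one ACK per segment in flight
        for _ in range(acks):
            congwin += mss             # +1 MSS per ACK received
    return windows

print(slow_start_windows(4))  # [1, 2, 4, 8]
```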

So far:
- Slow start ramps up exponentially
- Followed by AIMD sawtooth pattern

Reality (TCP Reno):
- Introduce new variable threshold; threshold initially very large
- Slow-start exponential growth stops when CongWin reaches threshold and then switches to AIMD
- Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to slow start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treats both types of loss events the same and always goes to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
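The table can be read as a small state machine. The sketch below is a toy model under stated assumptions, not a real TCP implementation: the class name `RenoSender`, working in units of one MSS, the finite default threshold (the notes say only "initially very large"), and resetting the duplicate-ACK counter on a new ACK are all illustrative choices.

```python
MSS = 1.0  # work in units of one MSS (assumption for readability)

class RenoSender:
    """Toy model of the sender table: tracks CongWin, Threshold and state."""

    def __init__(self, threshold=64.0):
        self.congwin = MSS
        self.threshold = threshold   # stand-in for "initially very large"
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0            # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += MSS * (MSS / self.congwin)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        """Timeout: fall back to slow start from 1 MSS."""
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0

s = RenoSender(threshold=4.0)
for _ in range(4):
    s.on_new_ack()
after_acks = (s.congwin, s.state)
for _ in range(3):
    s.on_dup_ack()
after_dups = (s.congwin, s.threshold, s.state)
s.on_timeout()
after_timeout = (s.congwin, s.threshold, s.state)
print(after_acks, after_dups, after_timeout)
```

In this run the window grows 1, 2, 3, 4, 5 MSS and leaves slow start once it exceeds the threshold of 4; the triple duplicate ACK halves it to 2.5 MSS, and the timeout resets it to 1 MSS.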

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
- Ignore slow start
- Let W be the window size when loss occurs
- When window is W, throughput is W/RTT
- Just after loss, window drops to W/2, throughput to W/(2·RTT)
- Average throughput: 0.75 W/RTT
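One way to see where the 0.75 factor comes from, assuming the window climbs linearly from W/2 back to W between loss events, so that the average is the midpoint of the two extremes:

```latex
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
```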

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT\,\sqrt{L}}$$

- So L = 2·10^-10. Wow!
- New versions of TCP for high-speed needed
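The two numbers in this example can be checked with a few lines of arithmetic. This is a sketch that assumes throughput = 1.22·MSS/(RTT·√L), with MSS measured in bits; the variable names are illustrative.

```python
mss_bits = 1500 * 8        # 1500-byte segments, in bits
rtt = 0.100                # 100 ms, in seconds
target_bps = 10e9          # 10 Gbps

# Window needed to keep the pipe full: W = throughput * RTT / MSS
window_segments = target_bps * rtt / mss_bits

# Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to solve for L
loss_rate = (1.22 * mss_bits / (rtt * target_bps)) ** 2

print(f"W = {window_segments:.0f} segments")
print(f"L = {loss_rate:.2e}")
```

The computed loss rate is close to 2·10^-10, that is, roughly one loss event per five billion segments.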

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
- additive increase gives slope of 1 as throughput increases
- multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput vs. Connection 1 throughput, each axis running to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
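The convergence argument can be illustrated with a small simulation of two AIMD flows. The model is deliberately simplified and its assumptions are not from the notes: both flows have the same RTT, each adds 1 unit per round, and both see a loss whenever their combined rate exceeds the link capacity. The function name `two_flow_aimd` is hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Simulate two AIMD flows sharing one bottleneck link.

    Simplifying assumptions: equal RTTs, +1 unit per round for each
    flow, and a synchronized loss for both flows in any round where
    their combined rate exceeds `capacity`.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # loss: both halve
        else:
            x1, x2 = x1 + 1, x2 + 1    # congestion avoidance: +1 each
    return x1, x2

start_gap = abs(80.0 - 10.0)
x1, x2 = two_flow_aimd(x1=80.0, x2=10.0, capacity=100.0, rounds=200)
print(start_gap, abs(x1 - x2))
```

Additive increase leaves the gap between the two flows unchanged while each loss halves it, so the rates drift toward the equal-share line.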

Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
- Instead use UDP: pump audio/video at constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2
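The shares in the example can be checked under the idealized assumption that the link is split equally among all TCP connections. The helper name `app_share` is hypothetical.

```python
from fractions import Fraction

def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by a new app, assuming the link
    is split equally among all TCP connections (an idealization)."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)

print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 connections the new app holds 11 of the 20 connections, slightly more than half the link.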

TCP Latency Modeling

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
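The two fixed-window cases can be combined into one function. This is a sketch under stated assumptions: the function name is hypothetical, and K (the number of windows that cover the object) is taken to be ceil(O / (W·S)), which these slides do not spell out for the fixed-window case.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an object with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec).
    Assumption: K, the number of windows covering the object, is taken
    as ceil(O / (W * S)).
    """
    if W * S / R >= RTT + S / R:
        # Case 1: ACKs return before the window is exhausted; no stalls.
        # (At exact equality the stall term below is zero, so both agree.)
        return 2 * RTT + O / R
    # Case 2: server stalls after each of the first K-1 windows
    K = math.ceil(O / (W * S))
    stall = S / R + RTT - W * S / R
    return 2 * RTT + O / R + (K - 1) * stall

# Illustrative numbers: 120,000-bit object, 10,000-bit MSS, 1 Mbps link, 100 ms RTT
small_window = fixed_window_latency(O=120_000, S=10_000, R=1_000_000, RTT=0.1, W=4)
large_window = fixed_window_latency(O=120_000, S=10_000, R=1_000_000, RTT=0.1, W=20)
print(small_window, large_window)
```

With W = 4 the server stalls twice (K = 3), adding 2 × 0.07 s to the base latency of 0.32 s; with W = 20 there are no stalls.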

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline (time at client, time at server). Client initiates TCP connection (1 RTT) and requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission complete, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server slow-start timeline, repeated from the previous slide]

TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.
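The slow-start latency model can be put together in a short function. This is a sketch under stated assumptions: the function name is hypothetical, and since the notes say only that Q is computed "similarly", Q is obtained here by counting the windows k whose idle time S/R + RTT - 2^(k-1)·S/R is positive.

```python
import math

def slow_start_latency(O, S, R, RTT):
    """Latency (seconds) under the slow-start model: no threshold, no loss.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec).
    Returns (K, Q, P, latency).
    """
    # K: number of windows (of 1, 2, 4, ... segments) that cover the object
    K = math.ceil(math.log2(O / S + 1))
    # Q: stalls for an infinite object, i.e. the number of windows k whose
    # idle time S/R + RTT - 2^(k-1) * S/R is positive (here k = Q + 1)
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# Illustrative numbers chosen so that O/S = 15 and S/R < RTT <= 3 S/R
K, Q, P, latency = slow_start_latency(O=150_000, S=10_000, R=100_000, RTT=0.2)
print(K, Q, P, latency)
```

These inputs give K = 4, Q = 2 and P = 2, matching the O/S = 15 example in the slides; the link rate and RTT themselves are made up for illustration.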

HTTP Modeling

Assume Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X integer
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
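The three response-time formulas can be compared numerically. The sketch below deliberately leaves out the "sum of idle times" term, so it shows only the transmission and RTT components; the function name and the 1 Mbps example link are assumptions, and 1 Kbyte is taken as 1000 bytes.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times (seconds) for the three HTTP models, EXCLUDING the
    "sum of idle times" term, which depends on slow-start stalls.

    O: object size (bits), R: link rate (bits/sec), M: number of images,
    X: number of parallel connections (M/X assumed to be an integer).
    """
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }

# O = 5 Kbytes = 40,000 bits (taking 1 Kbyte = 1000 bytes), RTT = 100 msec
times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

Because idle times are left out, these numbers are lower bounds on the modeled response times rather than the modeled values themselves.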

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.

Chapter 3 Summary

Principles behind transport layer services:
- multiplexing, demultiplexing
- reliable data transfer
- flow control
- congestion control

Instantiation and implementation in the Internet:
- UDP
- TCP

Next:
- leaving the network "edge" (application, transport layers)
- into the network "core"

                                                              • Chapter 3 Transport Layer last revised 160305
                                                              • Chapter 3 outline
                                                              • Transport services and protocols
                                                              • Transport vs network layer
                                                              • Transport-layer protocols
                                                              • Chapter 3 outline
                                                              • Multiplexing/demultiplexing
                                                              • Multiplexing/demultiplexing
                                                              • How demultiplexing works
                                                              • Connectionless demultiplexing
                                                              • Connectionless demux (cont)
                                                              • Connection-oriented demux
                                                              • Connection-oriented demux (cont)
                                                              • Connection-oriented demux: Threaded Web Server
                                                              • Chapter 3 outline
                                                              • UDP: User Datagram Protocol [RFC 768]
                                                              • UDP: more
                                                              • UDP checksum
                                                              • Chapter 3 outline
                                                              • Principles of Reliable data transfer
                                                              • Reliable data transfer: getting started
                                                              • Reliable data transfer: getting started
                                                              • Incremental Improvements
                                                              • rdt1.0: reliable transfer over a reliable channel
                                                              • rdt2.0: channel with bit errors
                                                              • rdt2.0: FSM specification
                                                              • rdt2.0: operation with no errors
                                                              • rdt2.0: error scenario
                                                              • rdt2.0 has a fatal flaw
                                                              • rdt2.1: sender, handles garbled ACK/NAKs
                                                              • rdt2.1: receiver, handles garbled ACK/NAKs
                                                              • rdt2.1: discussion
                                                              • rdt2.2: a NAK-free protocol
                                                              • rdt2.2: sender, receiver fragments
                                                              • rdt3.0: channels with errors and loss
                                                              • rdt3.0 sender
                                                              • rdt3.0 in action
                                                              • rdt3.0 in action
                                                              • Performance of rdt3.0
                                                              • rdt3.0: stop-and-wait operation
                                                              • Pipelined protocols
                                                              • Pipelined protocols
                                                              • Pipelining: increased utilization
                                                              • Go-Back-N
                                                              • GBN Sender
                                                              • GBN: sender extended FSM
                                                              • GBN: receiver extended FSM
                                                              • More on receiver
                                                              • GBN in action
                                                              • Selective Repeat
                                                              • Selective repeat: sender, receiver windows
                                                              • Selective repeat
                                                              • Selective repeat in action
                                                              • Selective repeat: dilemma
                                                              • Chapter 3 outline
                                                              • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                              • More TCP Details
                                                              • Even More TCP Details
                                                              • TCP segment structure
                                                              • TCP seq. #'s and ACKs
                                                              • TCP Round Trip Time and Timeout
                                                              • TCP Round Trip Time and Timeout
                                                              • Example RTT estimation
                                                              • TCP Round Trip Time and Timeout
                                                              • Chapter 3 outline
                                                              • TCP reliable data transfer
                                                              • TCP sender events
                                                              • TCP sender(simplified)
                                                              • TCP retransmission scenarios
                                                              • TCP retransmission scenarios (more)
                                                              • TCP ACK generation [RFC 1122 RFC 2581]
                                                              • More on Sender Policies
                                                              • Fast Retransmit
                                                              • Fast retransmit algorithm
                                                              • TCP GBN or Selective Repeat
                                                              • Chapter 3 outline
                                                              • TCP Flow Control
                                                              • TCP Flow Control
                                                              • TCP segment structure
                                                              • TCP Flow control how it works
                                                              • Technical Issue
                                                              • Chapter 3 outline
                                                              • TCP Connection Management
                                                              • TCP Connection Management (cont)
                                                              • TCP Connection Management (cont)
                                                              • TCP Connection Management (cont)
                                                              • TCP Connection Management (cont)
                                                              • A few special cases
                                                              • Chapter 3 outline
                                                              • Principles of Congestion Control
                                                              • Causescosts of congestion scenario 1
                                                              • Causescosts of congestion scenario 2
                                                              • Causescosts of congestion scenario 3
                                                              • Causescosts of congestion scenario 3
                                                              • Approaches towards congestion control
                                                              • Case study ATM ABR congestion control
                                                              • Case study ATM ABR congestion control
                                                              • Chapter 3 outline
                                                              • TCP Congestion Control
                                                              • TCP AIMD
                                                              • TCP Slow Start
                                                              • TCP Slow Start (more)
                                                              • Summary TCP Congestion Control
                                                              • The Big Picture
                                                              • TCP sender congestion control
                                                              • TCP throughput
                                                              • TCP Futures
                                                              • TCP Fairness
                                                              • Why is TCP fair
                                                              • Fairness (more)
                                                              • TCP Latency Modeling
                                                              • Fixed Congestion Window (W)
                                                              • Fixed congestion window (1)
                                                              • Fixed congestion window (2)
                                                              • TCP Latency Modeling Slow Start (1)
                                                              • TCP Latency Modeling Slow Start (2)
                                                              • TCP Latency Modeling (3)
                                                              • TCP Latency Modeling (4)
                                                              • HTTP Modeling
                                                              • Chapter 3 Summary

rdt2.1: receiver, handles garbled ACK/NAKs

Receiver FSM (two states):

State "Wait for 0 from below"
  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
  action: extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
  next:   "Wait for 1 from below"

  event:  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  action: sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  next:   stay in "Wait for 0 from below"

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)   (duplicate packet)
  action: sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
  next:   stay in "Wait for 0 from below"

State "Wait for 1 from below"
  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
  action: extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
  next:   "Wait for 0 from below"

  event:  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  action: sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  next:   stay in "Wait for 1 from below"

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)   (duplicate packet)
  action: sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
  next:   stay in "Wait for 1 from below"

rdt2.1: discussion

Sender:
  seq # added to pkt
  two seq. #'s (0,1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must "remember" whether "current" pkt has 0 or 1 seq. #

Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver can not know if its last ACK/NAK received OK at sender

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed
  (in 2.1, seq #s included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt
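The receiver rule above (ACK the last packet received OK, with its sequence number) can be sketched in a few lines of Python. This is an illustrative sketch, not the slides' own code: the `Packet` class, the `corrupt` flag standing in for a failed checksum, and the initial value of `last_acked` are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int                 # 0 or 1
    data: str
    corrupt: bool = False    # assumed stand-in for a failed checksum


class Rdt22Receiver:
    """NAK-free receiver: always ACKs the last packet received OK."""

    def __init__(self):
        self.expected = 0      # sequence number we are waiting for
        self.last_acked = 1    # assumed: behaves as if pkt 1 was the last one accepted
        self.delivered = []    # data passed up to the application

    def receive(self, pkt):
        """Process one arriving packet and return the seq number to ACK."""
        if not pkt.corrupt and pkt.seq == self.expected:
            self.delivered.append(pkt.data)
            self.last_acked = pkt.seq
            self.expected = 1 - self.expected
        # Corrupt or duplicate packet: repeat the previous ACK. The sender
        # sees a duplicate ACK and treats it like a NAK.
        return self.last_acked


receiver = Rdt22Receiver()
acks = [
    receiver.receive(Packet(0, "a")),                 # new packet 0
    receiver.receive(Packet(1, "b", corrupt=True)),   # garbled packet 1
    receiver.receive(Packet(1, "b")),                 # retransmission of 1
    receiver.receive(Packet(1, "b")),                 # duplicate of 1
]
print(acks, receiver.delivered)
```

The second ACK repeats sequence number 0, which is how the sender learns that packet 1 did not get through.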

rdt2.2: sender, receiver fragments

Sender FSM fragment

State "Wait for call 0 from above"
  event:  rdt_send(data)
  action: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
  next:   "Wait for ACK 0"

State "Wait for ACK 0"
  event:  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
  action: udt_send(sndpkt)
  next:   stay in "Wait for ACK 0"

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
  action: Λ
  next:   leave "Wait for ACK 0"

Receiver FSM fragment

State "Wait for 0 from below"
  entered on:
  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
  action: extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

  event:  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
  action: udt_send(sndpkt)
  next:   stay in "Wait for 0 from below"

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time
  (retransmissions only triggered by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq. #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer

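rdt3.0 sender

Sender FSM (four states):

State "Wait for call 0 from above"
  event:  rdt_send(data)
  action: sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer
  next:   "Wait for ACK0"

  event:  rdt_rcv(rcvpkt)
  action: Λ
  next:   stay

State "Wait for ACK0"
  event:  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
  action: Λ
  next:   stay

  event:  timeout
  action: udt_send(sndpkt); start_timer
  next:   stay

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
  action: stop_timer
  next:   "Wait for call 1 from above"

State "Wait for call 1 from above"
  event:  rdt_send(data)
  action: sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer
  next:   "Wait for ACK1"

  event:  rdt_rcv(rcvpkt)
  action: Λ
  next:   stay

State "Wait for ACK1"
  event:  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
  action: Λ
  next:   stay

  event:  timeout
  action: udt_send(sndpkt); start_timer
  next:   stay

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
  action: stop_timer
  next:   "Wait for call 0 from above"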
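To see the timer-driven retransmission at work, here is a small Python simulation of the rdt3.0 sender. It is a sketch under stated assumptions: the function name `rdt30_send` and the scripted `channel_outcomes` list (with the labels "ok", "timeout" and "bad_ack") are invented for illustration, and the real timer is replaced by the script saying whether a good ACK arrived in time.

```python
def rdt30_send(messages, channel_outcomes):
    """Simulate an rdt3.0-style stop-and-wait sender.

    channel_outcomes is a scripted list, one entry per transmission:
    "ok"      -> the matching ACK comes back before the timer expires
    "timeout" -> data or ACK was lost, so the timer expires
    "bad_ack" -> a corrupt or wrong-numbered ACK arrives (ignored; the
                 sender keeps waiting until the timer expires)
    Returns the list of (seq, data) transmissions actually sent.
    """
    outcomes = iter(channel_outcomes)
    sent = []
    seq = 0
    for data in messages:
        while True:
            sent.append((seq, data))      # udt_send(sndpkt); start_timer
            outcome = next(outcomes)
            if outcome == "ok":           # notcorrupt && isACK(rcvpkt, seq)
                break                     # stop_timer
            # "timeout" or "bad_ack": the only trigger for a retransmission
            # is the timer expiring, so loop around and resend.
        seq = 1 - seq                     # alternate 0, 1, 0, ...
    return sent


log = rdt30_send(["a", "b"], ["timeout", "ok", "bad_ack", "ok"])
print(log)
```

Each message is sent twice in this run: once originally and once after the timer expires, always with the same sequence number so the receiver can detect the duplicate.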

rdt3.0 in action

[Figures on two slides; diagrams not reproduced in this transcript]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

$$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U sender: utilization, the fraction of time sender busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
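The arithmetic on this slide can be checked with a few lines of Python. The function name and variable names are assumptions made for illustration; the numbers are the ones from the example (1 KB taken as 8000 bits, RTT = 2 x 15 ms).

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return t_transmit / (rtt_s + t_transmit)


L = 8000          # 1 KB packet, taken as 8000 bits as on the slide
R = 1e9           # 1 Gbps
RTT = 0.030       # 2 x 15 ms propagation delay

u = stop_and_wait_utilization(L, R, RTT)
throughput_bytes = (L / 8) / (RTT + L / R)   # bytes delivered per second
print(f"U_sender = {u:.5f}, throughput = {throughput_bytes / 1000:.1f} kB/s")
```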

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline]
  sender:   first packet bit transmitted, t = 0
  sender:   last packet bit transmitted, t = L / R
  receiver: first packet bit arrives
  receiver: last packet bit arrives, send ACK
  sender:   ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  Go-Back-N protocols
  selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight]
  sender:   first packet bit transmitted, t = 0
  sender:   last bit transmitted, t = L / R
  receiver: first packet bit arrives
  receiver: last packet bit arrives, send ACK
  receiver: last bit of 2nd packet arrives, send ACK
  receiver: last bit of 3rd packet arrives, send ACK
  sender:   ACK arrives, send next packet, t = RTT + L / R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
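The same formula generalizes to any number of in-flight packets. The sketch below is illustrative only: the function name is an assumption, and the cap at 1.0 is added here to reflect that a sender cannot be busy more than all of the time.

```python
def pipelined_utilization(window, packet_bits, rate_bps, rtt_s):
    """Sender utilization with `window` packets sent per round trip.

    Capped at 1.0: once the window covers the whole round trip the
    sender is busy all the time.
    """
    t_transmit = packet_bits / rate_bps
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


# Same link as the stop-and-wait example: 8000-bit packets, 1 Gbps, 30 ms RTT.
for n in (1, 3, 1000):
    print(n, round(pipelined_utilization(n, 8000, 1e9, 0.030), 5))
```

With these link parameters even a window of 1000 packets keeps the sender busy for only part of each round trip, which is why the window has to grow with the bandwidth-delay product.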

Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)
  Only one timer: for oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol

GBN: Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.
  This is the only event that triggers a resend.

GBN: sender extended FSM

Initially:
  base = 1
  nextseqnum = 1

State "Wait"

  event: rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)

  event: timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer

  event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ
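The extended FSM translates almost line for line into Python. This is a sketch, not the course's reference code: the class name, the `wire` log used in place of a real channel, and the boolean `timer_running` used in place of a real timer are all assumptions.

```python
class GbnSender:
    """Go-Back-N sender following the extended FSM (base, nextseqnum, N)."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq -> data, kept until ACKed
        self.timer_running = False  # assumed stand-in for a real timer
        self.wire = []              # hypothetical log of seq numbers sent

    def rdt_send(self, data):
        """Return True if the packet was sent, False if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False                   # refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        self.wire.append(self.nextseqnum)  # udt_send
        if self.base == self.nextseqnum:
            self.timer_running = True      # start_timer
        self.nextseqnum += 1
        return True

    def timeout(self):
        """Resend every packet that is sent but not yet acknowledged."""
        self.timer_running = True          # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.wire.append(seq)

    def rdt_rcv_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is done."""
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        # max() guards against stale duplicate ACKs moving base backwards;
        # the FSM on the slide assigns base directly.
        self.base = max(self.base, acknum + 1)
        self.timer_running = self.base != self.nextseqnum


sender = GbnSender(window_size=3)
accepted = [sender.rdt_send(d) for d in "abcd"]   # 4th call hits a full window
sender.rdt_rcv_ack(1)                              # packet 1 acknowledged
sender.timeout()                                   # packets 2 and 3 go again
print(accepted, sender.wire, sender.base)
```

After the ACK for packet 1 the window slides to base = 2, so the timeout resends only packets 2 and 3.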

GBN: receiver extended FSM

Initially:
  expectedseqnum = 1
  sndpkt = make_pkt(0,ACK,chksum)

State "Wait"

  event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++

  event: default
    udt_send(sndpkt)

If expected packet received:
  send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends ACK for last correctly received packet with highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  Need only remember expectedseqnum
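A receiver that remembers only expectedseqnum is short enough to write out in full. The sketch below uses assumed names (`GbnReceiver`, `rdt_rcv` taking the sequence number and data directly) and returns the ACK number instead of sending a packet.

```python
class GbnReceiver:
    """Go-Back-N receiver: no buffering, only expectedseqnum is remembered."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data, corrupt=False):
        """Handle one packet and return the sequence number being ACKed."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)       # deliver_data(data)
            self.expectedseqnum += 1
        # In every case, ACK the highest in-order packet received so far.
        # For out-of-order or corrupt packets this is a duplicate ACK.
        return self.expectedseqnum - 1


rcv = GbnReceiver()
# Packet 2 is lost in transit, so 3 and 4 arrive out of order and are
# discarded; the sender later goes back and resends 2, 3, ...
acks = [rcv.rdt_rcv(1, "a"), rcv.rdt_rcv(3, "c"), rcv.rdt_rcv(4, "d"),
        rcv.rdt_rcv(2, "b"), rcv.rdt_rcv(3, "c")]
print(acks, rcv.delivered)
```

The repeated ACK for packet 1 is the duplicate ACK the slides mention: packets 3 and 4 arrived intact but were thrown away because packet 2 was still missing.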

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data.

Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets.

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  Compare to GBN, which only had timer for base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: Window size < seq # range

Selective repeat: sender, receiver windows

[Figure not reproduced in this transcript]

Selective repeat

sender

  data from above:
    if next available seq # in window, send pkt

  timeout(n):
    resend pkt n, restart timer

  ACK(n) in [sendbase, sendbase+N]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a reACK)

  otherwise:
    ignore
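The three receiver cases can be sketched as follows. This is an illustrative sketch under assumptions: the class and method names are invented, and sequence numbers are plain integers starting at 0 with no wrap-around, which sidesteps the finite sequence-number range a real implementation has to handle.

```python
class SrReceiver:
    """Selective Repeat receiver with window [rcvbase, rcvbase + N - 1].

    Sequence numbers are plain integers here (no wrap-around), an
    assumption made to keep the sketch short.
    """

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}       # seq -> data, packets waiting for delivery
        self.delivered = []

    def receive(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, if any, and
            # advance the window to the next not-yet-received packet.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n           # already delivered: re-ACK
        return None            # otherwise: ignore


rcv = SrReceiver(window_size=4)
acks = [rcv.receive(0, "a"), rcv.receive(2, "c"), rcv.receive(3, "d"),
        rcv.receive(1, "b"), rcv.receive(0, "a"), rcv.receive(9, "x")]
print(acks, rcv.delivered, rcv.rcvbase)
```

Unlike the GBN receiver, packets 2 and 3 are ACKed and buffered when they arrive early, then delivered together once packet 1 fills the gap.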

Selective repeat in action

[Figure not reproduced in this transcript]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is relationship between seq # size and window size?
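One way to explore the question is to compute which sequence numbers become ambiguous in the worst case. The helper below is a hypothetical illustration, not part of the slides; it is consistent with the standard textbook answer that the window size must be at most half the size of the sequence-number space.

```python
def ambiguous_seq_numbers(seq_space, window):
    """Sequence numbers the receiver cannot classify as old or new.

    Worst case: the receiver got a full window (0 .. window-1) but every
    ACK was lost. The sender retransmits 0 .. window-1, while the receiver
    now expects window .. 2*window-1, all taken modulo seq_space.
    """
    old = {n % seq_space for n in range(0, window)}
    new = {n % seq_space for n in range(window, 2 * window)}
    return sorted(old & new)


print(ambiguous_seq_numbers(seq_space=4, window=3))   # the slide's example
print(ambiguous_seq_numbers(seq_space=4, window=2))   # half the seq space
```

With four sequence numbers and a window of 3, a retransmitted packet 0 falls inside the receiver's new window and is accepted as new data; with a window of 2 the old and new ranges no longer overlap.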

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]


More TCP Details

Maximum Segment Size (MSS)
    Depends upon implementation (can often be set)
    The maximum amount of application-layer data in a segment
    Application data + TCP header = TCP segment

Three-way handshake
    Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
    Server responds with a second special TCP segment (again no payload).
    Client responds with a third special segment. This one can contain payload.
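As a rough illustration of how the MSS bounds the application data carried per segment, the sketch below cuts a byte stream into MSS-sized pieces and labels each piece with the byte-stream number of its first byte. The function name `segment_stream` and the 1000-byte MSS are assumptions made for the example, not values from the slides.

```python
def segment_stream(data, mss, initial_seq):
    """Split application bytes into (seq, payload) pairs of at most mss bytes.

    seq is the byte-stream number of the first payload byte.
    Hypothetical helper: headers and 32-bit wraparound are not modeled.
    """
    return [(initial_seq + offset, data[offset:offset + mss])
            for offset in range(0, len(data), mss)]


segments = segment_stream(b"x" * 2500, mss=1000, initial_seq=0)
seq_numbers = [seq for seq, _ in segments]
payload_sizes = [len(payload) for _, payload in segments]
```

Only the last segment is shorter than the MSS; every sequence number advances by the number of data bytes sent, not by one per segment.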


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
    (i) buffers
    (ii) variables, and
    (iii) a socket connection to the process

TCP exists only in the two end machines
    No buffers or variables are allocated to the connection in any of the network elements between the client and the server


TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom]
    source port #, dest port #
    sequence number
    acknowledgement number
    head len, not used, flag bits U A P R S F, Receive window
    checksum, Urg data pointer
    Options (variable length)
    application data (variable length)

Notes on the fields
    sequence number, acknowledgement number: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection establishment (setup, teardown commands)
    Receive window: # bytes receiver is willing to accept
    checksum: Internet checksum (as in UDP)
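To make the field layout concrete, here is a minimal sketch that packs and unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. The helper names `pack_header` and `unpack_header` are made up for this example, options are omitted, and the checksum is left at zero rather than computed.

```python
import struct

# Flag bit masks, in the order shown in the figure (U A P R S F).
URG, ACK, PSH, RST, SYN, FIN = 0x20, 0x10, 0x08, 0x04, 0x02, 0x01

HEADER_FMT = "!HHIIHHHH"  # network byte order, 20 bytes, no options


def pack_header(src_port, dst_port, seq, ack, flags, window,
                checksum=0, urg_ptr=0):
    """Build a 20-byte header; head len is fixed at 5 32-bit words."""
    offset_and_flags = (5 << 12) | (flags & 0x3F)
    return struct.pack(HEADER_FMT, src_port, dst_port, seq, ack,
                       offset_and_flags, window, checksum, urg_ptr)


def unpack_header(raw):
    """Decode the fixed part of a TCP header into a dict."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack(HEADER_FMT, raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len_bytes": (off_flags >> 12) * 4,
        "flags": off_flags & 0x3F,
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
    }


hdr = pack_header(12345, 23, seq=42, ack=79, flags=ACK | PSH, window=4096)
fields = unpack_header(hdr)
```

The port numbers and window size are arbitrary illustration values.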


TCP seq. #'s and ACKs

Seq. #'s:
    byte-stream "number" of the first byte in the segment's data
ACKs:
    seq # of the next byte expected from the other side
    cumulative ACK
Q: how does the receiver handle out-of-order segments?
    A: the TCP spec doesn't say; it is up to the implementer

[Figure: simple telnet scenario, Host A and Host B, time running downward]
    User at Host A types 'C':                      A -> B  Seq=42, ACK=79, data = 'C'
    Host B ACKs receipt of 'C', echoes back 'C':   B -> A  Seq=79, ACK=43, data = 'C'
    Host A ACKs receipt of echoed 'C':             A -> B  Seq=43, ACK=80
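The numbers in the telnet scenario can be reproduced with a small sketch that tracks just two counters per host. The `Endpoint` class and its field names are hypothetical; only in-order delivery is modeled.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Tracks only the two counters needed for the telnet example."""
    next_seq: int   # sequence number of the next byte this side will send
    next_ack: int   # next byte this side expects from the peer

    def send(self, data: bytes = b""):
        """Return (seq, ack, data) for an outgoing segment, then advance next_seq."""
        segment = (self.next_seq, self.next_ack, data)
        self.next_seq += len(data)
        return segment

    def receive(self, segment):
        """Accept an in-order segment and advance the cumulative ACK point."""
        seq, _ack, data = segment
        if seq == self.next_ack:
            self.next_ack += len(data)


host_a = Endpoint(next_seq=42, next_ack=79)
host_b = Endpoint(next_seq=79, next_ack=42)

seg1 = host_a.send(b"C")   # user types 'C'
host_b.receive(seg1)
seg2 = host_b.send(b"C")   # B ACKs 'C' and echoes it back
host_a.receive(seg2)
seg3 = host_a.send()       # A ACKs the echoed 'C', no data
```

Each ACK value is the sender's view of the next byte it expects, which is why a one-byte payload moves the peer's ACK forward by exactly one.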


TCP Round Trip Time and Timeout

Q: how to set the TCP timeout value?
    longer than RTT
        but RTT varies
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss

Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
    SampleRTT will vary; want estimated RTT "smoother"
        average several recent measurements, not just current SampleRTT


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) · EstimatedRTT + α · SampleRTT

Exponential weighted moving average
influence of a past sample decreases exponentially fast
typical value: α = 0.125
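Unrolling the recurrence shows why the influence of old samples decays exponentially: the sample taken k updates ago carries weight α(1 - α)^k. The short sketch below tabulates those weights; the helper name `sample_weights` is an assumption made for illustration.

```python
ALPHA = 0.125


def sample_weights(n, alpha=ALPHA):
    """Weight of each of the n most recent samples in EstimatedRTT, newest first."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


weights = sample_weights(5)
```

With α = 0.125 the newest sample contributes 12.5% of the estimate, and each older sample contributes 7/8 of the weight of the one after it.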


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT]


TCP Round Trip Time and Timeout

Setting the timeout
    EstimatedRTT plus "safety margin"
        large variation in EstimatedRTT -> larger safety margin
    first, estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) · DevRTT + β · |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set the timeout interval:

    TimeoutInterval = EstimatedRTT + 4 · DevRTT
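The three formulas fit together as a small estimator. The sketch below is one possible arrangement, not the slides' own code: the class name `RttEstimator`, the seeding of the state from the first sample, and the choice to update DevRTT before EstimatedRTT are assumptions (they mirror the conventions of RFC 6298, which the slides do not cite).

```python
class RttEstimator:
    """EWMA-based timeout estimator following the formulas above."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed the estimate with the first sample and the
        # deviation with half of it; the slides give no initial values.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        """Fold in one SampleRTT taken from a segment that was not retransmitted."""
        # Assumption: DevRTT uses the EstimatedRTT from before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # all values in milliseconds
est.update(120.0)
```

After the single update, EstimatedRTT moves from 100 to 102.5 ms, DevRTT from 50 to 42.5 ms, and TimeoutInterval is 102.5 + 4 · 42.5 = 272.5 ms.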


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses a single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate acks

Initially consider a simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control


TCP sender events

data rcvd from app:
    create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

ACK rcvd:
    if it acknowledges previously unacked segments
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch(event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
    } /* end of loop forever */

Comment:
    SendBase-1: last cumulatively ACKed byte
Example:
    SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
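The pseudocode maps fairly directly onto an event-driven class. The following is a sketch under stated assumptions: the names `SimplifiedTcpSender`, `unacked`, and `sent` are invented for the example, the timer is reduced to a boolean, and 32-bit sequence-number wraparound is ignored.

```python
class SimplifiedTcpSender:
    """Simplified sender: no duplicate-ACK handling, no flow or congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq -> data, sent but not yet ACKed
        self.timer_running = False
        self.sent = []              # log of (seq, data) handed to IP

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            oldest = min(self.unacked)   # smallest unACKed sequence number
            self.sent.append((oldest, self.unacked[oldest]))
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment that ends at or before y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(92)
sender.on_app_data(b"a" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"b" * 20)    # Seq=100, 20 bytes
sender.on_ack(120)               # ACK=100 was lost; ACK=120 covers both segments
```

Because ACKs are cumulative, the single ACK=120 acknowledges both segments and nothing is retransmitted.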


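TCP retransmission scenarios

[Figure: lost ACK scenario, Host A and Host B, time running downward]
    A -> B  Seq=92, 8 bytes data
    B -> A  ACK=100 (lost, X)
    Seq=92 timeout at A
    A -> B  Seq=92, 8 bytes data (retransmission)
    B -> A  ACK=100
    SendBase = 100

[Figure: premature timeout scenario, Host A and Host B, time running downward]
    A -> B  Seq=92, 8 bytes data
    A -> B  Seq=100, 20 bytes data
    B -> A  ACK=100
    B -> A  ACK=120
    Seq=92 timeout at A before the ACKs arrive
    A -> B  Seq=92, 8 bytes data (retransmission)
    SendBase = 100, then SendBase = 120 as the ACKs arrive
    B -> A  ACK=120
    SendBase = 120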


TCP retransmission scenarios (more)

[Figure: cumulative ACK scenario, Host A and Host B, time running downward]
    A -> B  Seq=92, 8 bytes data
    A -> B  Seq=100, 20 bytes data
    B -> A  ACK=100 (lost, X)
    B -> A  ACK=120 (arrives before the timeout)
    SendBase = 120


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
   -> Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK
2. Arrival of in-order segment with expected seq #; one other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments
3. Arrival of out-of-order segment with higher-than-expected seq #; gap detected
   -> Immediately send duplicate ACK, indicating seq # of next expected byte
4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap
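The four rows of the table can be read as a decision procedure on the receiver. The sketch below is an illustrative model only: `AckPolicy` and its action labels are invented names, the 500 ms delayed-ACK timer is not modeled, and the branch for already-received data is an assumption that the table does not cover.

```python
class AckPolicy:
    """Receiver-side ACK decisions for the four events in the table."""

    def __init__(self, expected_seq):
        self.expected_seq = expected_seq
        self.ack_pending = False   # an in-order segment is waiting on a delayed ACK
        self.out_of_order = {}     # seq -> length of segments buffered above a gap

    def on_segment(self, seq, length):
        """Return (action, ack_number) for an arriving segment."""
        if seq > self.expected_seq:
            # Row 3: gap detected, buffer the segment and send a duplicate ACK.
            self.out_of_order[seq] = length
            return ("duplicate_ack_now", self.expected_seq)
        if seq < self.expected_seq:
            # Assumption (not in the table): old data is simply re-ACKed.
            return ("duplicate_ack_now", self.expected_seq)
        had_gap = bool(self.out_of_order)
        self.expected_seq += length
        # Absorb buffered segments that have become contiguous.
        while self.expected_seq in self.out_of_order:
            self.expected_seq += self.out_of_order.pop(self.expected_seq)
        if had_gap:
            # Row 4: segment started at the lower end of the gap.
            self.ack_pending = False
            return ("ack_now", self.expected_seq)
        if self.ack_pending:
            # Row 2: one cumulative ACK covers both in-order segments.
            self.ack_pending = False
            return ("cumulative_ack_now", self.expected_seq)
        # Row 1: start waiting for a possible second segment.
        self.ack_pending = True
        return ("delayed_ack", self.expected_seq)


policy = AckPolicy(expected_seq=92)
actions = [
    policy.on_segment(92, 8),     # in order, nothing pending
    policy.on_segment(100, 20),   # in order, ACK pending
    policy.on_segment(140, 20),   # out of order, gap at 120
    policy.on_segment(120, 20),   # fills the gap
]
```

The segment sizes are arbitrary; the point is which of the four receiver actions each arrival triggers.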


More on Sender Policies

Doubling the Timeout Interval
    Used by most TCP implementations
    If a timeout occurs, then after retransmission the Timeout Interval is doubled
    Intervals grow exponentially with each consecutive timeout
    When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    Limited form of congestion control
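A minimal sketch of the doubling rule follows. `BackoffTimer` and its method names are hypothetical, and no upper bound is applied to the interval, although real stacks typically cap it.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, base_interval):
        # base_interval stands for EstimatedRTT + 4 * DevRTT.
        self.base_interval = base_interval
        self.current_interval = base_interval

    def on_timeout(self):
        """Called after the retransmission: double the interval."""
        self.current_interval *= 2
        return self.current_interval

    def on_new_data_or_ack(self, base_interval=None):
        """Timer restarted by new data or an ACK: reset to the RTT-based value."""
        if base_interval is not None:
            self.base_interval = base_interval
        self.current_interval = self.base_interval
        return self.current_interval


timer = BackoffTimer(base_interval=1.0)   # seconds, arbitrary example value
after_timeouts = [timer.on_timeout() for _ in range(3)]
after_ack = timer.on_new_data_or_ack()
```

Three consecutive timeouts stretch a 1-second interval to 2, 4, and then 8 seconds, and the next ACK brings it back to 1 second.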


Fast Retransmit

Time-out period often relatively long:
    long delay before resending lost packet
Detect lost segments via duplicate ACKs
    Sender often sends many segments back-to-back
    If a segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    fast retransmit: resend segment before timer expires


Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {    /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y == 3) {
                resend segment with sequence number y    /* fast retransmit */
            }
        }
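The ACK handler above can be sketched in Python as follows. The class name `FastRetransmitSender`, the `retransmitted` log, and the segment sizes are assumptions for illustration; timer handling is left out to keep the duplicate-ACK logic visible.

```python
class FastRetransmitSender:
    """ACK handling with a duplicate-ACK counter (timer logic omitted)."""

    def __init__(self, send_base, segments):
        self.send_base = send_base
        self.unacked = dict(segments)   # seq -> data still awaiting an ACK
        self.dup_acks = 0
        self.retransmitted = []         # sequence numbers resent early

    def on_ack(self, y):
        if y > self.send_base:
            # New data ACKed: advance SendBase and reset the duplicate count.
            self.send_base = y
            self.dup_acks = 0
            self.unacked = {s: d for s, d in self.unacked.items()
                            if s + len(d) > y}
        else:
            # Duplicate ACK for data that was already ACKed.
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.retransmitted.append(y)   # fast retransmit of segment y


sender = FastRetransmitSender(92, {
    92: b"a" * 8, 100: b"b" * 20, 120: b"c" * 15, 135: b"d" * 6, 141: b"e" * 16,
})
sender.on_ack(100)          # first segment ACKed; segment 100 is lost in transit
for _ in range(4):
    sender.on_ack(100)      # later segments each trigger a duplicate ACK
sender.on_ack(157)          # retransmission fills the gap, everything ACKed
```

The retransmission fires exactly once, on the third duplicate; the fourth duplicate ACK does not resend the segment again.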


TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN
Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
    This looks a lot like Selective Repeat
TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
    app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

[Figure: TCP segment layout, repeated from the earlier "TCP segment structure" slide. Relevant here: the Receive window field carries the # bytes the receiver is willing to accept.]


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
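A worked example of the window arithmetic, as a sketch: `rcv_window` implements the formula above, while `sender_can_send` and its parameters `last_byte_sent` and `last_byte_acked` are names assumed here to express "sender limits unACKed data to RcvWindow".

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_can_send(last_byte_sent, last_byte_acked, advertised_window):
    """Bytes the sender may still put in flight without overflowing the receiver."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, advertised_window - in_flight)


# 4096-byte buffer; 3000 bytes received so far, the app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
allowed = sender_can_send(last_byte_sent=5000, last_byte_acked=4000,
                          advertised_window=window)
```

Here 2000 bytes sit unread in the buffer, so the receiver advertises 2096 bytes; with 1000 bytes already in flight, the sender may send 1096 more.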


Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will drain, and RcvWindow > 0 will be transmitted back to the sender.


Note on UDP

UDP has no flow control
UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
    initialize TCP variables:
        seq. #s
        buffers, flow control info (e.g. RcvWindow)
    client: connection initiator
        Socket clientSocket = new Socket("hostname", "port number");
    server: contacted by client
        Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    replies with client_isn+1 in ACK field to signal synchronization
    specifies server_isn
    no application data


TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
    allocates buffers
    can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

[Figure: three-way handshake, client and server, time running downward]
    client -> server  Connection request (SYN=1, seq=client_isn)
    server -> client  Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server  ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
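The sequence and ACK values exchanged in the three steps follow mechanically from the two initial sequence numbers. The sketch below builds the three segments as plain dictionaries; the function name `three_way_handshake` and the dictionary layout are assumptions for illustration, and no real sockets are involved.

```python
MODULUS = 2 ** 32   # sequence numbers are 32-bit values and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK, and ACK segments as dicts of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn,
              "ack": (syn["seq"] + 1) % MODULUS}
    ack = {"SYN": 0, "seq": synack["ack"],
           "ack": (synack["seq"] + 1) % MODULUS}
    return [syn, synack, ack]


syn, synack, ack = three_way_handshake(client_isn=1000, server_isn=5000)
```

The SYN carries no data yet still advances the expected sequence number by one, which is why each side acknowledges isn+1.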


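TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close, client and server, time running downward]
    client -> server  FIN   (client: close)
    server -> client  ACK
    server -> client  FIN   (server: close)
    client -> server  ACK   (client: timed wait, then closed)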


                                                                TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.

Enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost).

Closes down after timed wait.

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

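[Figure: client/server timeline for closing a connection. Client sends FIN, server replies with ACK; server sends FIN, client replies with ACK. Labels: closing (client), closing (server), timed wait, closed (client), closed (server).]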
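The four closing steps can be sketched from the client's side as a small state machine. This is only an illustration: the state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) follow RFC 793 rather than these slides, and the event strings and the `run` helper are hypothetical.

```python
# Client-side teardown as a transition table: (state, event) -> (next state, action).
# State names follow RFC 793; event names are made up for this sketch.
CLIENT_CLOSE = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "send FIN"),    # Step 1
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),            # server ACKed our FIN
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "send ACK"),       # Step 3
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "send ACK"),        # our ACK was lost: ACK again
    ("TIME_WAIT", "timer_expired"): ("CLOSED", None),            # close after timed wait
}

def run(events, state="ESTABLISHED"):
    """Feed events through the table; return the final state and segments sent."""
    sent = []
    for event in events:
        state, action = CLIENT_CLOSE[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent

# A retransmitted FIN arrives during timed wait and is ACKed a second time.
print(run(["app_close", "recv ACK", "recv FIN", "recv FIN", "timer_expired"]))
```

The duplicate `recv FIN` entry is the reason the timed-wait state exists: the client must stay around long enough to re-ACK a FIN whose first ACK was lost.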


                                                                TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.



                                                                Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!


Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
($\lambda'_{in}$ denotes the offered load: original data plus retransmissions)

(a) "Magic" transmission: only send when there's space in buffer

(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$

(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels, (a), (b) and (c)]


Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
  single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
  NI bit: no increase in rate (mild congestion)
  CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
  small exception: see next slide

ABR: available bit rate:
"elastic service"
if sender's path "underloaded":
  sender should use available bandwidth
if sender's path congested:
  sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
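The ER and CI rules can be illustrated with a toy model of one RM cell travelling along a path. This is a sketch under simplifying assumptions, not ATM cell processing: the `RMCell` class, the function names, and the rate values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RMCell:
    er: float          # explicit rate carried in the cell
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit

def pass_through_switches(cell, supportable_rates):
    """Each switch on the path may lower ER to the rate it can support."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell

def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell

cell = pass_through_switches(RMCell(er=100.0), [80.0, 120.0, 60.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)  # ER ends up as the minimum rate along the path
```

Because every switch applies `min`, the value that comes back to the sender is the minimum supportable rate on the path, which is the rate the slide says the sender should use.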



TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: sequence of segments with the window labeled CongWin]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec
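As a quick numeric check of the throughput relation, here is a minimal sketch. The function name and the sample values (10 segments, 1460-byte MSS, 100 ms RTT) are assumptions chosen for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    return w_segments * mss_bytes / rtt_seconds

# Hypothetical numbers: 10 segments of 1460 bytes per 100 ms round trip.
rate = window_throughput(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec = {rate * 8 / 1e6:.2f} Mbps")
```

Doubling w or halving RTT doubles the rate, which is why the sender controls its rate by adjusting CongWin.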


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase, Multiplicative Decrease
slow start = CongWin set to 1 and then grows exponentially
conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marked 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
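The sawtooth produced by AIMD can be reproduced with a few lines of simulation. This is a sketch, not a TCP implementation: the window is counted in whole MSS units, halving rounds down, and the rounds in which loss occurs are supplied by hand (`loss_rounds` is a hypothetical input).

```python
def aimd(rounds, loss_rounds, start_mss=1):
    """Return CongWin (in MSS) at the start of each RTT round."""
    cong_win = start_mss
    history = []
    for r in range(rounds):
        history.append(cong_win)
        if r in loss_rounds:
            cong_win = max(1, cong_win // 2)   # multiplicative decrease
        else:
            cong_win += 1                      # additive increase: +1 MSS per RTT
    return history

print(aimd(10, {4, 8}, start_mss=8))  # [8, 9, 10, 11, 12, 6, 7, 8, 9, 4]
```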


                                                                TCP Slow Start

When connection begins, CongWin = 1 MSS

Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps

available bandwidth may be >> MSS/RTT

desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


                                                                TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:

double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. In successive RTTs, Host A sends one segment, then two segments, then four segments.]
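Why incrementing CongWin on every ACK doubles it every RTT can be seen in a short simulation. The sketch below assumes one ACK per segment (no delayed ACKs) and no loss; the function name is hypothetical.

```python
def slow_start_rounds(num_rtts):
    """CongWin (in MSS) at the start of each RTT, with +1 MSS per ACK."""
    cong_win = 1
    per_rtt = []
    for _ in range(num_rtts):
        per_rtt.append(cong_win)
        acks = cong_win            # one ACK comes back per segment sent this RTT
        for _ in range(acks):
            cong_win += 1          # increment CongWin by 1 MSS for every ACK
    return per_rtt

print(slow_start_rounds(5))  # [1, 2, 4, 8, 16]
```

A window of w segments produces w ACKs, so the window grows by w in one RTT, which is exactly a doubling.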


So far:
Slow-Start ramps up exponentially
Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin, the halved value (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS. (TCP Tahoe does this for 3 dup ACKs as well.)
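The difference between Reno and Tahoe on a loss event can be written as one small function. This follows the simplified rules summarized above, not a full implementation of either variant; the function name, the string labels, and the choice of MSS units are assumptions for illustration.

```python
def on_loss(cong_win, event, variant="reno"):
    """Return (new_cong_win, new_threshold, next_state); windows in MSS units.

    event is "triple_dup_ack" or "timeout"; variant is "reno" or "tahoe".
    """
    threshold = max(cong_win / 2, 1)          # never below 1 MSS
    if event == "triple_dup_ack" and variant == "reno":
        # Fast recovery: halve the window and continue in congestion avoidance.
        return threshold, threshold, "congestion_avoidance"
    # Timeout in either variant, or any loss event in Tahoe: back to slow start.
    return 1, threshold, "slow_start"

print(on_loss(16, "triple_dup_ack", "reno"))   # (8.0, 8.0, 'congestion_avoidance')
print(on_loss(16, "triple_dup_ack", "tahoe"))  # (1, 8.0, 'slow_start')
print(on_loss(16, "timeout", "reno"))          # (1, 8.0, 'slow_start')
```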


                                                                The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
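The two "ACK receipt" rows of the table can be sketched as a per-ACK update. This is an illustration only: the MSS value, the state labels "SS" and "CA", and the `one_rtt` helper (which assumes one ACK per segment in flight) are hypothetical.

```python
MSS = 1000  # bytes; hypothetical value for illustration

def on_new_ack(cong_win, threshold, state):
    """Apply the table's action for an ACK of previously unacked data.

    cong_win and threshold are in bytes; returns (cong_win, state).
    """
    if state == "SS":
        cong_win += MSS                       # adds up to a doubling per RTT
        if cong_win > threshold:
            state = "CA"
    else:  # "CA"
        cong_win += MSS * (MSS / cong_win)    # adds up to about 1 MSS per RTT
    return cong_win, state

def one_rtt(cong_win, threshold, state):
    """Deliver one ACK per segment that was in flight at the start of the RTT."""
    for _ in range(int(cong_win // MSS)):
        cong_win, state = on_new_ack(cong_win, threshold, state)
    return cong_win, state

print(one_rtt(4000, 64000, "SS"))   # slow start: window doubles
print(one_rtt(10000, 8000, "CA"))   # congestion avoidance: grows by roughly one MSS
```

In congestion avoidance the per-RTT growth comes out slightly under one MSS, because each increment is computed against a window that has already grown a little.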


                                                                TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
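The 0.75 factor follows from the sawtooth shape. Assuming the window grows linearly from W/2 to W between loss events (an idealization that ignores slow start and timeouts), the time average is the midpoint of the two extremes:

```latex
\overline{\text{throughput}}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
```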


                                                                TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

Solving for L gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
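Both numbers in this example can be reproduced directly. The sketch below assumes the loss-rate relation throughput = 1.22 * MSS / (RTT * sqrt(L)) with MSS converted to bits; the function names are hypothetical.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)

def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2

w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

A loss rate of about 2 in 10 billion segments is far below what real paths deliver, which is the motivation for high-speed TCP variants.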


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, up to R) versus Connection 1 throughput (horizontal axis, up to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
  do not want rate throttled by congestion control
Instead use UDP:
  pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
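The parallel-connection example is simple arithmetic once each TCP connection is assumed to get an equal share of the bottleneck (the idealized fairness goal, not a guarantee). The function name below is hypothetical.

```python
from fractions import Fraction

def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app that opens new_connections,
    assuming every TCP connection receives an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)

print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```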


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                Fixed congestion window (1)

First case: WS/R > RTT + S/R
ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                Fixed congestion window (2)

Second case: WS/R < RTT + S/R
wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
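The two fixed-window cases can be combined into one function. This is a sketch under stated assumptions: K is taken to be the number of windows needed to cover the object, ceil(O / (W*S)), which the slides do not spell out here, and the sample values are chosen so that S/R = 1 time unit.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an O-bit object with S-bit segments, link rate R (bits/s)
    and a fixed congestion window of W segments."""
    base = 2 * RTT + O / R
    if W * S / R >= RTT + S / R:
        return base                      # case 1: ACKs return before the window is used up
    K = math.ceil(O / (W * S))           # assumption: number of windows covering the object
    stall = S / R + RTT - W * S / R      # idle time after each of the first K-1 windows
    return base + (K - 1) * stall

# Hypothetical values: S/R = 1, RTT = 2, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))  # case 1: 12.0
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))  # case 2: 15.0
```

With W = 2 the server stalls for one time unit after each of the first three windows, adding 3 to the base latency of 12.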


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: client/server timeline. Client initiates TCP connection and requests object (one RTT each); server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and object is delivered. Axes: time at client, time at server.]
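The delay components and the example above can be checked with a short function. This is a sketch, not a derivation: K and Q are found by direct counting from their definitions rather than by closed form, O and S are assumed to be integers, and the sample values (S/R = 1 time unit, RTT = 2) are hypothetical choices that reproduce Q = 2.

```python
def slow_start_latency(O, S, R, RTT):
    """Latency under slow start with no threshold and no loss.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s).
    Returns (latency, K, Q, P).
    """
    segments = -(-O // S)                  # ceil(O / S) using integers
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K = 1
    while 2 ** K - 1 < segments:
        K += 1
    # Q: number of windows followed by an idle period for an infinite object,
    # i.e. windows k with S/R + RTT - 2**(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P

print(slow_start_latency(O=15, S=1, R=1, RTT=2))  # (22.0, 4, 2, 2)
```

Of the 22 time units, 4 are the two RTTs for setup and request, 15 are transmission time O/R, and the remaining 3 are the two idle periods (2 after the first window, 1 after the second).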


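TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timeline. Client initiates TCP connection and requests object (one RTT each); server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and object is delivered. Axes: time at client, time at server.]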


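TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

The calculation of Q, the number of idles for an infinite-size object, is similar.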
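The calculation of K, and the slow-start delay it feeds into, can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, O and S are taken in bits, R in bits per second and RTT in seconds. Instead of computing Q explicitly, it sums the positive part of the idle time over the first K-1 windows, on the assumption that the server never idles after the last window.

```python
import math


def windows_to_cover(obj_bits, seg_bits):
    """Smallest K with (2^0 + 2^1 + ... + 2^(K-1)) * S >= O, found by direct summation."""
    k, covered = 0, 0
    while covered < obj_bits:
        covered += (2 ** k) * seg_bits  # window k+1 carries 2^k segments
        k += 1
    return k


def windows_closed_form(obj_bits, seg_bits):
    """Closed form for K: ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(obj_bits / seg_bits + 1))


def slow_start_latency(obj_bits, seg_bits, rate_bps, rtt):
    """2*RTT + O/R + sum of idle times, with idle time [S/R + RTT - 2^(k-1) S/R]^+."""
    K = windows_to_cover(obj_bits, seg_bits)
    s_over_r = seg_bits / rate_bps
    idle = sum(
        max(0.0, s_over_r + rtt - 2 ** (k - 1) * s_over_r)
        for k in range(1, K)  # windows 1 .. K-1; no stall after the last one
    )
    return 2 * rtt + obj_bits / rate_bps + idle


# Illustrative numbers (not from the slides): O = 15 segments, S/R = 1 s, RTT = 2 s.
print(windows_to_cover(15000, 1000))
print(slow_start_latency(15000, 1000, 1000, 2.0))
```

With these numbers the object needs four windows (1 + 2 + 4 + 8 segments), and the server stalls only after the first two of them.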


HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
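To get a feel for the three response-time expressions, they can be evaluated side by side. This is a rough sketch under stated assumptions: the function name is hypothetical, the "sum of idle times" term is deliberately left out (so the results are lower bounds), and 1 Kbyte is taken as 8000 bits.

```python
def http_response_times(obj_bits, rate_bps, rtt, m_images, x_parallel):
    """Response times in seconds, ignoring the slow-start idle-time term."""
    transmit = (m_images + 1) * obj_bits / rate_bps  # (M+1) O/R
    return {
        "non-persistent": transmit + (m_images + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel non-persistent": transmit + (m_images / x_parallel + 1) * 2 * rtt,
    }


# Example: O = 5 Kbytes, R = 1 Mbps, RTT = 100 msec, M = 10, X = 5.
for name, seconds in http_response_times(40_000, 1e6, 0.1, 10, 5).items():
    print(f"{name}: {seconds:.2f} s")
```

All three share the same transmission term, so the differences come entirely from how many round trips each scheme spends on connection setup and requests.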


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.

Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

Principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

Instantiation and implementation in the Internet:
• UDP
• TCP

Next: leaving the network "edge" (application, transport layers) and moving into the network "core".

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary


rdt2.1: discussion

Sender:
• seq # added to pkt
• two seq. #'s (0, 1) will suffice. Why?
• must check if received ACK/NAK is corrupted
• twice as many states: state must "remember" whether "current" pkt has seq # 0 or 1

Receiver:
• must check if received packet is a duplicate: state indicates whether 0 or 1 is the expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at sender


rdt2.2: a NAK-free protocol

• same functionality as rdt2.1, using ACKs only
• instead of NAK, receiver sends ACK for last pkt received OK
• receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s were included in data packets but not in ACKs/NAKs)
• duplicate ACK at sender results in same action as NAK: retransmit current pkt


rdt2.2: sender, receiver fragments

Sender FSM fragment

State "Wait for call 0 from above"
    event:  rdt_send(data)
    action: sndpkt = make_pkt(0, data, checksum)
            udt_send(sndpkt)
    next:   "Wait for ACK 0"

State "Wait for ACK 0"
    event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    action: udt_send(sndpkt)
    next:   "Wait for ACK 0"

    event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    action: Λ (no action)
    next:   (leaves this fragment)

Receiver FSM fragment

State "Wait for 0 from below"
    entered on: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    action:     extract(rcvpkt, data)
                deliver_data(data)
                sndpkt = make_pkt(ACK1, chksum)
                udt_send(sndpkt)

    event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
    action: udt_send(sndpkt)
    next:   "Wait for 0 from below"
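The NAK-free idea in these fragments can be sketched as a tiny simulation. Everything here is illustrative rather than the slides' own code: the class and function names are hypothetical, corruption is scripted by transmission index instead of detected by a checksum, and the receiver's initial "last ACK" of 1 is an assumption made so that a garbled first packet still triggers a retransmission.

```python
class Rdt22Receiver:
    """Answers every packet with the seq # of the last packet received OK."""

    def __init__(self):
        self.expected = 0    # seq # the receiver is waiting for
        self.last_ack = 1    # assumption: treated as "ACK 1" before any data arrives
        self.delivered = []

    def receive(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expected:
            self.delivered.append(data)   # extract + deliver_data
            self.last_ack = seq
            self.expected ^= 1            # flip 0 <-> 1
        return self.last_ack              # the ACK carries an explicit seq #


class Rdt22Sender:
    def __init__(self):
        self.seq = 0

    def on_ack(self, ack, corrupt=False):
        """True if the current packet is acknowledged, False if it must be resent."""
        if corrupt or ack != self.seq:
            return False                  # garbled or duplicate ACK acts like a NAK
        self.seq ^= 1
        return True


def transfer(messages, corrupt_data=(), corrupt_ack=()):
    """Send messages in order; the two sets name the transmissions to garble."""
    sender, receiver = Rdt22Sender(), Rdt22Receiver()
    attempt = 0
    for msg in messages:
        while True:
            ack = receiver.receive(sender.seq, msg, corrupt=attempt in corrupt_data)
            done = sender.on_ack(ack, corrupt=attempt in corrupt_ack)
            attempt += 1
            if done:
                break
    return receiver.delivered, attempt


print(transfer(["a", "b", "c"], corrupt_data={1}, corrupt_ack={3}))
```

In the example run, the garbled data packet and the garbled ACK each cost one retransmission, and the duplicate copy of "c" is acknowledged but not delivered twice.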


rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
• checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until it is certain that data or ACK was lost, then retransmits
• yuck: drawbacks?

Approach: sender waits a "reasonable" amount of time for ACK
• retransmits if no ACK received in this time (retransmissions are triggered only by timeouts)
• if pkt (or ACK) is just delayed (not lost): retransmission will be a duplicate, but use of seq. #'s already handles this
• receiver must specify seq # of pkt being ACKed
• requires countdown timer


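rdt3.0 sender

State "Wait for call 0 from above"
    event:  rdt_send(data)
    action: sndpkt = make_pkt(0, data, checksum)
            udt_send(sndpkt)
            start_timer
    next:   "Wait for ACK0"

    event:  rdt_rcv(rcvpkt)
    action: Λ (no action)
    next:   "Wait for call 0 from above"

State "Wait for ACK0"
    event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    action: Λ (no action)
    next:   "Wait for ACK0"

    event:  timeout
    action: udt_send(sndpkt)
            start_timer
    next:   "Wait for ACK0"

    event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    action: stop_timer
    next:   "Wait for call 1 from above"

State "Wait for call 1 from above"
    event:  rdt_send(data)
    action: sndpkt = make_pkt(1, data, checksum)
            udt_send(sndpkt)
            start_timer
    next:   "Wait for ACK1"

    event:  rdt_rcv(rcvpkt)
    action: Λ (no action)
    next:   "Wait for call 1 from above"

State "Wait for ACK1"
    event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    action: Λ (no action)
    next:   "Wait for ACK1"

    event:  timeout
    action: udt_send(sndpkt)
            start_timer
    next:   "Wait for ACK1"

    event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    action: stop_timer
    next:   "Wait for call 0 from above"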
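The four-state sender can be written compactly by folding the two symmetric halves into one sequence bit plus a "waiting for ACK" flag. The sketch below is one possible rendering, not the slides' implementation: the class name and attributes are hypothetical, the timer is modelled as a boolean, and timeouts and ACK arrivals are driven by explicit method calls rather than a real clock or network.

```python
class Rdt30Sender:
    """Alternating-bit sender: retransmits only when timeout() is called."""

    def __init__(self):
        self.seq = 0                  # seq # of the current (or next) packet
        self.waiting_for_ack = False  # False: "Wait for call", True: "Wait for ACK"
        self.sndpkt = None
        self.timer_running = False
        self.sent = []                # log of every packet handed to the channel

    def _udt_send(self):
        self.sent.append(self.sndpkt)
        self.timer_running = True     # start_timer

    def rdt_send(self, data):
        if self.waiting_for_ack:
            raise RuntimeError("previous packet not yet acknowledged")
        self.sndpkt = (self.seq, data)
        self._udt_send()
        self.waiting_for_ack = True

    def timeout(self):
        if self.waiting_for_ack:
            self._udt_send()          # resend and restart the timer

    def rdt_rcv(self, ack, corrupt=False):
        if not self.waiting_for_ack or corrupt or ack != self.seq:
            return                    # Λ: ignore and let the timer run
        self.timer_running = False    # stop_timer
        self.waiting_for_ack = False
        self.seq ^= 1                 # move on to the other sequence number


s = Rdt30Sender()
s.rdt_send("a")
s.timeout()        # packet (or its ACK) was lost: retransmit
s.rdt_rcv(1)       # ACK for the wrong seq #: ignored
s.rdt_rcv(0)       # correct ACK: advance to seq 1
print(s.sent, s.seq)
```

Note that a corrupted or wrong-numbered ACK does not trigger an immediate resend here; as in the FSM, only the timer does.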

rdt3.0 in action

[Figures: sender/receiver timing diagrams for rdt3.0 (two slides); not recoverable from the transcript.]

Performance of rdt3.0

rdt3.0 works, but performance stinks.

Example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet:

\[ T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec} \]

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \]

• U_sender: utilization, the fraction of time the sender is busy sending
• 1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
• the network protocol limits use of the physical resources
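The numbers on this slide are easy to reproduce. The short sketch below assumes 1 KB = 8000 bits and RTT = 2 x 15 ms = 30 ms; the function name is made up for illustration.

```python
def stop_and_wait(packet_bits, rate_bps, rtt):
    """Return (transmit time in s, sender utilization, throughput in bytes/s)."""
    t_transmit = packet_bits / rate_bps              # L / R
    cycle = rtt + t_transmit                         # one packet per RTT + L/R
    utilization = t_transmit / cycle
    throughput = (packet_bits / 8) / cycle
    return t_transmit, utilization, throughput


t, u, thr = stop_and_wait(packet_bits=8000, rate_bps=1e9, rtt=0.030)
print(f"T_transmit = {t * 1e6:.0f} microsec")
print(f"U_sender   = {u:.5f}")
print(f"throughput = {thr / 1000:.0f} kB/sec")
```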

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram. Sender: first packet bit transmitted, t = 0; last packet bit transmitted, t = L/R. Receiver: first packet bit arrives; last packet bit arrives, send ACK. Sender: ACK arrives, send next packet, t = RTT + L/R.]

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \]


Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver


Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight. Sender: first packet bit transmitted, t = 0; last bit transmitted, t = L/R. Receiver: first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK. Sender: ACK arrives, send next packet, t = RTT + L/R.]

\[ U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008 \]

Pipelining three packets increases utilization by a factor of 3.
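The factor-of-3 gain generalizes to a window of any size. As a hedged sketch, the hypothetical helper below scales the numerator by the window size and caps the result at 1, on the assumption that a sender cannot be busy more than 100% of the time.

```python
def pipelined_utilization(window, packet_bits, rate_bps, rtt):
    """Sender utilization with `window` packets in flight per RTT + L/R cycle."""
    t_transmit = packet_bits / rate_bps
    return min(1.0, window * t_transmit / (rtt + t_transmit))


# Same link as the stop-and-wait example: 1 Gbps, RTT = 30 ms, 8000-bit packets.
for n in (1, 3, 1000):
    print(n, round(pipelined_utilization(n, 8000, 1e9, 0.030), 5))
```

A window of 1 reproduces the stop-and-wait figure, and utilization grows linearly with the window until the pipe is full.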


Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK"); may receive duplicate ACKs (see receiver)
• only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• called a sliding-window protocol


GBN Sender

• rdt_send() called: checks to see if the window is full. No: send out the packet. Yes: return the data to the application level.
• Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates the window accordingly and restarts the timer.
• Timeout: resends ALL packets that have been sent but not yet acknowledged. This is the only event that triggers a resend.


GBN: sender extended FSM

Initial transition into the single state Wait:
    base = 1
    nextseqnum = 1

rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ
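The sender FSM above can be sketched as a small Python class. This is an illustration under stated assumptions, not the course's reference code: the class name GBNSender, the list used as a stand-in for udt_send, and the boolean timer flag are all hypothetical, checksums are omitted, and the max() in on_ack is a defensive addition so that a stale ACK cannot move base backwards.

```python
class GBNSender:
    def __init__(self, window_size, channel):
        self.N = window_size
        self.base = 1                # oldest unacknowledged seq #
        self.nextseqnum = 1          # next seq # to assign
        self.sndpkt = {}             # seq # -> packet, kept until acknowledged
        self.timer_running = False   # stand-in for start_timer / stop_timer
        self.channel = channel       # stand-in for udt_send: packets are appended here

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False             # window full: refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.channel.append(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is acknowledged.
        new_base = max(self.base, acknum + 1)
        for seq in range(self.base, new_base):
            self.sndpkt.pop(seq, None)
        self.base = new_base
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        # Go back N: resend every packet that was sent but not yet acknowledged.
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.channel.append(self.sndpkt[seq])


channel = []
sender = GBNSender(window_size=4, channel=channel)
accepted = [sender.rdt_send(f"d{i}") for i in range(6)]   # last two are refused
sender.on_ack(2)        # packets 1 and 2 acknowledged
sender.on_timeout()     # packets 3 and 4 are resent
print(accepted, sender.base, channel[-2:])
```

Note how a single timeout resends every outstanding packet, which is the cost that Selective Repeat tries to avoid.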


GBN: receiver extended FSM

Initial transition (event Λ) into the single state Wait:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

default
    udt_send(sndpkt)

If the expected packet is received:
    send ACK and deliver the packet to the upper layer

If an out-of-order packet is received:
    discard (don't buffer) -> no receiver buffering
    re-ACK the pkt with the highest in-order seq #; may generate duplicate ACKs


More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
Receiver only sends ACKs (no NAKs)
Can generate duplicate ACKs
Need only remember expectedseqnum
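The receiver behaviour described above fits in a few lines. The sketch below is illustrative only: the class name GBNReceiver and the corrupt flag are hypothetical, and returning the ACK number stands in for building and sending an ACK packet.

```python
class GBNReceiver:
    def __init__(self):
        self.expectedseqnum = 1   # the only state the receiver must remember
        self.delivered = []       # stand-in for deliver_data()

    def rdt_rcv(self, seqnum, data, corrupt=False):
        """Return the ACK number sent in response to an arriving packet."""
        if not corrupt and seqnum == self.expectedseqnum:
            self.delivered.append(data)
            ack = self.expectedseqnum
            self.expectedseqnum += 1
            return ack
        # Out-of-order or corrupt: discard and re-ACK the last in-order packet.
        # An ACK of 0 means nothing has been received in order yet.
        return self.expectedseqnum - 1


receiver = GBNReceiver()
arrivals = [(1, "a"), (3, "c"), (4, "d"), (2, "b"), (3, "c")]
acks = [receiver.rdt_rcv(seq, data) for seq, data in arrivals]
print(acks, receiver.delivered)
```

Packets 3 and 4 arrive before packet 2, so they are discarded and each triggers a duplicate ACK for packet 1; they are delivered only when they arrive again in order.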


GBN in action

                                                                  GBN is easy to code but might have performance problems

                                                                  In particular if many packets are in pipeline at one time (bandwidth-delay product large) then one error can force retransmission of huge amounts of data

                                                                  Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets


                                                                  Selective Repeat

                                                                  receiver individually acknowledges all correctly received pkts

                                                                  buffers pkts as needed for eventual in-order delivery to upper layer

                                                                  sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
    Compare to GBN, which only had a timer for the base packet

sender window
    N consecutive seq #'s
    again limits seq #'s of sent, unACKed pkts
    Important: window size < seq # range


Selective repeat: sender, receiver windows


Selective repeat

sender

data from above:
    if next available seq # in window, send pkt

timeout(n):
    resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)

otherwise:
    ignore
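The three receiver cases above can be sketched as follows. This is a simplified illustration: the class name SRReceiver is hypothetical, and sequence numbers are treated as unbounded integers, so wraparound (the subject of the dilemma slide) is deliberately not modelled.

```python
class SRReceiver:
    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0      # smallest seq # not yet received
        self.buffer = {}      # out-of-order packets waiting for the gap to fill
        self.delivered = []   # stand-in for delivery to the upper layer

    def on_packet(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver every packet that is now in order and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n          # already delivered: re-ACK so the sender can advance
        return None           # otherwise: ignore


receiver = SRReceiver(window_size=4)
acks = [receiver.on_packet(n, f"p{n}") for n in [0, 2, 3, 1, 0, 9]]
print(acks, receiver.delivered, receiver.rcvbase)
```

Packets 2 and 3 are buffered until packet 1 arrives, the repeated packet 0 is re-ACKed without being delivered twice, and packet 9 lies outside both ranges and is ignored.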


                                                                  Selective repeat in action


Selective repeat: dilemma

Example: seq #'s 0, 1, 2, 3; window size = 3

receiver sees no difference in the two scenarios
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
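One way to explore the question is to check, for a given sequence-number space and window size, whether the numbers the sender might retransmit overlap the numbers the receiver would accept as new. The sketch below models only the worst case from the example (every packet in the first window is received but all ACKs are lost); the function name is_ambiguous is hypothetical.

```python
def is_ambiguous(seq_space, window):
    """True if a retransmitted old packet can be mistaken for a new one."""
    # Sender never saw an ACK, so it may retransmit seq #'s 0 .. window-1.
    may_be_retransmitted = set(range(window))
    # Receiver got everything, so its window has moved on by `window` positions.
    accepted_as_new = {n % seq_space for n in range(window, 2 * window)}
    return bool(may_be_retransmitted & accepted_as_new)


print(is_ambiguous(seq_space=4, window=3))   # the example on the slide
safe_windows = [w for w in range(1, 4) if not is_ambiguous(4, w)]
print(safe_windows)
```

For four sequence numbers, windows of size 1 and 2 are safe and size 3 is not, which matches the usual rule that the window may be at most half the sequence-number space.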


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
    one sender, one receiver

reliable, in-order byte stream:
    no "message boundaries"

pipelined:
    TCP congestion and flow control set window size

send & receive buffers

full duplex data:
    bi-directional data flow in same connection
    MSS: maximum segment size

connection-oriented:
    handshaking (exchange of control msgs) initializes sender, receiver state before data exchange

flow controlled:
    sender will not overwhelm receiver

[Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads through its socket door.]


More TCP Details

Maximum Segment Size (MSS)
    Depends upon implementation (can often be set)
    The maximum amount of application-layer data in a segment
    Application data + TCP header = TCP segment

Three-way handshake
    Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
    Server responds with a second special TCP segment (again, no payload).
    Client responds with a third special segment. This one can contain payload.


Even More TCP Details

A TCP connection between client and server creates, in both client and server,
    (i) buffers,
    (ii) variables, and
    (iii) a socket connection to the process.

TCP exists only in the two end machines.
    No buffers or variables are allocated to the connection in any of the network elements between the client and the server.


TCP segment structure

Header fields, one 32-bit row per line:

    source port #                        | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F    | Receive window
    checksum                             | Urg data pointer
    Options (variable length)
    application data (variable length)

Notes on the fields:
    sequence number, acknowledgement number: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
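To make the layout concrete, the fixed 20-byte part of the header can be decoded with Python's standard struct module. This is a sketch under stated assumptions: the function name parse_tcp_header and the dictionary keys are hypothetical, options are skipped rather than decoded, and the example segment is hand-built with a placeholder checksum of 0.

```python
import struct

# Flag bits in the low six bits of the flags byte (U A P R S F, high to low).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment):
    """Decode the fixed header fields and return them with the payload."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4   # head len is counted in 32-bit words
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "payload": segment[header_len:],
    }


# Hand-built segment: ports 1234 -> 23, Seq=42, ACK=79, ACK flag set, one data byte.
example = struct.pack("!HHIIBBHHH", 1234, 23, 42, 79, 5 << 4, 0x10, 4096, 0, 0) + b"C"
fields = parse_tcp_header(example)
print(fields["seq"], fields["ack"], sorted(fields["flags"]), fields["payload"])
```

The "!" prefix selects network byte order with no padding, so the nine fields occupy exactly 20 bytes.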


TCP seq. #'s and ACKs

Seq. #'s:
    byte stream "number" of first byte in segment's data

ACKs:
    seq # of next byte expected from other side
    cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):
    Host A -> Host B: Seq=42, ACK=79, data = 'C'    (user types 'C')
    Host B -> Host A: Seq=79, ACK=43, data = 'C'    (host ACKs receipt of 'C', echoes back 'C')
    Host A -> Host B: Seq=43, ACK=80                (host ACKs receipt of echoed 'C')
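The numbers in the telnet scenario follow from two rules: a reply's sequence number is the ACK number it just received, and its ACK number is the received sequence number plus the number of data bytes received. A minimal sketch, with a hypothetical helper name, assuming in-order arrival and no loss:

```python
def reply_numbers(rcv_seq, rcv_ack, rcv_data_len):
    """Return (seq, ack) for the segment that answers a received segment."""
    seq = rcv_ack                  # the peer told us which byte it expects next
    ack = rcv_seq + rcv_data_len   # next byte we expect from the peer
    return seq, ack


# Host A sends Seq=42, ACK=79 with one byte of data ('C').
b_reply = reply_numbers(42, 79, 1)        # Host B's echo
a_reply = reply_numbers(*b_reply, 1)      # Host A's ACK of the one-byte echo
print(b_reply, a_reply)
```

The two results match the second and third segments in the scenario.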


TCP Round Trip Time and Timeout

Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
    SampleRTT will vary; want estimated RTT "smoother"
        average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
    longer than RTT
        but RTT varies
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) · EstimatedRTT + α · SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT.]


TCP Round Trip Time and Timeout

Setting the timeout
    EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
    first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) · DevRTT + β · |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set the timeout interval:

TimeoutInterval = EstimatedRTT + 4 · DevRTT
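The three formulas for EstimatedRTT, DevRTT and TimeoutInterval combine into a small estimator. The sketch below makes several assumptions that the slides do not specify: the class name RttEstimator, seeding EstimatedRTT with the first sample, starting DevRTT at zero, and updating EstimatedRTT before DevRTT. Real implementations may initialize and order these steps differently.

```python
class RttEstimator:
    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with the first sample
        self.dev_rtt = 0.0                  # assumption: start with zero deviation

    def update(self, sample_rtt):
        """Fold one new SampleRTT into the estimates and return the new timeout."""
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)   # values in milliseconds
timeout = estimator.update(120.0)
print(estimator.estimated_rtt, estimator.dev_rtt, timeout)
```

A single sample 20 ms above the estimate moves EstimatedRTT by only 2.5 ms but widens the safety margin by 17.5 ms, which is how the timeout reacts quickly to variation while the average stays smooth.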


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses a single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate acks

Initially consider a simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control


TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

Ack rcvd:
    If it acknowledges previously unacked segments:
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch (event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
    SendBase-1: last cumulatively ack'ed byte
Example:
    SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
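The simplified sender can be sketched in Python as an event-driven class. This is an illustration, not a TCP implementation: the class name SimpleTcpSender, the list standing in for "pass segment to IP", and the boolean timer flag are hypothetical, and sequence-number wraparound, flow control and congestion control are all ignored, as in the simplified sender itself.

```python
class SimpleTcpSender:
    def __init__(self, initial_seq):
        self.next_seq = initial_seq    # NextSeqNum
        self.send_base = initial_seq   # SendBase
        self.unacked = {}              # seq # -> data of not-yet-acknowledged segments
        self.timer_running = False
        self.wire = []                 # stand-in for "pass segment to IP"

    def on_app_data(self, data):
        self.unacked[self.next_seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((self.next_seq, data))
        self.next_seq += len(data)     # seq #'s count bytes, not segments

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)    # smallest not-yet-acknowledged seq #
            self.wire.append((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:         # y acknowledges new data
            self.send_base = y
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in done:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


# Two segments sent, the ACK for the first is lost, the cumulative ACK=120 arrives.
sender = SimpleTcpSender(initial_seq=92)
sender.on_app_data(b"x" * 8)      # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)     # Seq=100, 20 bytes
sender.on_ack(120)
print(sender.send_base, sender.unacked, [seq for seq, _ in sender.wire])
```

Because ACKs are cumulative, the single ACK=120 covers both segments and nothing is retransmitted.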


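TCP retransmission scenarios

[Figure: two timelines between Host A and Host B.

Lost ACK scenario: A sends Seq=92, 8 bytes data. B replies ACK=100, which is lost (X). The Seq=92 timer expires and A resends Seq=92, 8 bytes data. B replies ACK=100 again, and A sets SendBase = 100.

Premature timeout: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B replies ACK=100 and ACK=120, but the Seq=92 timer expires before the ACKs reach A, so A resends Seq=92, 8 bytes data. A's SendBase moves to 100, then to 120; B answers the duplicate segment with ACK=120, and SendBase stays at 120.]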


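TCP retransmission scenarios (more)

[Figure: timeline between Host A and Host B.

Cumulative ACK scenario: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B's ACK=100 is lost (X), but ACK=120 arrives before the timeout. A sets SendBase = 120 and retransmits nothing.]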


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
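The four event/action pairs above can be sketched as a decision function. This is a simplified model with several assumptions: the class name AckPolicyReceiver and the action labels are hypothetical, the 500 ms delayed-ACK timer is not modelled (the caller would have to fire it), and segments that start below the expected seq # are simply re-ACKed, a case the table does not cover.

```python
class AckPolicyReceiver:
    def __init__(self, expected_seq):
        self.expected = expected_seq   # seq # of the next in-order byte
        self.ack_pending = False       # True while a delayed ACK is being held back
        self.buffered = {}             # out-of-order segments: seq # -> length

    def on_segment(self, seq, length):
        """Return (action, ack_number) for an arriving segment."""
        if seq != self.expected:
            if seq > self.expected:    # gap detected: keep the segment for later
                self.buffered[seq] = length
            return ("send_duplicate_ack", self.expected)
        had_gap = bool(self.buffered)
        self.expected += length
        while self.expected in self.buffered:          # segment fills (part of) the gap
            self.expected += self.buffered.pop(self.expected)
        if had_gap or self.ack_pending:
            self.ack_pending = False
            return ("send_ack_now", self.expected)
        self.ack_pending = True        # first in-order segment: delay the ACK
        return ("delay_ack", self.expected)


receiver = AckPolicyReceiver(expected_seq=92)
events = [receiver.on_segment(92, 8), receiver.on_segment(100, 20),
          receiver.on_segment(140, 20), receiver.on_segment(120, 20)]
print(events)
```

The trace exercises all four rows in order: a delayed ACK, an immediate cumulative ACK, a duplicate ACK when the gap appears, and an immediate ACK when the gap is filled.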


More on Sender Policies

Doubling the Timeout Interval
    Used by most TCP implementations
    If a timeout occurs, then after the retransmission the Timeout Interval is doubled
    Intervals grow exponentially with each consecutive timeout
    When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    Limited form of congestion control
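The doubling rule is easy to state as arithmetic. In this small sketch the function name and the base interval of 1.5 seconds are illustrative assumptions; in practice the base value would come from EstimatedRTT and DevRTT.

```python
def timeout_after_backoff(base_interval, consecutive_timeouts):
    """Timeout interval in effect after a run of consecutive timeouts."""
    return base_interval * 2 ** consecutive_timeouts


intervals = [timeout_after_backoff(1.5, k) for k in range(4)]
print(intervals)
```

Each consecutive timeout doubles the wait, so a sender that keeps losing the same segment backs off quickly instead of flooding a congested path.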


                                                                  Fast Retransmit

                                                                  Time-out period often relatively long

                                                                  long delay before resending lost packet

                                                                  Detect lost segments via duplicate ACKs

Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs

If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:

fast retransmit: resend the segment before the timer expires


Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
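The duplicate-ACK counting above can be sketched as follows. The class name FastRetransmitSender is hypothetical, timer handling is left out, and clearing the counters when new data is acknowledged is an assumption of this sketch rather than something the algorithm states.

```python
class FastRetransmitSender:
    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = {}    # ACK value y -> number of duplicate ACKs seen for y
        self.resent = []      # seq #'s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            self.dup_acks.clear()   # assumption: counters restart on new data
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] = self.dup_acks.get(y, 0) + 1
            if self.dup_acks[y] == self.DUP_ACK_THRESHOLD:
                self.resent.append(y)   # fast retransmit of the segment starting at y


sender = FastRetransmitSender(send_base=92)
for y in [100, 100, 100, 100, 120]:
    sender.on_ack(y)
print(sender.resent, sender.send_base)
```

The first ACK=100 acknowledges new data; the next three are duplicates, and the third duplicate triggers the retransmission of the segment starting at byte 100 without waiting for the timer.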


TCP: GBN or Selective Repeat?

                                                                  Basic TCP looks a lot like GBN

                                                                  Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                  This looks a lot like Selective Repeat

                                                                  TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data.
If necessary, sender should slow down its transmission rate to accommodate the receiver's rate.
Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
speed-matching service: matching the send rate to the receiving app's drain rate
app process may be slow at reading from buffer

TCP segment structure

(figure: TCP segment layout, each row 32 bits wide)

    source port | dest port
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pointer
    Options (variable length)
    application data (variable length)

Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
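
As a small illustration of the bookkeeping above, the following Python sketch computes the advertised window and the sender-side check. The function names and the example byte counts are assumptions made for this sketch; only the variable names RcvBuffer, LastByteRcvd, LastByteRead, and RcvWindow come from the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    buffered = last_byte_rcvd - last_byte_read
    if not 0 <= buffered <= rcv_buffer:
        raise ValueError("buffered bytes must fit in the receive buffer")
    return rcv_buffer - buffered


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window


# Hypothetical numbers: 4096-byte buffer, 2000 bytes waiting for the application.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                     # spare room advertised to the sender
print(sender_may_send(5000, 4000, window, 1000))  # 1000 unACKed + 1000 new fits
print(sender_may_send(5000, 4000, window, 1200))  # 1000 unACKed + 1200 new does not
```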

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.

                                                                  Note on UDP

                                                                  UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.

TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
    Allocates buffers
    Can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

Message sequence:
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
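
The sequence and acknowledgement numbers exchanged in the three-way handshake can be traced with a short Python sketch. It only models the header fields named on the slide (SYN, seq, ack); the `Segment` class, the helper function names, and the example ISN values are assumptions for illustration, not part of any real socket API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    syn: int                   # SYN flag (1 or 0)
    seq: int                   # sequence number
    ack: Optional[int] = None  # acknowledgement number, if any


def make_syn(client_isn):
    """Step 1: client sends SYN carrying its initial sequence number."""
    return Segment(syn=1, seq=client_isn)


def make_synack(request, server_isn):
    """Step 2: server replies with its own ISN and ACKs client_isn + 1."""
    if request.syn != 1:
        raise ValueError("expected a SYN segment")
    return Segment(syn=1, seq=server_isn, ack=request.seq + 1)


def make_final_ack(request, reply):
    """Step 3: client ACKs server_isn + 1 with SYN=0."""
    if reply.syn != 1 or reply.ack != request.seq + 1:
        raise ValueError("SYNACK does not acknowledge our SYN")
    return Segment(syn=0, seq=request.seq + 1, ack=reply.seq + 1)


# Hypothetical initial sequence numbers.
step1 = make_syn(client_isn=1000)
step2 = make_synack(step1, server_isn=5000)
step3 = make_final_ack(step1, step2)
for segment in (step1, step2, step3):
    print(segment)
```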

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

(figure: client/server timeline of the close exchange: client FIN, server ACK, server FIN, client ACK; client enters timed wait, then closed)

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
    Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
    Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

(figure: client/server timeline of the close exchange: client FIN, server ACK, server FIN, client ACK; both sides closing; client enters timed wait; both sides closed)
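
The client's side of the four closing steps can be written as a small transition table. The sketch below is an illustration only: the state names are the usual TCP state names, while the event labels (`app_close`, `recv_ack`, `recv_fin`, `timer_expired`) and the function name are hypothetical choices made for this example.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_CLOSE_FSM = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "send FIN"),   # Step 1
    ("FIN_WAIT_1", "recv_ack"): ("FIN_WAIT_2", None),           # server ACKed our FIN
    ("FIN_WAIT_2", "recv_fin"): ("TIME_WAIT", "send ACK"),      # Step 3
    ("TIME_WAIT", "recv_fin"): ("TIME_WAIT", "send ACK"),       # FIN resent because our ACK was lost
    ("TIME_WAIT", "timer_expired"): ("CLOSED", None),           # closes down after timed wait
}


def run_client_close(events, state="ESTABLISHED"):
    """Feed events through the table; return the final state and the segments sent."""
    sent = []
    for event in events:
        key = (state, event)
        if key not in CLIENT_CLOSE_FSM:
            raise ValueError(f"event {event!r} not handled in state {state}")
        state, action = CLIENT_CLOSE_FSM[key]
        if action is not None:
            sent.append(action)
    return state, sent


final_state, sent = run_client_close(
    ["app_close", "recv_ack", "recv_fin", "recv_fin", "timer_expired"]
)
print(final_state, sent)
```

The second `recv_fin` models a retransmitted FIN arriving during the timed wait, which is why the client stays around to ACK it again.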

                                                                  TCP Connection Management (cont)

(figure: example TCP server lifecycle)

(figure: example TCP client lifecycle)

                                                                  A few special cases

                                                                  Have not discussed what happens if both client and server decide to close down connection at same time

                                                                  It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

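(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ is the offered load including retransmissions, and $\lambda_{out}$ is the delivered throughput.

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

(figure: three throughput plots, panels (a), (b), (c))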

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
    "elastic service"
    if sender's path "underloaded": sender should use available bandwidth
    if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    Signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

(figure: a window of Congwin segments)

w segments, each with MSS bytes, sent in one RTT:

    $\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
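
A quick numeric check of the window-based throughput formula, as a Python sketch. The function name and the example values (8 segments, 1000-byte MSS, 100 ms RTT) are assumptions chosen for illustration.

```python
def window_throughput(w, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    if rtt_seconds <= 0:
        raise ValueError("RTT must be positive")
    return w * mss_bytes / rtt_seconds


# Hypothetical example: 8 segments of 1000 bytes every 100 ms.
rate = window_throughput(w=8, mss_bytes=1000, rtt_seconds=0.1)
print(f"{rate:.0f} bytes/sec")
```

Doubling the window doubles the rate, which is why adjusting Congwin is enough to control the sending rate.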

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

    LastByteSent - LastByteAcked <= CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate acks
    TCP sender reduces rate (CongWin) after loss event

three mechanisms:
    AIMD = Additive Increase Multiplicative Decrease
    slow start = CongWin set to 1 and then grows exponentially
    conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing (also known as congestion avoidance)

(figure: sawtooth plot of congestion window (axis marks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection)
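
The sawtooth can be reproduced with a few lines of Python. This is a simplified sketch under stated assumptions: the window is counted in whole MSS units, one step is one RTT, and the RTTs in which a loss event occurs are supplied by the caller (`loss_at` is a hypothetical parameter, not something TCP knows in advance).

```python
def aimd_trace(start_mss, loss_at, rtts):
    """Return CongWin (in MSS) at the start of each RTT under AIMD."""
    cong_win = start_mss
    trace = []
    for rtt in range(rtts):
        trace.append(cong_win)
        if rtt in loss_at:
            cong_win = max(1, cong_win // 2)   # multiplicative decrease
        else:
            cong_win += 1                      # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(start_mss=8, loss_at={4}, rtts=8))
# → [8, 9, 10, 11, 12, 6, 7, 8]
```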

TCP Slow Start

When connection begins, CongWin = 1 MSS
    Example: MSS = 500 bytes & RTT = 200 msec
    initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to respectable rate
When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received
Summary: initial rate is slow but ramps up exponentially fast

(figure: Host A / Host B timeline; A sends one segment, then two segments, then four segments in successive RTTs)
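
The following Python sketch shows why incrementing CongWin once per ACK doubles it every RTT, and recomputes the 20 kbps initial rate from the slow start example (MSS = 500 bytes, RTT = 200 msec). The function names are hypothetical, and the model assumes every segment sent in an RTT is ACKed within that RTT with no loss.

```python
def slow_start_windows(rtts, mss=1):
    """CongWin at the start of each RTT, growing by one MSS per ACK received."""
    cong_win = mss
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks_this_rtt = cong_win // mss    # one ACK per segment in flight
        for _ in range(acks_this_rtt):
            cong_win += mss                # increment CongWin for every ACK
    return windows


def initial_rate_bps(mss_bytes, rtt_seconds):
    """Sending rate in bits/sec with a window of one MSS."""
    return mss_bytes * 8 / rtt_seconds


print(slow_start_windows(4))               # window in MSS units per RTT
print(initial_rate_bps(500, 0.2) / 1000)   # initial rate in kbps
```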

So Far
    Slow-Start ramps up exponentially
    Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
    Introduce new variable threshold
    threshold initially very large
    Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
    Two different types of loss events
        3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
        Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start

Reason for treating 3 dup ACKs differently than a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 after a loss event.

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS. (TCP Tahoe does this for 3 dup ACKs as well.)

                                                                  The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; If (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; Set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; Set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
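
The event/action table above maps directly onto a small state machine. The Python sketch below is a teaching model under stated assumptions, not a real TCP implementation: the class name `RenoSender`, its method names, and the default threshold are made up for this example, and CongWin is kept as a float measured in the same units as MSS.

```python
class RenoSender:
    """Toy model of the TCP Reno congestion-control events in the table."""

    def __init__(self, mss=1.0, threshold=64.0):
        self.mss = mss
        self.cong_win = mss
        self.threshold = threshold
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: only count it; the third one is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and restart from one MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"


sender = RenoSender(mss=1.0, threshold=4.0)
for _ in range(4):
    sender.on_new_ack()              # slow start until CongWin exceeds Threshold
print(sender.state, sender.cong_win)
for _ in range(3):
    sender.on_dup_ack()              # triple duplicate ACK
print(sender.state, sender.cong_win, sender.threshold)
sender.on_timeout()
print(sender.state, sender.cong_win, sender.threshold)
```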

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
    Ignore slow start
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
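
The 0.75 factor follows from a short calculation, sketched here under the assumption that between two loss events the window grows linearly from W/2 to W (additive increase), so the time-averaged throughput is the midpoint of the two extremes:

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
\]
```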

TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

    $\frac{1.22 \cdot MSS}{RTT \sqrt{L}}$

L = $2 \cdot 10^{-10}$ Wow
New versions of TCP for high-speed needed
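
The two numbers in this example can be rechecked with a short Python sketch. It assumes the loss-rate relation throughput = 1.22 · MSS / (RTT · sqrt(L)) and solves it for L; the function names are hypothetical.

```python
def required_window_segments(rate_bps, rtt_seconds, mss_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_seconds / (mss_bytes * 8)


def required_loss_rate(rate_bps, rtt_seconds, mss_bytes):
    """Loss rate L that satisfies rate = 1.22 * MSS / (RTT * sqrt(L))."""
    sqrt_l = 1.22 * mss_bytes * 8 / (rtt_seconds * rate_bps)
    return sqrt_l ** 2


window = required_window_segments(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(int(window))       # in-flight segments
print(f"{loss:.1e}")     # tolerable loss rate, on the order of 2e-10
```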

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

(figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R)

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
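
The convergence argument can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not in the slides: both sessions have the same RTT, both see every loss at the same time, and each adds one unit of rate per round. The function name and starting rates are hypothetical.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int):
    """Simulate two synchronized AIMD flows on one link; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves both rates,
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: additive increase moves both up by the same amount,
            # so the gap is unchanged (slope of 1 in the figure).
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = aimd_two_flows(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(abs(a - b))   # the gap shrinks toward zero, i.e. toward the equal-share line
```

Each loss halves the difference between the two rates while additive increase leaves it untouched, which is why the trajectory drifts toward the equal bandwidth share line.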

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
Instead use UDP: pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets about R/2 (11R/20)
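
The parallel-connection arithmetic can be written out directly. This sketch assumes the link is split equally per TCP connection, as the fairness goal states; the function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing: int, opened: int) -> Fraction:
    """Fraction of link rate R obtained by an app that opens `opened` connections
    on a link already carrying `existing` connections, assuming equal per-connection shares."""
    return Fraction(opened, existing + opened)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```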

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
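
Both cases can be folded into one function, sketched below. It assumes K is the number of windows that cover the object, taken here as ceil(O / (W·S)); that definition and the function name are assumptions made for illustration, and the sample values are hypothetical.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for a fixed congestion window of W segments, with no loss."""
    K = math.ceil(O / (W * S))            # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R       # wait per window if the ACK arrives late
    # Case 1 (stall <= 0): the sender never waits, so only 2RTT + O/R remains.
    # Case 2 (stall > 0): the sender waits after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * max(0.0, stall)


# Hypothetical values: S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2.0, W=4))   # case 1
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2.0, W=2))   # case 2
```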

Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case:
WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times
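
The example can be reproduced with a small sketch that finds K and Q by counting rather than by closed form. It assumes O > 0 and the slow-start model above (window doubles each round, no loss). The slide gives only O/S = 15; the values S/R = 1 s and RTT = 2 s are hypothetical, chosen so that Q = 2.

```python
def slow_start_latency(O: float, S: float, R: float, RTT: float):
    """Return (K, Q, P, latency) for the slow-start latency model with no loss."""
    # K: smallest k such that S * (2^0 + ... + 2^(k-1)) >= O
    K, sent = 0, 0
    while sent < O:
        sent += S * 2 ** K
        K += 1
    # Q: number of windows k whose idle time S/R + RTT - 2^(k-1) * S/R is positive
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# O/S = 15 segments, S/R = 1 s, RTT = 2 s (the last two are hypothetical)
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0))   # (4, 2, 2, 22.0)
```

With these values the server idles for 2 s after the first window and 1 s after the second, so the 22 s total is 2 RTT (4 s) plus O/R (15 s) plus 3 s of idle time.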

TCP Latency Modeling (3)

\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the kth window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the kth window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: same client/server timing diagram as the previous slide, with windows of S/R, 2S/R, 4S/R and 8S/R]

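TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, number of idles for infinite-size object, is similar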
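
As a sanity check on the closed form for K, the sketch below compares it with a direct count of doubling windows. It assumes O/S is a whole number of segments; both function names are hypothetical.

```python
import math


def windows_by_formula(segments: int) -> int:
    """K = ceil(log2(O/S + 1)), with O/S given as a whole number of segments."""
    return math.ceil(math.log2(segments + 1))


def windows_by_counting(segments: int) -> int:
    """Count windows of 1, 2, 4, ... segments until the object is covered."""
    k, covered = 0, 0
    while covered < segments:
        covered += 2 ** k
        k += 1
    return k


for n in (1, 3, 4, 15, 16):
    print(n, windows_by_formula(n), windows_by_counting(n))
```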

HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
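
The three response-time formulas can be compared side by side with a short sketch. It treats the sum of idle times as a single caller-supplied number (defaulting to zero), which is a simplification, so the results omit slow-start stalls. The function name is hypothetical, and 1 Kbyte is taken as 1000 bytes.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int, idle: float = 0.0):
    """Response times (seconds) for the three HTTP models; `idle` stands in for
    the sum of slow-start idle times."""
    transmit = (M + 1) * O / R          # time to push the base page plus M images
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }


# O = 5 Kbytes = 40,000 bits, R = 1 Mbps, RTT = 100 msec, M = 10, X = 5
for name, seconds in http_response_times(O=40_000, R=1e6, RTT=0.1, M=10, X=5).items():
    print(f"{name}: {seconds:.2f} s")
```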

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time, 0 to 20 seconds, at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time
Persistent connections only give minor improvement over parallel connections

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time, 0 to 70 seconds, at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays
Persistent connections now give important improvement, particularly in high delay-bandwidth networks

Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

rdt2.2: a NAK-free protocol

same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed (in rdt2.1, seq #s are included in data packets but not in ACKs/NAKs)
duplicate ACK at sender results in same action as NAK: retransmit current pkt

rdt2.2: sender, receiver fragments

Sender FSM fragment

State: Wait for call 0 from above
  event:  rdt_send(data)
  action: sndpkt = make_pkt(0, data, checksum)
          udt_send(sndpkt)
  next:   Wait for ACK 0

State: Wait for ACK 0
  event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
  action: udt_send(sndpkt)
  next:   Wait for ACK 0

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
  action: Λ (none)

Receiver FSM fragment

State: Wait for 0 from below
  event:  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
  action: udt_send(sndpkt)
  next:   Wait for 0 from below

Transition into Wait for 0 from below
  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
  action: extract(rcvpkt, data)
          deliver_data(data)
          sndpkt = make_pkt(ACK1, chksum)
          udt_send(sndpkt)


rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq. #'s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer


rdt3.0 sender (FSM transitions; Λ = no action)

state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK0
  rdt_rcv(rcvpkt)
    Λ

state: Wait for ACK0
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 1))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 0)
    stop_timer
    -> Wait for call 1 from above

state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK1
  rdt_rcv(rcvpkt)
    Λ

state: Wait for ACK1
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt, 0))
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt, 1)
    stop_timer
    -> Wait for call 0 from above
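The rdt3.0 sender can be sketched as a small event-driven class. This is only an illustration under stated assumptions: the class name Rdt30Sender and its method signatures are hypothetical, corruption is modelled as a boolean flag instead of a real checksum, the timer is a flag rather than a real clock, and refusing data while an ACK is outstanding is an assumption (the FSM does not define that event).

```python
class Rdt30Sender:
    """Alternating-bit (stop-and-wait) sender modelled on the rdt3.0 FSM."""

    def __init__(self, udt_send):
        self.udt_send = udt_send       # callable that puts a packet on the unreliable channel
        self.seq = 0                   # sequence number of the next packet (0 or 1)
        self.waiting_for_ack = False   # False: "Wait for call"; True: "Wait for ACK"
        self.sndpkt = None             # last packet sent, kept for retransmission
        self.timer_running = False     # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        """Call from above. Returns False while an ACK is outstanding (assumption)."""
        if self.waiting_for_ack:
            return False
        self.sndpkt = (self.seq, data)
        self.udt_send(self.sndpkt)
        self.timer_running = True      # start_timer
        self.waiting_for_ack = True
        return True

    def rdt_rcv(self, ack_seq, corrupt=False):
        """An ACK arrives from below."""
        if not self.waiting_for_ack:
            return                     # no action: nothing is outstanding
        if corrupt or ack_seq != self.seq:
            return                     # no action: rely on the timeout to retransmit
        self.timer_running = False     # stop_timer
        self.waiting_for_ack = False
        self.seq ^= 1                  # alternate 0 <-> 1

    def timeout(self):
        """The countdown timer expired: retransmit and restart the timer."""
        if self.waiting_for_ack:
            self.udt_send(self.sndpkt)
            self.timer_running = True


channel = []
sender = Rdt30Sender(channel.append)
sender.rdt_send("hello")   # sends (0, "hello")
sender.timeout()           # no ACK in time: (0, "hello") is sent again
sender.rdt_rcv(0)          # ACK0 arrives: sender moves on to sequence number 1
print(channel, sender.seq)
```

Note that a corrupt or wrong-numbered ACK triggers nothing here; only the timeout causes a retransmission, which matches the "retransmissions only triggered by timeouts" rule.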

rdt3.0 in action
[figure]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U_sender: utilization, the fraction of time sender is busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
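The arithmetic on this slide can be checked with a few lines of Python. This is a sketch only: the function name stop_and_wait is hypothetical, and 1 KB is taken as 8000 bits, as the slide's 8 kb/pkt figure implies.

```python
def stop_and_wait(L_bits, R_bps, rtt_s):
    """Return (transmit time in s, sender utilization, throughput in bytes/sec)."""
    t_transmit = L_bits / R_bps          # time to push one packet onto the link
    cycle = rtt_s + t_transmit           # stop-and-wait sends one packet per RTT + L/R
    utilization = t_transmit / cycle     # fraction of time the sender is busy sending
    throughput = (L_bits / 8) / cycle    # bytes delivered per second
    return t_transmit, utilization, throughput


# Slide example: 1 KB packet, 1 Gbps link, RTT = 2 * 15 ms propagation delay.
t, u, thr = stop_and_wait(L_bits=8000, R_bps=1e9, rtt_s=0.030)
print(f"T_transmit = {t * 1e6:.0f} microsec")
print(f"U_sender   = {u:.5f}")
print(f"throughput = {thr / 1000:.0f} kB/sec")
```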

rdt3.0: stop-and-wait operation

[figure: sender/receiver timeline]
  first packet bit transmitted, t = 0
  last packet bit transmitted, t = L/R
  first packet bit arrives
  last packet bit arrives, send ACK
  ACK arrives, send next packet, t = RTT + L/R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver


Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[figure: sender/receiver timeline]
  first packet bit transmitted, t = 0
  last bit transmitted, t = L/R
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  ACK arrives, send next packet, t = RTT + L/R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
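The factor-of-3 claim generalizes to any number of in-flight packets. The sketch below uses a hypothetical helper, pipelined_utilization, and assumes utilization is capped at 1.0 because a sender cannot be busy more than 100% of the time.

```python
def pipelined_utilization(n_in_flight, L_bits, R_bps, rtt_s):
    """Sender utilization when n_in_flight packets are sent per RTT + L/R."""
    t_transmit = L_bits / R_bps
    return min(1.0, n_in_flight * t_transmit / (rtt_s + t_transmit))


# Same link as the stop-and-wait example: 8000-bit packets, 1 Gbps, RTT = 30 ms.
u1 = pipelined_utilization(1, 8000, 1e9, 0.030)
u3 = pipelined_utilization(3, 8000, 1e9, 0.030)
print(f"1 pkt: {u1:.5f}   3 pkts: {u3:.5f}   ratio: {u3 / u1:.1f}")
```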


Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed

  ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
    may receive duplicate ACKs (see receiver)

  Only one timer: for oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol


GBN Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.

This is the only event that triggers a resend.


GBN sender extended FSM (single state: Wait; Λ = no action)

initially
  base = 1
  nextseqnum = 1

rdt_send(data)
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  Λ
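A Python rendering of the GBN sender FSM may make the window bookkeeping easier to follow. It is a sketch under assumptions: the class name GBNSender is hypothetical, sequence numbers are unbounded integers (no k-bit wraparound), the timer is a flag, and base is never moved backwards by a stale ACK, which is a small deviation from the literal FSM.

```python
class GBNSender:
    """Go-Back-N sender with a window of N unacknowledged packets."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send     # callable that puts a packet on the channel
        self.base = 1                # oldest unacknowledged sequence number
        self.nextseqnum = 1          # next sequence number to use
        self.sndpkt = {}             # seq -> packet, kept until cumulatively ACKed
        self.timer_running = False

    def rdt_send(self, data):
        """Call from above. Returns False (refuse_data) when the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True            # start_timer for the oldest packet
        self.nextseqnum += 1
        return True

    def timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])

    def rdt_rcv(self, acknum, corrupt=False):
        """Cumulative ACK(acknum): everything up to and including acknum arrived."""
        if corrupt:
            return                               # no action
        new_base = max(self.base, acknum + 1)    # assumption: ignore stale ACKs
        for seq in range(self.base, new_base):
            self.sndpkt.pop(seq, None)           # free acknowledged packets
        self.base = new_base
        self.timer_running = self.base != self.nextseqnum   # stop or restart timer


out = []
gbn = GBNSender(window_size=3, udt_send=out.append)
accepted = [gbn.rdt_send(d) for d in "abcd"]   # the fourth call finds the window full
print(accepted, [seq for seq, _ in out])
```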


GBN receiver extended FSM (single state: Wait)

initially
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default
  udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #
  may generate duplicate ACKs


More on receiver

The receiver always sends ACK for last correctly received packet with highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  Need only remember expectedseqnum
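The receiver behaviour described above fits in a few lines. The sketch below is illustrative only: GBNReceiver, its callbacks, and the ("ACK", n) tuple format are hypothetical, and corruption is modelled as a flag rather than a checksum.

```python
class GBNReceiver:
    """Go-Back-N receiver: delivers in-order packets, re-ACKs everything else."""

    def __init__(self, udt_send, deliver_data):
        self.udt_send = udt_send          # callable that sends an ACK to the sender
        self.deliver_data = deliver_data  # callable that passes data to the upper layer
        self.expectedseqnum = 1           # the only state the receiver must remember
        self.sndpkt = ("ACK", 0)          # ACK to repeat until packet 1 arrives

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.deliver_data(data)
            self.sndpkt = ("ACK", self.expectedseqnum)
            self.expectedseqnum += 1
        # Default case: out-of-order or corrupt packets are discarded (not buffered)
        # and the ACK for the highest in-order packet is sent again.
        self.udt_send(self.sndpkt)


acks, delivered = [], []
rcv = GBNReceiver(acks.append, delivered.append)
rcv.rdt_rcv(1, "a")
rcv.rdt_rcv(3, "c")   # out of order: discarded, ACK 1 is repeated
rcv.rdt_rcv(2, "b")
print(acks, delivered)
```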

GBN in action

                                                                    GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data

                                                                    Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets


Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  Compare to GBN, which only had timer for base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: Window size < seq # range

Selective repeat: sender, receiver windows
[figure]

Selective repeat

sender
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a reACK)
  otherwise:
    ignore
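The three receiver cases can be sketched in Python as follows. The class SRReceiver and its callbacks are hypothetical names, and sequence numbers are treated as unbounded integers, so the wraparound issue of a finite sequence-number space does not arise in this sketch.

```python
class SRReceiver:
    """Selective-repeat receiver: ACKs each packet, buffers out-of-order ones."""

    def __init__(self, window_size, send_ack, deliver_data):
        self.N = window_size
        self.send_ack = send_ack          # callable taking the seq # being ACKed
        self.deliver_data = deliver_data  # callable passing data to the upper layer
        self.rcvbase = 0                  # lowest seq # not yet received
        self.buffer = {}                  # seq -> data for packets received inside the window

    def on_packet(self, n, data):
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.send_ack(n)
            self.buffer.setdefault(n, data)
            # Deliver the in-order run starting at rcvbase and slide the window.
            while self.rcvbase in self.buffer:
                self.deliver_data(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
        elif self.rcvbase - self.N <= n < self.rcvbase:
            self.send_ack(n)              # re-ACK: the earlier ACK may have been lost
        # otherwise: ignore


acks, delivered = [], []
sr = SRReceiver(4, acks.append, delivered.append)
for seq, data in [(0, "a"), (2, "c"), (1, "b"), (0, "a")]:
    sr.on_packet(seq, data)
print(acks, delivered, sr.rcvbase)
```

Packet 2 is buffered until packet 1 arrives, and the duplicate of packet 0 is acknowledged again but not delivered a second time.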

Selective repeat in action
[figure]

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is relationship between seq # size and window size?
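One way to explore the question is to look at the worst case: the receiver has accepted a full window and moved on, but every ACK was lost, so the sender may still retransmit the old window. The helper below (a hypothetical name, ambiguous) checks whether those old sequence numbers collide with the receiver's new window. It is a sketch of that single worst-case argument, not a full protocol simulation.

```python
def ambiguous(seq_space, window):
    """True if a retransmitted old packet can look like a new one to the receiver."""
    # Sequence numbers the sender may still retransmit (all of their ACKs were lost).
    old_window = {n % seq_space for n in range(0, window)}
    # Sequence numbers the receiver now accepts as new data.
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


print(ambiguous(seq_space=4, window=3))   # the slide's example
print(ambiguous(seq_space=4, window=2))
```

Under this argument the two windows stay disjoint exactly when the window size is at most half the sequence-number space, which is the usual answer given for selective repeat.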


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[figure: application writes data through socket door into TCP send buffer; segment; TCP receive buffer; application reads data through socket door]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in segment
  Application Data + TCP Header = TCP Segment

Three-way Handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment. This can contain payload.


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables and
  (iii) a socket connection to process

TCP only exists in the two end machines
  No buffers and variables allocated to the connection in any of the network elements between the host and server


TCP segment structure

Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
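To make the layout concrete, the sketch below unpacks the 20-byte fixed header with Python's struct module. The function name parse_tcp_segment and the example values are illustrative assumptions, and the checksum in the example is a placeholder 0, not a computed Internet checksum.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / not used / flags" field.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_segment(segment: bytes) -> dict:
    """Split a raw TCP segment into fixed header fields, options, and data."""
    (src_port, dst_port, seq, ack,
     offset_flags, rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4      # head len is counted in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": {name for name, bit in FLAG_BITS.items() if offset_flags & bit},
        "rcv_window": rcv_window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# Example: a segment with no options (head len = 5 words), ACK flag set, one data byte.
raw = struct.pack("!HHIIHHHH", 1234, 23, 42, 79, (5 << 12) | 0x10, 4096, 0, 0) + b"C"
fields = parse_tcp_segment(raw)
print(fields["seq"], fields["ack"], fields["flags"], fields["data"])
```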


TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how receiver handles out-of-order segments
  A: TCP spec doesn't say - up to implementer

simple telnet scenario (Host A, Host B; time runs downward)
  User types 'C'
    A -> B: Seq=42, ACK=79, data = 'C'
  host ACKs receipt of 'C', echoes back 'C'
    B -> A: Seq=79, ACK=43, data = 'C'
  host ACKs receipt of echoed 'C'
    A -> B: Seq=43, ACK=80


TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss


                                                                    TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125
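Unrolling the recurrence shows why the influence of past samples decays exponentially. In the sketch below, E_n stands for EstimatedRTT after n samples, S_n for the n-th SampleRTT, and E_0 for the initial estimate; these symbols are shorthand introduced here, not notation from the slides.

```latex
\begin{aligned}
E_n &= (1-\alpha)\,E_{n-1} + \alpha\,S_n \\
    &= \alpha\,S_n + \alpha(1-\alpha)\,S_{n-1} + \alpha(1-\alpha)^2\,S_{n-2} + \dots + (1-\alpha)^n\,E_0 \\
    &= \alpha \sum_{k=0}^{n-1} (1-\alpha)^k\,S_{n-k} + (1-\alpha)^n\,E_0
\end{aligned}
```

With α = 0.125, the most recent sample has weight 0.125, the one before it about 0.109, the next about 0.096, and so on, each weight being 0.875 times the previous one.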

Example RTT estimation
[figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT, Estimated RTT]

                                                                    TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate of how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4*DevRTT
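The two update rules and the timeout formula can be combined into a small estimator. This is a sketch with assumptions that the slides do not specify: the class name RttEstimator is hypothetical, the first sample initializes EstimatedRTT with DevRTT starting at 0, and DevRTT is updated before EstimatedRTT so that the deviation is measured against the previous estimate.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumption: seed with the first SampleRTT
        self.dev_rtt = 0.0                  # assumption: no deviation known yet

    def update(self, sample_rtt):
        """Fold in one SampleRTT (taken from a segment that was not retransmitted)."""
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # times in milliseconds
est.update(120.0)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval)
```

A single 120 ms sample after a 100 ms start moves the estimate only slightly but widens the safety margin, which is the intended effect of the 4*DevRTT term.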


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
  Pipelined segments
  Cumulative acks
  TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch(event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
    } /* end of loop forever */

Comment:
- SendBase-1: last cumulatively ack'ed byte

Example:
- SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
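The simplified sender above can be sketched as runnable Python. This is a toy model under stated assumptions, not a real TCP stack: the class name `SimplifiedTcpSender`, the `wire` list standing in for "pass segment to IP", the `unacked` dictionary, and the boolean `timer_running` used in place of a real timer are all hypothetical names introduced for the example.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified sender: no dup ACKs, flow or congestion control."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.timer_running = False   # stand-in for the single retransmission timer
        self.unacked = {}            # seq # of first byte -> payload (hypothetical bookkeeping)
        self.wire = []               # (seq, payload) pairs "passed to IP" (hypothetical)

    def on_app_data(self, data):
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout. Resend the oldest unacked segment, restart timer."""
        if self.unacked:
            seq = min(self.unacked)
            self.wire.append((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, y):
        """Event: ACK received with ACK field value y (cumulative)."""
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte is below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Keep the timer running only if something is still outstanding.
            self.timer_running = bool(self.unacked)
```

With an initial sequence number of 92, sending 8 bytes and then 20 bytes leaves `next_seq_num` at 120, which matches the Seq=92 and Seq=100 segments used in the retransmission scenarios.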

TCP retransmission scenarios

[Figure: two Host A / Host B timelines.
Lost ACK scenario: A sends Seq=92, 8 bytes data; B's ACK=100 is lost (X); the Seq=92 timer expires and A resends Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.
Premature timeout: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B replies ACK=100 and ACK=120; the Seq=92 timer expires before the ACKs arrive, so A resends Seq=92, 8 bytes data; B replies ACK=120; SendBase = 100, then 120.]

TCP retransmission scenarios (more)

[Figure: Cumulative ACK scenario. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B's ACK=100 is lost (X), but ACK=120 reaches A before the timeout, so nothing is retransmitted; SendBase = 120.]

TCP ACK generation [RFC 1122, RFC 2581]

1. Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending.
   TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

3. Event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

4. Event at receiver: arrival of segment that partially or completely fills gap.
   TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.
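The four receiver cases above can be sketched as a small decision function. This is an illustrative model only: the class `AckGenerator`, the action constants, and the `out_of_order` dictionary are hypothetical names, segments are assumed not to overlap, and the handling of an already-received (old) segment is an assumption, since the table does not cover it.

```python
DELAYED_ACK = "delayed ACK (wait up to 500 ms)"
CUMULATIVE_ACK = "immediate cumulative ACK"
DUPLICATE_ACK = "immediate duplicate ACK"
IMMEDIATE_ACK = "immediate ACK"


class AckGenerator:
    """Toy receiver that picks an ACK action for each arriving segment."""

    def __init__(self, expected_seq):
        self.expected_seq = expected_seq   # seq # of next expected byte
        self.ack_pending = False           # True while a delayed ACK is waiting
        self.out_of_order = {}             # buffered segments: seq -> length

    def on_segment(self, seq, length):
        """Return (action, ack_number) for a segment of `length` bytes at `seq`."""
        if seq > self.expected_seq:
            # Case 3: gap detected, buffer the segment and re-ACK the expected byte.
            self.out_of_order[seq] = length
            self.ack_pending = False
            return DUPLICATE_ACK, self.expected_seq
        if seq < self.expected_seq:
            # Old data: not in the table; re-ACKing is an assumption of this sketch.
            return DUPLICATE_ACK, self.expected_seq
        had_gap = bool(self.out_of_order)
        self.expected_seq = seq + length
        # Absorb any buffered segments that are now contiguous.
        while self.expected_seq in self.out_of_order:
            self.expected_seq += self.out_of_order.pop(self.expected_seq)
        if had_gap:
            # Case 4: segment starts at the lower end of the gap.
            self.ack_pending = False
            return IMMEDIATE_ACK, self.expected_seq
        if self.ack_pending:
            # Case 2: one other in-order segment is waiting for its ACK.
            self.ack_pending = False
            return CUMULATIVE_ACK, self.expected_seq
        # Case 1: everything so far is ACKed, so delay this ACK.
        self.ack_pending = True
        return DELAYED_ACK, self.expected_seq
```

A real receiver would also need a timer that sends the pending ACK after the 500 ms wait; that part is left out here.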

                                                                    More on Sender Policies

Doubling the Timeout Interval
- Used by most TCP implementations
- If a timeout occurs, then after retransmission the Timeout Interval is doubled
- Intervals grow exponentially with each consecutive timeout
- When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
- Limited form of congestion control
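The doubling rule can be illustrated with a short sketch. It assumes the usual formula TimeoutInterval = EstimatedRTT + 4 * DevRTT for the reset value; the class and method names are hypothetical, and times are kept as integer milliseconds to keep the arithmetic exact.

```python
class RetransmissionTimer:
    """Toy timer interval with exponential backoff on consecutive timeouts."""

    def __init__(self, estimated_rtt_ms, dev_rtt_ms):
        self.estimated_rtt_ms = estimated_rtt_ms
        self.dev_rtt_ms = dev_rtt_ms
        self.interval_ms = self.base_interval_ms()

    def base_interval_ms(self):
        # Assumed reset formula: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt_ms + 4 * self.dev_rtt_ms

    def on_timeout(self):
        """After a timeout-triggered retransmission, double the interval."""
        self.interval_ms *= 2
        return self.interval_ms

    def on_new_data_or_ack(self):
        """Timer restarted for new data or a received ACK: reset the interval."""
        self.interval_ms = self.base_interval_ms()
        return self.interval_ms
```

With EstimatedRTT = 100 ms and DevRTT = 25 ms the interval starts at 200 ms, becomes 400 ms and then 800 ms after two consecutive timeouts, and returns to 200 ms once an ACK arrives.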

                                                                    Fast Retransmit

- Time-out period often relatively long:
  - long delay before resending lost packet
- Detect lost segments via duplicate ACKs
  - Sender often sends many segments back-to-back
  - If segment is lost, there will likely be many duplicate ACKs
- If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  - fast retransmit: resend segment before timer expires

Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {   /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y == 3) {
                resend segment with sequence number y   /* fast retransmit */
            }
        }
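The ACK handler above can be sketched in Python as follows. This is a minimal illustration: the class `FastRetransmitSender`, the `segments` dictionary and the `retransmitted` list are hypothetical, timers are omitted, and ACK values are assumed to fall on segment boundaries.

```python
class FastRetransmitSender:
    """Toy ACK handler that counts duplicate ACKs and triggers fast retransmit."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0          # duplicate ACKs seen for the current send_base
        self.segments = {}         # not-yet-acknowledged segments: seq -> payload
        self.retransmitted = []    # seq numbers resent by fast retransmit

    def send(self, seq, payload):
        self.segments[seq] = payload

    def on_ack(self, y):
        if y > self.send_base:
            # New data is acknowledged: advance the base and forget old duplicates.
            self.send_base = y
            self.dup_acks = 0
            for seq in [s for s in self.segments if s < y]:
                del self.segments[seq]
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == self.DUP_ACK_THRESHOLD and y in self.segments:
                self.retransmitted.append(y)   # resend segment with seq # y
```

Only the third duplicate triggers a resend; later duplicates for the same `y` are counted but do not retransmit again in this sketch.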

TCP: GBN or Selective Repeat?

                                                                    Basic TCP looks a lot like GBN

                                                                    Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                    This looks a lot like Selective Repeat

                                                                    TCP is a hybrid

                                                                    TCP Flow Control

- Sender should not overwhelm receiver's capacity to receive data
- If necessary, sender should slow down transmission rate to accommodate receiver's rate
- Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

- flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
- receive side of TCP connection has a receive buffer
- speed-matching service: matching the send rate to the receiving app's drain rate
- app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom:
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)]

Field notes:
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
- sequence and acknowledgement numbers: counting by bytes of data (not segments)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

- spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
  - guarantees receive buffer doesn't overflow
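The window arithmetic above can be written out as two small functions. The function names are hypothetical; the variables mirror RcvBuffer, LastByteRcvd, LastByteRead and RcvWindow from the slide, and `last_byte_sent` / `last_byte_acked` are assumed sender-side counters for the amount of unACKed data.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    return (last_byte_sent + nbytes) - last_byte_acked <= window
```

For example, with a 4096-byte buffer, 3000 bytes received and 1000 bytes read by the application, the advertised window is 4096 - (3000 - 1000) = 2096 bytes.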

                                                                    Technical Issue

- Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer
- Sender does not transmit new packets until it hears RcvWindow > 0
- Receiver never sends RcvWindow > 0, since it has no new ACKs to send to sender
- DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender.
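A rough simulation of the one-byte probing idea is sketched below. It is a simplification under stated assumptions: time is modelled as discrete rounds with one probe per round, `app_reads` is a hypothetical list of how many bytes the receiving application reads before each probe arrives, and the probe byte itself is not added to the buffer.

```python
def probes_until_window_opens(rcv_buffer, buffered, app_reads):
    """Return (probes sent, advertised window) once an ACK reports RcvWindow > 0.

    Returns None if the window is still zero after all listed rounds.
    """
    for probes, nread in enumerate(app_reads, start=1):
        # The receiving application drains some bytes from the buffer.
        buffered = max(0, buffered - nread)
        # The ACK for this probe carries the current spare room.
        window = rcv_buffer - buffered
        if window > 0:
            return probes, window
    return None
```

With a full 4096-byte buffer and an application that reads nothing for two rounds and then 512 bytes, the third probe's ACK reports a window of 512 bytes, and the sender can resume.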

                                                                    Note on UDP

                                                                    UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost.

                                                                    TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
- initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
    Socket clientSocket = new Socket("hostname", "port number");
- server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
- specifies client_isn, the initial seq #
- No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
- ACKs received SYN
- allocates buffers
- Replies with client_isn+1 in ACK field to signal synchronization
- Specifies server_isn
- No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
- Allocates buffers
- Can include application data
- SYN=0 signals that connection is established
- server_isn+1 signals that client is synchronized

[Figure: three-way handshake timeline.
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)]
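The three segments of the handshake can be generated with a few lines of Python to make the sequence and acknowledgement numbers concrete. The `Segment` tuple and the function name are hypothetical, only the SYN flag and the two number fields are modelled, and the modulo reflects the 32-bit width of TCP sequence numbers.

```python
from typing import NamedTuple, Optional

SEQ_MODULUS = 2 ** 32   # sequence numbers are 32-bit and wrap around


class Segment(NamedTuple):
    syn: int
    seq: int
    ack: Optional[int]   # None when the ACK field is not valid


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments for the given initial seq numbers."""
    syn = Segment(syn=1, seq=client_isn, ack=None)
    synack = Segment(syn=1, seq=server_isn, ack=(syn.seq + 1) % SEQ_MODULUS)
    ack = Segment(syn=0,
                  seq=(client_isn + 1) % SEQ_MODULUS,
                  ack=(synack.seq + 1) % SEQ_MODULUS)
    return [syn, synack, ack]
```

For client_isn = 1000 and server_isn = 5000, the SYNACK carries ack = 1001 and the final ACK carries seq = 1001, ack = 5001.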

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection-close timeline. client -> server: FIN (client: close); server -> client: ACK; server -> client: FIN (server: close); client -> server: ACK; client: timed wait, then closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
- Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
- Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: connection-close timeline. client -> server: FIN (client: closing); server -> client: ACK; server -> client: FIN (server: closing); client -> server: ACK; client: timed wait, then closed; server: closed.]
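The client side of the four closing steps can be sketched as a small state table. The state names follow the usual TCP state diagram (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) and are not given in the text above; the event strings and the function name are hypothetical labels chosen for this example.

```python
# (state, event) -> (next state, segment to send or None)
CLIENT_CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),      # Step 1: client sends FIN
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),      # Step 2: server ACKs the FIN
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "ACK"),      # Step 3: ACK the server's FIN
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "ACK"),       # FIN resent because ACK was lost
    ("TIME_WAIT", "timed wait over"): ("CLOSED", None),    # close down after timed wait
}


def run_client_close(events, state="ESTABLISHED"):
    """Replay events and return (final state, list of segments the client sent)."""
    sent = []
    for event in events:
        state, segment = CLIENT_CLOSE_TRANSITIONS[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent
```

The extra `("TIME_WAIT", "recv FIN")` entry is what the timed wait is for: if the client's ACK is lost and the server resends its FIN, the client is still around to ACK it again.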

TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                    A few special cases

                                                                    Have not discussed what happens if both client and server decide to close down connection at same time

                                                                    It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

                                                                    Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for network to handle"
- different from flow control
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem

Causes/costs of congestion: scenario 1

- two senders, two receivers
- one router, infinite buffers
- no retransmission
- Send rate: 0 to C/2

Results:
- large delays when congested
- maximum achievable throughput

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet

Here λ_in is the rate of original data sent and λ'_in the rate of original plus retransmitted data.

- (a), (b) & (c): always λ_in = λ_out (goodput)
- (a) Magic transmission: only send when there's space in buffer
- (b) "perfect" retransmission, only when loss: λ'_in > λ_out
- (c) retransmission of delayed (not lost) packet makes λ'_in larger (than perfect case) for same λ_out

"costs" of congestion:
- (b) and (c): more work (retrans) for given "goodput"
- (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: throughput plots for cases (a), (b) and (c)]

Causes/costs of congestion: scenario 3

- four senders
- multihop paths
- timeout/retransmit

Q: what happens as λ_in and λ'_in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted.

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
- "elastic service"
- if sender's path "underloaded": sender should use available bandwidth
- if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
- EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
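The ER and EFCI rules above amount to a running minimum along the path and a logical OR at the destination. The sketch below illustrates this with plain numbers: the function names are hypothetical, rates are in arbitrary units, and real ATM cell formats are not modelled.

```python
def forward_rm_cell(initial_er, switch_supportable_rates):
    """Return the ER value after the RM cell has passed every switch on the path."""
    er = initial_er
    for rate in switch_supportable_rates:
        # A congested switch may lower ER in the cell; it never raises it.
        er = min(er, rate)
    return er


def ci_bit_on_return(ci_set_by_switch, preceding_data_cell_efci):
    """CI bit in the RM cell as returned to the source by the destination."""
    # The destination sets CI=1 if the data cell before the RM cell had EFCI=1.
    return 1 if (ci_set_by_switch or preceding_data_cell_efci) else 0
```

If the sender writes ER = 100 and the switches on the path can support 80, 120 and 60, the cell comes back with ER = 60, the minimum supportable rate on the path.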

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin spanning a window of segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
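The window-limited rate above can be sketched as a one-line calculation. This is an illustrative helper only; the name `window_throughput` and its parameter names are assumptions, not part of the slides.

```python
def window_throughput(w_segments: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Bytes per second when w segments of MSS bytes each are sent in one RTT."""
    return w_segments * mss_bytes / rtt_seconds


if __name__ == "__main__":
    # Hypothetical numbers: 8 segments of 500 bytes per 0.5 s round trip.
    print(window_throughput(8, 500, 0.5), "bytes/sec")
```

Doubling the window doubles the rate, and doubling the RTT halves it, which is why CongWin is the sender's control knob.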

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
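A minimal sketch of the AIMD rule, tracking CongWin in whole MSS units once per RTT. The function name `aimd_trace`, the idea of passing the lossy rounds as a set, and the integer halving are assumptions made for illustration.

```python
def aimd_trace(start_mss: int, loss_rounds: set, rounds: int) -> list:
    """Return CongWin (in MSS) at the start of each RTT round."""
    congwin = start_mss
    trace = []
    for r in range(rounds):
        trace.append(congwin)
        if r in loss_rounds:
            # Multiplicative decrease: halve, but never go below 1 MSS.
            congwin = max(1, congwin // 2)
        else:
            # Additive increase: one more MSS per loss-free RTT.
            congwin += 1
    return trace


if __name__ == "__main__":
    # A single loss in round 4 produces one tooth of the sawtooth.
    print(aimd_trace(8, {4}, 8))  # -> [8, 9, 10, 11, 12, 6, 7, 8]
```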

TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate
When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received
Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline; Host A sends one segment, then two segments, then four segments in successive RTTs]
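The doubling can be reproduced by applying the per-ACK increment directly. The sketch below assumes one ACK per segment (no delayed ACKs) and no loss; `slow_start_rounds` is a hypothetical name. It uses the slide's example values of MSS = 500 bytes and RTT = 200 msec.

```python
def slow_start_rounds(mss_bytes: int, rtt_seconds: float, rounds: int) -> list:
    """Return (CongWin in MSS, rate in bits/sec) for each RTT of slow start."""
    congwin = 1  # connection begins with CongWin = 1 MSS
    history = []
    for _ in range(rounds):
        rate_bps = congwin * mss_bytes * 8 / rtt_seconds
        history.append((congwin, rate_bps))
        acks_this_round = congwin  # assumption: every segment is ACKed once
        for _ in range(acks_this_round):
            congwin += 1  # +1 MSS per ACK, which doubles CongWin per RTT
    return history


if __name__ == "__main__":
    for window, rate in slow_start_rounds(500, 0.2, 4):
        print(f"CongWin = {window} MSS, rate = {rate / 1000:.0f} kbps")
```

The first round gives the 20 kbps initial rate from the example; each later round doubles it.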

So far
  Slow-Start ramps up exponentially
  Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
  Introduce new variable: threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
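The four rules can be compared for Reno and Tahoe with a coarse per-RTT trace. This is a sketch under stated assumptions: CongWin is counted in whole MSS, each RTT carries exactly one event ("ok", "dupack", or "timeout"), and slow-start doubling is capped at Threshold. The name `congwin_trace` and its parameters are hypothetical.

```python
def congwin_trace(events: list, reno: bool = True, threshold: int = 64) -> list:
    """Return CongWin (in MSS) at the start of each RTT for a list of events."""
    congwin = 1
    trace = []
    for event in events:
        trace.append(congwin)
        if event == "ok":
            if congwin < threshold:
                congwin = min(2 * congwin, threshold)  # slow start
            else:
                congwin += 1                           # congestion avoidance
        else:
            threshold = max(congwin // 2, 1)
            if event == "dupack" and reno:
                congwin = threshold                    # Reno: skip slow start
            else:
                congwin = 1                            # timeout, or any loss in Tahoe
    return trace


if __name__ == "__main__":
    events = ["ok"] * 4 + ["dupack"] + ["ok"] * 2
    print("Reno: ", congwin_trace(events, reno=True, threshold=8))
    print("Tahoe:", congwin_trace(events, reno=False, threshold=8))
```

With the same triple-duplicate-ACK loss, the Reno trace resumes from half the window while the Tahoe trace restarts from 1 MSS.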


                                                                    The Big Picture

TCP sender congestion control

Columns: Event | State | TCP Sender Action | Commentary

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
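The table reads naturally as an event-driven state machine. The sketch below is one possible rendering, not a real TCP implementation: the class name `RenoSender`, its method names, and the default values are assumptions, and only the CongWin/Threshold bookkeeping from the table is modelled (no retransmission, timers, or sequence numbers).

```python
from dataclasses import dataclass


@dataclass
class RenoSender:
    mss: int = 1000                   # bytes (hypothetical default)
    congwin: float = 1000.0           # bytes; starts at 1 MSS
    threshold: float = float("inf")   # "initially very large"
    state: str = "SS"                 # "SS" or "CA"
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # About +1 MSS per RTT once a full window has been ACKed.
            self.congwin += self.mss * (self.mss / self.congwin)

    def on_dup_ack(self) -> None:
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self._on_loss(timeout=False)

    def on_timeout(self) -> None:
        self._on_loss(timeout=True)

    def _on_loss(self, timeout: bool) -> None:
        self.threshold = self.congwin / 2
        if timeout:
            self.congwin = float(self.mss)
            self.state = "SS"
        else:
            # Fast recovery; CongWin does not drop below 1 MSS.
            self.congwin = max(self.threshold, float(self.mss))
            self.state = "CA"
        self.dup_acks = 0


if __name__ == "__main__":
    s = RenoSender(threshold=4000.0)
    for _ in range(4):
        s.on_new_ack()
    print(s.state, s.congwin)
```

Keeping both loss paths in one private method makes the only difference between them (where CongWin restarts and which state follows) easy to see.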

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2 RTT)
  Average throughput: 0.75 W/RTT
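The 0.75 factor is the mean of a window that climbs linearly from W/2 to W. A quick numerical check, assuming one sample per RTT and an even W (the function name is hypothetical):

```python
def average_sawtooth_window(w_max: int) -> float:
    """Mean window over one AIMD cycle climbing from W/2 to W by 1 per RTT."""
    samples = list(range(w_max // 2, w_max + 1))
    return sum(samples) / len(samples)


if __name__ == "__main__":
    w = 100
    print(average_sawtooth_window(w) / w)  # -> 0.75
```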

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

This requires \( L = 2 \cdot 10^{-10} \). Wow!
New versions of TCP for high-speed needed
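Both numbers in the example can be rechecked from the formulas on the slide. The helper names below are hypothetical; the loss rate is obtained by solving the throughput formula for L.

```python
def required_window(rate_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L (MSS in bits)."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


if __name__ == "__main__":
    print(int(required_window(10e9, 0.1, 1500)))       # in-flight segments
    print(f"{required_loss_rate(10e9, 0.1, 1500):.2e}")  # tolerable loss rate
```

The computed loss rate is slightly above 2·10^-10, which the slide rounds down.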

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
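The convergence toward the equal-share line can be seen in a toy simulation. It assumes both sessions have the same RTT and see every loss at the same moment, which is an idealisation; `two_flow_aimd` and its arguments are hypothetical names.

```python
def two_flow_aimd(x1: float, x2: float, capacity: float, rounds: int) -> tuple:
    """Evolve two AIMD throughputs sharing one bottleneck; return the final pair."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss seen by both: multiplicative decrease halves the gap too.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase moves along a slope-1 line; the gap is unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


if __name__ == "__main__":
    a, b = two_flow_aimd(1.0, 41.0, capacity=50.0, rounds=500)
    print(f"{a:.3f} {b:.3f}")
```

Each loss halves the difference between the two rates while each increase leaves it unchanged, so the pair drifts toward equal shares regardless of the starting point.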

Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2
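The two figures in the example follow from assuming every connection gets an equal share of R. A small check with exact fractions (the function name `app_share` is hypothetical):

```python
from fractions import Fraction


def app_share(existing: int, new_connections: int) -> Fraction:
    """Fraction of link rate R won by an app opening new_connections,
    assuming each TCP connection gets an equal share."""
    return Fraction(new_connections, existing + new_connections)


if __name__ == "__main__":
    print(app_share(9, 1))   # -> 1/10
    print(app_share(9, 11))  # -> 11/20
```

With 11 connections the exact share is 11/20 of R, which the slide rounds to R/2.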

TCP Latency Modeling

Notation, assumptions
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Fixed Congestion Window (W)

Two cases (K is the number of windows that cover the object):

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
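Both cases fit in one function if the per-window stall S/R + RTT - WS/R is computed first. This is a sketch: the function name is hypothetical, and K is taken here as O/(WS) rounded up, which is an assumption consistent with "number of windows that cover the object".

```python
import math


def fixed_window_latency(O: float, S: float, R: float, W: int, rtt: float) -> float:
    """Latency for an object of O bits with a fixed window of W segments of S bits."""
    K = math.ceil(O / (W * S))          # windows needed to cover the object
    stall = S / R + rtt - W * S / R     # server wait after each window
    base = 2 * rtt + O / R              # connection setup + request + transmission
    if stall <= 0:
        return base                     # case 1: ACKs return before the window ends
    return base + (K - 1) * stall       # case 2: server stalls after K-1 windows


if __name__ == "__main__":
    # Hypothetical values: S/R = 1 s, RTT = 3 s, object of 8 segments.
    print(fixed_window_latency(8000, 1000, 1000, W=2, rtt=3))  # -> 20.0
    print(fixed_window_latency(8000, 1000, 1000, W=8, rtt=3))  # -> 14.0
```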

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (3)

\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, number of idles for infinite-size object, is similar.
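The slow-start latency model can be evaluated numerically. In this sketch K is found by the first line of the min-definition above, and Q is found by counting the windows whose idle time S/R + RTT - 2^(k-1) S/R is still positive; that counting rule for Q is an assumption based on the idle-time expression, since the slides only say the calculation "is similar". The function name is hypothetical.

```python
def slow_start_latency(O: float, S: float, R: float, rtt: float) -> tuple:
    """Return (K, Q, P, latency) for an object of O bits and segments of S bits."""
    # K: smallest k with (2^k - 1) * S >= O, i.e. k windows cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows (k = 1, 2, ...) followed by a positive idle time.
    Q = 0
    while S / R + rtt - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


if __name__ == "__main__":
    # O/S = 15 segments as in the example; S/R = 1 s and RTT = 2 s are hypothetical.
    print(slow_start_latency(15000, 1000, 1000, 2))  # -> (4, 2, 2, 22.0)
```

With these values the function reproduces K = 4, Q = 2, and P = 2 from the example, and the latency equals 2RTT + O/R plus the two idle periods (2 s and 1 s).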

HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
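The three response-time formulas can be compared side by side. The sketch below deliberately leaves out the "sum of idle times" term, so it only shows the transmission and RTT components; the function and key names are hypothetical.

```python
def http_response_times(O: float, R: float, rtt: float, M: int, X: int) -> dict:
    """Response times (seconds) without the slow-start idle-time term."""
    transmit = (M + 1) * O / R  # same in all three schemes
    return {
        "non_persistent": transmit + (M + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * rtt,
    }


if __name__ == "__main__":
    # Hypothetical values chosen for easy arithmetic: O/R = 1 s, RTT = 0.5 s.
    for scheme, seconds in http_response_times(1000, 1000, 0.5, M=10, X=5).items():
        print(f"{scheme}: {seconds} s")
```

Only the RTT term differs between the schemes, so the gap between them grows with RTT and shrinks in relative terms as the transmission time (M+1)O/R grows.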

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control
instantiation and implementation in the Internet
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                    • Chapter 3 Transport Layer last revised 160305
                                                                    • Chapter 3 outline
                                                                    • Transport services and protocols
                                                                    • Transport vs network layer
                                                                    • Transport-layer protocols
                                                                    • Chapter 3 outline
                                                                    • Multiplexingdemultiplexing
                                                                    • Multiplexingdemultiplexing
                                                                    • How demultiplexing works
                                                                    • Connectionless demultiplexing
                                                                    • Connectionless demux (cont)
                                                                    • Connection-oriented demux
                                                                    • Connection-oriented demux (cont)
                                                                    • Connection-oriented demux: Threaded Web Server
                                                                    • Chapter 3 outline
                                                                    • UDP: User Datagram Protocol [RFC 768]
                                                                    • UDP: more
                                                                    • UDP checksum
                                                                    • Chapter 3 outline
                                                                    • Principles of Reliable data transfer
                                                                    • Reliable data transfer: getting started
                                                                    • Reliable data transfer: getting started
                                                                    • Incremental Improvements
                                                                    • rdt1.0: reliable transfer over a reliable channel
                                                                    • rdt2.0: channel with bit errors
                                                                    • rdt2.0: FSM specification
                                                                    • rdt2.0: operation with no errors
                                                                    • rdt2.0: error scenario
                                                                    • rdt2.0 has a fatal flaw
                                                                    • rdt2.1: sender, handles garbled ACK/NAKs
                                                                    • rdt2.1: receiver, handles garbled ACK/NAKs
                                                                    • rdt2.1: discussion
                                                                    • rdt2.2: a NAK-free protocol
                                                                    • rdt2.2: sender, receiver fragments
                                                                    • rdt3.0: channels with errors and loss
                                                                    • rdt3.0 sender
                                                                    • rdt3.0 in action
                                                                    • rdt3.0 in action
                                                                    • Performance of rdt3.0
                                                                    • rdt3.0: stop-and-wait operation
                                                                    • Pipelined protocols
                                                                    • Pipelined protocols
                                                                    • Pipelining: increased utilization
                                                                    • Go-Back-N
                                                                    • GBN Sender
                                                                    • GBN sender extended FSM
                                                                    • GBN receiver extended FSM
                                                                    • More on receiver
                                                                    • GBN in action
                                                                    • Selective Repeat
                                                                    • Selective repeat: sender, receiver windows
                                                                    • Selective repeat
                                                                    • Selective repeat in action
                                                                    • Selective repeat: dilemma
                                                                    • Chapter 3 outline
                                                                    • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                    • More TCP Details
                                                                    • Even More TCP Details
                                                                    • TCP segment structure
                                                                    • TCP seq. #'s and ACKs
                                                                    • TCP Round Trip Time and Timeout
                                                                    • TCP Round Trip Time and Timeout
                                                                    • Example RTT estimation
                                                                    • TCP Round Trip Time and Timeout
                                                                    • Chapter 3 outline
                                                                    • TCP reliable data transfer
                                                                    • TCP sender events
                                                                    • TCP sender (simplified)
                                                                    • TCP retransmission scenarios
                                                                    • TCP retransmission scenarios (more)
                                                                    • TCP ACK generation [RFC 1122, RFC 2581]
                                                                    • More on Sender Policies
                                                                    • Fast Retransmit
                                                                    • Fast retransmit algorithm
                                                                    • TCP: GBN or Selective Repeat?
                                                                    • Chapter 3 outline
                                                                    • TCP Flow Control
                                                                    • TCP Flow Control
                                                                    • TCP segment structure
                                                                    • TCP Flow control: how it works
                                                                    • Technical Issue
                                                                    • Chapter 3 outline
                                                                    • TCP Connection Management
                                                                    • TCP Connection Management (cont)
                                                                    • TCP Connection Management (cont)
                                                                    • TCP Connection Management (cont)
                                                                    • TCP Connection Management (cont)
                                                                    • A few special cases
                                                                    • Chapter 3 outline
                                                                    • Principles of Congestion Control
                                                                    • Causes/costs of congestion: scenario 1
                                                                    • Causes/costs of congestion: scenario 2
                                                                    • Causes/costs of congestion: scenario 3
                                                                    • Causes/costs of congestion: scenario 3
                                                                    • Approaches towards congestion control
                                                                    • Case study: ATM ABR congestion control
                                                                    • Case study: ATM ABR congestion control
                                                                    • Chapter 3 outline
                                                                    • TCP Congestion Control
                                                                    • TCP AIMD
                                                                    • TCP Slow Start
                                                                    • TCP Slow Start (more)
                                                                    • Summary: TCP Congestion Control
                                                                    • The Big Picture
                                                                    • TCP sender congestion control
                                                                    • TCP throughput
                                                                    • TCP Futures
                                                                    • TCP Fairness
                                                                    • Why is TCP fair?
                                                                    • Fairness (more)
                                                                    • TCP Latency Modeling
                                                                    • Fixed Congestion Window (W)
                                                                    • Fixed congestion window (1)
                                                                    • Fixed congestion window (2)
                                                                    • TCP Latency Modeling: Slow Start (1)
                                                                    • TCP Latency Modeling: Slow Start (2)
                                                                    • TCP Latency Modeling (3)
                                                                    • TCP Latency Modeling (4)
                                                                    • HTTP Modeling
                                                                    • Chapter 3 Summary

rdt2.2: sender, receiver fragments

sender FSM fragment

State "Wait for call 0 from above"
  event:  rdt_send(data)
  action: sndpkt = make_pkt(0, data, checksum)
          udt_send(sndpkt)
  next:   "Wait for ACK 0"

State "Wait for ACK 0"
  event:  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
  action: udt_send(sndpkt)
  next:   "Wait for ACK 0"

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
  action: Λ
  next:   (leaves the fragment)

receiver FSM fragment

Transition into "Wait for 0 from below"
  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
  action: extract(rcvpkt,data)
          deliver_data(data)
          sndpkt = make_pkt(ACK1, chksum)
          udt_send(sndpkt)

State "Wait for 0 from below"
  event:  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
  action: udt_send(sndpkt)
  next:   "Wait for 0 from below"
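The receiver side of a NAK-free protocol can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's reference code: the `Packet` dataclass, its `corrupt` flag (a stand-in for a failed checksum test), and the class name `Rdt22Receiver` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int                # 0 or 1 (alternating bit)
    data: str
    corrupt: bool = False   # assumption: stands in for a failed checksum


class Rdt22Receiver:
    """NAK-free receiver: a repeated ACK plays the role of a NAK."""

    def __init__(self):
        self.expected = 0      # sequence number currently awaited
        self.last_ack = 1      # ACK that is re-sent on a bad or duplicate pkt
        self.delivered = []    # data handed to the upper layer

    def rdt_rcv(self, pkt):
        """Process one packet and return the ACK number sent back."""
        if not pkt.corrupt and pkt.seq == self.expected:
            self.delivered.append(pkt.data)   # extract + deliver_data
            self.last_ack = self.expected
            self.expected ^= 1                # wait for the other seq. #
        # Corrupt or duplicate packet: fall through and repeat the old ACK.
        return self.last_ack
```

A sender that sees the same ACK number twice learns that its latest packet did not get through, which is exactly the information a NAK would have carried.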

rdt3.0: channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq. #, ACKs, retransmissions will be of help, but not enough

Q: how to deal with loss?
  sender waits until certain data or ACK lost, then retransmits
  yuck: drawbacks?

Approach: sender waits "reasonable" amount of time for ACK
  retransmits if no ACK received in this time (retransmissions only triggered by timeouts)
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq. #'s already handles this
    receiver must specify seq. # of pkt being ACKed
  requires countdown timer

rdt3.0 sender

State "Wait for call 0 from above"
  event:  rdt_send(data)
  action: sndpkt = make_pkt(0, data, checksum)
          udt_send(sndpkt)
          start_timer
  next:   "Wait for ACK0"

  event:  rdt_rcv(rcvpkt)
  action: Λ

State "Wait for ACK0"
  event:  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
  action: Λ

  event:  timeout
  action: udt_send(sndpkt)
          start_timer

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
  action: stop_timer
  next:   "Wait for call 1 from above"

State "Wait for call 1 from above"
  event:  rdt_send(data)
  action: sndpkt = make_pkt(1, data, checksum)
          udt_send(sndpkt)
          start_timer
  next:   "Wait for ACK1"

  event:  rdt_rcv(rcvpkt)
  action: Λ

State "Wait for ACK1"
  event:  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
  action: Λ

  event:  timeout
  action: udt_send(sndpkt)
          start_timer

  event:  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
  action: stop_timer
  next:   "Wait for call 0 from above"
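The four-state sender can be folded into a small event-driven class, since the two halves differ only in the sequence bit. The sketch below is one possible rendering, not the slide author's code: the class name `Rdt30Sender`, the boolean `timer_running` (a stand-in for a real countdown timer), and the choice to return `False` when data arrives while an ACK is pending are all assumptions.

```python
class Rdt30Sender:
    """Alternating-bit sender with a retransmission timer (sketch)."""

    def __init__(self, udt_send):
        self.udt_send = udt_send    # callable that puts a pkt on the channel
        self.seq = 0                # seq. # of the next new packet
        self.sndpkt = None          # pkt awaiting its ACK, or None when idle
        self.timer_running = False  # assumption: models start/stop_timer

    def rdt_send(self, data):
        """Call from above. Returns False while an ACK is still pending."""
        if self.sndpkt is not None:
            return False
        self.sndpkt = (self.seq, data)
        self.udt_send(self.sndpkt)
        self.timer_running = True           # start_timer
        return True

    def timeout(self):
        """Timer expired: retransmit the pending pkt and restart the timer."""
        if self.sndpkt is not None:
            self.udt_send(self.sndpkt)
            self.timer_running = True

    def rdt_rcv(self, ack, corrupt=False):
        """ACK arrived from below."""
        if self.sndpkt is None or corrupt or ack != self.seq:
            return                          # Λ: do nothing, the timer recovers
        self.timer_running = False          # stop_timer
        self.sndpkt = None
        self.seq ^= 1                       # switch to the other half of the FSM
```

Note that a corrupt or wrong-numbered ACK triggers no retransmission by itself; only `timeout` does, matching the "retransmissions only triggered by timeouts" rule.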

rdt3.0 in action

[Figures not reproduced in this transcript]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

$$T_{\text{transmit}} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{\text{sender}} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U_sender: utilization, the fraction of time sender is busy sending
1KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Timing diagram, sender and receiver]
  sender:   first packet bit transmitted, t = 0
  sender:   last packet bit transmitted, t = L / R
  receiver: first packet bit arrives
  receiver: last packet bit arrives, send ACK
  sender:   ACK arrives, send next packet, t = RTT + L / R

$$U_{\text{sender}} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$
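The arithmetic can be checked with a few lines of Python. The numbers are the ones from the example (1 KB = 8000 bits, 1 Gbps, RTT = 30 ms); the function and variable names are illustrative assumptions.

```python
def stop_and_wait_utilization(L_bits, R_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = L_bits / R_bps
    return t_transmit / (rtt_s + t_transmit)


L_BITS = 8000      # 1 KB packet
R_BPS = 1e9        # 1 Gbps link
RTT_S = 0.030      # 2 x 15 ms propagation delay

u = stop_and_wait_utilization(L_BITS, R_BPS, RTT_S)
# One packet is delivered per (RTT + L/R) seconds.
throughput_bps = L_BITS / (RTT_S + L_BITS / R_BPS)

print(f"U_sender   = {u:.5f}")                               # 0.00027
print(f"throughput = {throughput_bps / 8 / 1000:.1f} kB/s")  # 33.3 kB/s
```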

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  • Go-Back-N protocols
  • selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Timing diagram, sender and receiver]
  sender:   first packet bit transmitted, t = 0
  sender:   last bit transmitted, t = L / R
  receiver: first packet bit arrives
  receiver: last packet bit arrives, send ACK
  receiver: last bit of 2nd packet arrives, send ACK
  receiver: last bit of 3rd packet arrives, send ACK
  sender:   ACK arrives, send next packet, t = RTT + L / R

$$U_{\text{sender}} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
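The same formula generalizes to any number of packets in flight. The sketch below also asks how many packets it would take to keep the sender busy all the time; that question is not posed on the slide and is included only as a natural follow-up. Function names are assumptions, and exact rational arithmetic (`fractions.Fraction`) is used so the rounding-up step is not disturbed by floating-point error.

```python
import math
from fractions import Fraction


def pipelined_utilization(n_pkts, L_bits, R_bps, rtt_us):
    """Sender utilization with n_pkts sent back-to-back per RTT + L/R."""
    t_us = Fraction(L_bits * 10**6, R_bps)   # transmission time L/R in microsec
    return float(min(Fraction(1), n_pkts * t_us / (rtt_us + t_us)))


def window_to_fill_pipe(L_bits, R_bps, rtt_us):
    """Smallest number of in-flight pkts that keeps the sender always busy."""
    t_us = Fraction(L_bits * 10**6, R_bps)
    return math.ceil((rtt_us + t_us) / t_us)


# Slide example: 8000-bit packets, 1 Gbps link, RTT = 30 ms = 30000 microsec.
print(pipelined_utilization(1, 8000, 10**9, 30000))   # stop-and-wait case
print(pipelined_utilization(3, 8000, 10**9, 30000))   # three pkts in flight
print(window_to_fill_pipe(8000, 10**9, 30000))        # 3751
```

With these numbers a window of three is still far from filling the link, which is why the bandwidth-delay product matters when choosing a window size.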

Go-Back-N

Sender:
  k-bit seq. # in pkt header
  "window" of up to N consecutive unACKed pkts allowed

  ACK(n): ACKs all pkts up to and including seq. # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)

  Only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq. # pkts in window
  Called a sliding-window protocol

GBN Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer

Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend

GBN: sender extended FSM

Single state: "Wait"

Initial transition
  action: base = 1
          nextseqnum = 1

event: rdt_send(data)
  if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
          start_timer
      nextseqnum++
  }
  else
      refuse_data(data)

event: timeout
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])

event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
      stop_timer
  else
      start_timer

event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  Λ
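A compact Python rendering of the sender FSM may help make the window bookkeeping concrete. It is a sketch under several assumptions that are not in the slides: sequence numbers are unbounded integers rather than k-bit values, the timer is modeled as a boolean, a stale ACK is never allowed to move `base` backwards, and the class name `GBNSender` is hypothetical.

```python
class GBNSender:
    """Go-Back-N sender with a window of N unACKed packets (sketch)."""

    def __init__(self, N, udt_send):
        self.N = N
        self.udt_send = udt_send   # callable that puts a pkt on the channel
        self.base = 1              # oldest unACKed seq. #
        self.nextseqnum = 1        # next seq. # to be used
        self.sndpkt = {}           # seq. # -> buffered copy of unACKed pkt
        self.timer_running = False

    def rdt_send(self, data):
        """Call from above. Returns False (refuse_data) if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True      # start_timer for the oldest pkt
        self.nextseqnum += 1
        return True

    def timeout(self):
        """Resend every packet that was sent but not yet ACKed."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])

    def rdt_rcv(self, acknum, corrupt=False):
        """Cumulative ACK: everything up to and including acknum arrived."""
        if corrupt:
            return                         # Λ
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)     # free buffered copies
        self.base = max(self.base, acknum + 1)
        # stop_timer if nothing is outstanding, otherwise (re)start it
        self.timer_running = self.base != self.nextseqnum
```

Because a single timer covers the whole window, one timeout resends up to N packets, which is the cost discussed under "GBN in action".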

GBN: receiver extended FSM

Single state: "Wait"

Initial transition
  action: expectedseqnum = 1
          sndpkt = make_pkt(0, ACK, chksum)

event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

event: default
  udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq. #
  may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq. #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  need only remember expectedseqnum
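Since the receiver keeps only `expectedseqnum`, it fits in a few lines. The following is a minimal sketch with assumed names (`GBNReceiver`, a `corrupt` flag standing in for the checksum test) and unbounded integer sequence numbers.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, cumulative ACKs only (sketch)."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []     # data handed to the upper layer, in order

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number sent back."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # In every other case the packet is discarded and the ACK for the
        # highest in-order seq. # is repeated (0 before anything arrived).
        return self.expectedseqnum - 1


r = GBNReceiver()
acks = [r.rdt_rcv(seq, data)
        for seq, data in [(1, "a"), (3, "c"), (4, "d"), (2, "b"), (3, "c")]]
print(acks)          # [1, 1, 1, 2, 3]
print(r.delivered)   # ['a', 'b', 'c']
```

Packets 3 and 4 arrive early, are thrown away, and each produces a duplicate ACK 1; packet 3 is only accepted when it is sent again after packet 2.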

                                                                      GBN in action

                                                                      GBN is easy to code but might have performance problems

                                                                      In particular if many packets are in pipeline at one time (bandwidth-delay product large) then one error can force retransmission of huge amounts of data

                                                                      Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets


                                                                      Selective Repeat

                                                                      receiver individually acknowledges all correctly received pkts

                                                                      buffers pkts as needed for eventual in-order delivery to upper layer

                                                                      sender only resends pkts for which ACK not received

                                                                      sender timer for each unACKed pkt
                                                                        Compare to GBN, which only had a timer for the base packet

                                                                      sender window
                                                                        N consecutive seq. #'s
                                                                        again limits seq. #'s of sent, unACKed pkts
                                                                        Important: Window size < seq. # range

Selective repeat: sender, receiver windows

[Figure not reproduced in this transcript]

Selective repeat

receiver

pkt n in [rcvbase, rcvbase+N-1]
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]
  ACK(n) (note: this is a reACK)

otherwise:
  ignore

sender

data from above:
  if next available seq. # in window, send pkt

timeout(n):
  resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n is smallest unACKed pkt, advance window base to next unACKed seq. #
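The receiver rules translate almost line for line into code. This sketch assumes unbounded integer sequence numbers (so no wrap-around) and uses hypothetical names (`SRReceiver`, `delivered`); it is meant to illustrate the three cases, not to be a complete implementation.

```python
class SRReceiver:
    """Selective-repeat receiver with a window of N packets (sketch)."""

    def __init__(self, N):
        self.N = N
        self.rcvbase = 0        # lowest seq. # not yet received
        self.buffer = {}        # out-of-order pkts waiting for delivery
        self.delivered = []     # data handed to the upper layer, in order

    def rdt_rcv(self, n, data):
        """Return the ACK number sent, or None if the packet is ignored."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, if any.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            return n            # already delivered: reACK so the sender can advance
        return None             # outside both ranges: ignore
```

The reACK case matters: if the original ACK was lost, the sender's window cannot move until the retransmitted packet is acknowledged again.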

Selective repeat in action

[Figure not reproduced in this transcript]

Selective repeat: dilemma

Example:
  seq. #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq. # size and window size?
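One way to explore the question is to test, for a given sequence-number space and window size, whether the worst case is ambiguous: the receiver has accepted a full window and moved on, but every ACK was lost, so the sender retransmits the old window. If an old number also falls in the receiver's new window, the receiver cannot tell duplicate from new data. The sketch below checks exactly that; the function names are assumptions, and the result it prints agrees with the usual rule that the window must be at most half the sequence-number space.

```python
def ambiguous(seq_space, window):
    """True if a retransmitted old pkt can be mistaken for a new one."""
    # Numbers the sender retransmits when all ACKs for the first window are lost.
    retransmitted = {n % seq_space for n in range(window)}
    # Numbers the receiver now accepts as new data.
    expected_new = {n % seq_space for n in range(window, 2 * window)}
    return bool(retransmitted & expected_new)


def max_safe_window(seq_space):
    """Largest window size with no ambiguity (0 if none exists)."""
    return max((w for w in range(1, seq_space + 1)
                if not ambiguous(seq_space, w)), default=0)


print(ambiguous(4, 3))      # True: the example on the slide
print(ambiguous(4, 2))      # False
print(max_safe_window(4))   # 2
print(max_safe_window(8))   # 4
```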

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) initializes sender, receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the other application reads data through its socket door]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way Handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
  Server responds with second special TCP segment (again, no payload).
  Client responds with third special segment. This can contain payload.


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables
  (iii) a socket connection to the process

TCP exists only in the two end machines. No buffers or variables are allocated to the connection in any of the network elements between the client and server.


TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flags U A P R S F, Receive window
  checksum, Urg data pointer
  Options (variable length)
  application data (variable length)

Field notes:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
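The layout above can be made concrete by unpacking the fixed 20-byte part of a header. This is a minimal sketch using Python's standard `struct` module; the function name and the returned dictionary keys are illustrative choices, and the code does not verify the checksum or interpret options.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / not used / flags" word.
FLAG_BITS = (("FIN", 0x01), ("SYN", 0x02), ("RST", 0x04),
             ("PSH", 0x08), ("ACK", 0x10), ("URG", 0x20))


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options and data."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    # Network byte order: two 16-bit ports, two 32-bit numbers, four 16-bit words.
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4  # head len is counted in 32-bit words
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": {name: bool(off_flags & bit) for name, bit in FLAG_BITS},
        "rcv_window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }
```

The single format string mirrors the figure row by row, which is why the flags and header length have to be separated by shifting and masking afterwards.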


TCP seq. #'s and ACKs

Seq. #'s: byte stream "number" of first byte in segment's data
ACKs: seq # of next byte expected from other side; cumulative ACK
Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; up to implementer

[Figure: simple telnet scenario, Host A and Host B timelines]
  A -> B: Seq=42, ACK=79, data = 'C' (user types 'C')
  B -> A: Seq=79, ACK=43, data = 'C' (host ACKs receipt of 'C', echoes back 'C')
  A -> B: Seq=43, ACK=80 (host ACKs receipt of echoed 'C')
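The numbers in the telnet scenario follow directly from the rule that an ACK carries the sequence number of the next byte expected. The sketch below replays the three segments; the function names and the tuple layout are assumptions made for illustration.

```python
def ack_for(seq: int, data: bytes) -> int:
    """ACK value a receiver returns after getting `data` starting at `seq`."""
    return seq + len(data)


def telnet_exchange(a_seq: int = 42, b_seq: int = 79, char: bytes = b"C"):
    """Return the three segments as (direction, seq, ack, data) tuples."""
    trace = [("A->B", a_seq, b_seq, char)]          # user types the character
    b_ack = ack_for(a_seq, char)                    # B now expects A's next byte
    trace.append(("B->A", b_seq, b_ack, char))      # B echoes the character back
    a_ack = ack_for(b_seq, char)                    # A now expects B's next byte
    trace.append(("A->B", a_seq + len(char), a_ack, b""))  # pure ACK, no data
    return trace


if __name__ == "__main__":
    for segment in telnet_exchange():
        print(segment)
```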


TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT, but RTT varies
  too short: premature timeout, unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother": average several recent measurements, not just current SampleRTT


                                                                      TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) · EstimatedRTT + α · SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
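A short sketch of the update rule above, plus the weight each past sample ends up with, may make the "decreases exponentially fast" remark concrete. The function names are hypothetical and times are in arbitrary units.

```python
def update_estimated_rtt(estimated: float, sample: float, alpha: float = 0.125) -> float:
    """One EWMA step: blend the old estimate with the newest SampleRTT."""
    return (1 - alpha) * estimated + alpha * sample


def ewma_weights(n: int, alpha: float = 0.125) -> list:
    """Weight of the newest sample, the one before it, and so on (n entries)."""
    # A sample that is k updates old has been multiplied by (1 - alpha) k times.
    return [alpha * (1 - alpha) ** k for k in range(n)]


if __name__ == "__main__":
    print(update_estimated_rtt(100.0, 180.0))
    print(ewma_weights(4))
```

For example, an estimate of 100 ms followed by a 180 ms sample moves only to 110 ms, so a single outlier shifts the estimate by one eighth of its deviation.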


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT and EstimatedRTT]


                                                                      TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
  large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) · DevRTT + β · |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 · DevRTT
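The two estimators and the timeout formula can be combined into one small object. This is a sketch under stated assumptions: the class name is hypothetical, DevRTT is updated before EstimatedRTT (so the deviation is measured against the previous estimate), and the initial DevRTT is set to half the first sample; the slides do not specify either choice.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumption: initial deviation guess

    def update(self, sample_rtt: float) -> float:
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation is computed against the old estimate first.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


if __name__ == "__main__":
    est = RttEstimator(100.0)
    for sample in (120.0, 95.0, 110.0):
        print(round(est.update(sample), 2))
```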


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative ACKs
TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate ACKs

Initially consider a simplified TCP sender:
  ignore duplicate ACKs
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unACKed segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unACKed segments:
    update what is known to be ACKed
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
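The simplified sender can be sketched as a small event-driven class. This is an illustration, not a real TCP implementation: the class and attribute names are hypothetical, the timer is reduced to a boolean, "pass segment to IP" is modelled by appending to a list, and sequence-number wraparound is ignored.

```python
class SimplifiedTcpSender:
    """Event handlers for the simplified sender (no dup ACKs, no flow control)."""

    def __init__(self, initial_seq_num: int):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq -> data, sent but not yet acknowledged
        self.timer_running = False  # stand-in for the single retransmission timer
        self.sent = []              # (seq, data) pairs handed to IP, in order

    def on_app_data(self, data: bytes) -> None:
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self) -> None:
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)     # oldest not-yet-acknowledged segment
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True   # restart timer

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            self.send_base = y
            # A cumulative ACK covers every segment that ends at or before y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
```

Calling `on_app_data` with 8 and then 20 bytes from an initial sequence number of 92, followed by `on_ack(120)`, leaves `send_base` at 120 with nothing outstanding, which corresponds to a single cumulative ACK covering both segments.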


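TCP retransmission scenarios

[Figure: two Host A / Host B timelines]

Lost ACK scenario:
  A sends Seq=92, 8 bytes data
  B replies ACK=100, which is lost (X)
  Seq=92 timeout; A retransmits Seq=92, 8 bytes data
  B replies ACK=100; SendBase = 100

Premature timeout:
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100 and ACK=120
  Seq=92 timeout fires before the ACKs arrive; A retransmits Seq=92, 8 bytes data
  ACKs arrive: SendBase = 100, then SendBase = 120
  B answers the retransmission with ACK=120; SendBase = 120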


TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline]

Cumulative ACK scenario:
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B's ACK=100 is lost (X)
  B's ACK=120 arrives before the timeout; SendBase = 120, so nothing is retransmitted


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
   -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Arrival of in-order segment with expected seq #; one other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments.

3. Arrival of out-of-order segment with higher-than-expected seq #; gap detected
   -> Immediately send duplicate ACK, indicating seq # of next expected byte.

4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap.
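The four rows above amount to a small decision function. The sketch below encodes them directly; the function name, parameters and returned strings are hypothetical, and a segment that starts below the expected sequence number is reported as outside the table rather than guessed at.

```python
def receiver_action(seg_seq: int, expected_seq: int,
                    ack_pending: bool = False, gap_exists: bool = False) -> str:
    """Pick the receiver action for an arriving segment, per the table rows."""
    if seg_seq > expected_seq:
        # Row 3: out-of-order arrival, a gap is detected.
        return "send duplicate ACK immediately"
    if seg_seq == expected_seq:
        if gap_exists:
            # Row 4: segment starts at the lower end of an existing gap.
            return "send ACK immediately"
        if ack_pending:
            # Row 2: another in-order segment is still waiting for its ACK.
            return "send single cumulative ACK immediately"
        # Row 1: everything before this segment is already ACKed.
        return "delayed ACK: wait up to 500 ms"
    return "not covered by the table"


if __name__ == "__main__":
    print(receiver_action(100, 100))
    print(receiver_action(120, 100))
```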


More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
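The doubling rule produces a geometric sequence of timer values. A minimal sketch, with a hypothetical function name and with the base value standing in for whatever EstimatedRTT + 4 · DevRTT currently gives:

```python
def backoff_intervals(base_timeout: float, consecutive_timeouts: int) -> list:
    """Timer values used for the first send and each consecutive retransmission."""
    # Each timeout doubles the interval used for the next retransmission.
    return [base_timeout * 2 ** k for k in range(consecutive_timeouts + 1)]


if __name__ == "__main__":
    print(backoff_intervals(1.0, 3))
```

After three consecutive timeouts starting from 1 second the sender is waiting 8 seconds, which is why the slide calls this a limited form of congestion control: a sender facing repeated loss retransmits less and less often.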


Fast Retransmit

Time-out period often relatively long: long delay before resending lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
  fast retransmit: resend segment before timer expires


Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
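The ACK handler above can be sketched with a single duplicate-ACK counter. The class name and the `retransmitted` list are illustrative assumptions; timers and the actual resend are left out so that only the counting logic is shown.

```python
class FastRetransmitSender:
    """Counts duplicate ACKs and records when a fast retransmit would fire."""

    def __init__(self, send_base: int):
        self.send_base = send_base
        self.dup_acks = 0
        self.retransmitted = []  # sequence numbers resent via fast retransmit

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            # New data acknowledged: advance the window, reset the counter.
            self.send_base = y
            self.dup_acks = 0
        else:
            # Duplicate ACK for data that was already acknowledged.
            self.dup_acks += 1
            if self.dup_acks == 3:
                self.retransmitted.append(y)  # resend segment starting at y


if __name__ == "__main__":
    sender = FastRetransmitSender(92)
    for ack in (100, 100, 100, 100):
        sender.on_ack(ack)
    print(sender.retransmitted)
```

The first ACK=100 is new; the retransmission is triggered by the third duplicate, that is, the fourth ACK carrying the same value.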


TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN.

Many TCP implementations buffer received out-of-order segments and then ACK them all after filling in the range. This looks a lot like Selective Repeat.

TCP is a hybrid.


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
speed-matching service: matching the send rate to the receiving app's drain rate


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow: guarantees receive buffer doesn't overflow
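Both sides of the rule above fit in a few lines. In this sketch the receiver-side function uses the slide's variable names; the sender-side names `last_byte_sent` and `last_byte_acked` are assumptions introduced here to express "unACKed data", and the function names are hypothetical.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room the receiver advertises back to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent: int, last_byte_acked: int,
                    advertised_window: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= advertised_window


if __name__ == "__main__":
    window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
    print(window, sender_may_send(5000, 4000, window, 1000))
```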


Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.


                                                                      Note on UDP

                                                                      UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables: seq #s; buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data


                                                                      TCP Connection Management (cont)

                                                                      Step 3 client end system receives SYNACK replies with SYN=0 and server_isn+1

                                                                      Allocate buffersAllocates buffersCan include application data

                                                                      SYN=0 signals that connection establishedserver_isn+1 signals that is synchronized

                                                                      clientConnection request (SYN=1 seq=client_isn)

                                                                      server

                                                                      Connection granted (SYN=1 server_isn

                                                                      ACK (SYN=0 seq=client_isn+1)

                                                                      ack=client_isn+1)

                                                                      ack=server_isn+1
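The sequence and acknowledgement numbers in the three handshake segments can be traced with a short Python sketch. The function name, the dictionary field names, and the example ISN values are illustrative assumptions, not part of the slides. Values are reduced modulo 2^32 because the TCP sequence number field is 32 bits wide.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit values


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake control segments as dicts (illustrative model)."""
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1.
    synack = {"SYN": 1, "seq": server_isn,
              "ack": (syn["seq"] + 1) % SEQ_SPACE}
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1.
    ack = {"SYN": 0, "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (synack["seq"] + 1) % SEQ_SPACE}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```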

TCP Connection Management (cont)

Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: teardown timeline. Client sends FIN; server replies with ACK, then sends its own FIN; client replies with ACK and enters timed wait before it is closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
- enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
- closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: teardown timeline. Both sides are labeled "closing"; the client passes through timed wait before "closed"; the server is "closed" once it receives the final ACK.]
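The client side of the close can be sketched as a tiny state trace in Python. This is a simplified model under stated assumptions: the state labels are taken from the figure ("closing", "timed wait", "closed"), while the function name and the event strings are hypothetical.

```python
def client_teardown(events):
    """Trace the client after it has sent its FIN (simplified, illustrative)."""
    state = "closing"   # the client has already sent its FIN (Step 1)
    sent = ["FIN"]
    for event in events:
        if event == "FIN" and state in ("closing", "timed wait"):
            # Step 3: ACK the server's FIN. A repeated FIN (the earlier ACK
            # was lost) is ACKed again while the client is in timed wait.
            sent.append("ACK")
            state = "timed wait"
        elif event == "timer expired" and state == "timed wait":
            state = "closed"
        # The server's ACK of our FIN needs no reply in this model.
    return state, sent


print(client_teardown(["ACK", "FIN", "FIN", "timer expired"]))
```

The second "FIN" in the usage line models a retransmitted FIN, which the client answers with another ACK instead of ignoring it.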

TCP Connection Management (cont)

[Figures: example TCP server lifecycle; example TCP client lifecycle]


                                                                      A few special cases

                                                                      Have not discussed what happens if both client and server decide to close down connection at same time

                                                                      It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for network to handle"
- different from flow control
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem

Causes/costs of congestion: scenario 1

- two senders, two receivers
- one router, infinite buffers
- no retransmission
- send rate: 0 to C/2

- large delays when congested
- maximum achievable throughput

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet

Here $\lambda_{in}$ is the original data rate, $\lambda'_{in}$ the offered load (original data plus retransmissions), and $\lambda_{out}$ the goodput.

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
- (b) and (c): more work (retrans) for given "goodput"
- (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three throughput plots, panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

- four senders
- multihop paths
- timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3 (cont)

Another "cost" of congestion:
- when a packet is dropped, any upstream transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
- "elastic service"
- if sender's path "underloaded": sender should use available bandwidth
- if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
- EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
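The two feedback rules on this slide can be sketched in Python. This is a simplified model, assuming each switch lowers ER to its own supportable rate whenever that rate is smaller; the function names and the rate values are hypothetical.

```python
def er_after_path(initial_er, switch_rates):
    """ER value in the RM cell after it has crossed every switch on the path."""
    er = initial_er
    for supportable_rate in switch_rates:
        # A switch may lower ER but never raises it, so the sender ends up
        # with the minimum supportable rate on the path.
        er = min(er, supportable_rate)
    return er


def returned_ci(ci_set_by_switches, preceding_data_cell_efci):
    """CI bit in the RM cell as the destination returns it to the source."""
    # If the data cell preceding the RM cell had EFCI=1, the destination
    # sets CI=1; otherwise the bit set by the switches is left intact.
    return 1 if preceding_data_cell_efci == 1 else ci_set_by_switches


print(er_after_path(100, [80, 50, 90]))
print(returned_ci(0, 1))
```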


TCP Congestion Control

- end-end control (no network assistance)
- transmission rate limited by congestion window size, CongWin, over segments
- CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin spanning the segments in flight]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec
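As a quick numeric check of the throughput formula, here is a minimal Python sketch. The function name and the input values (w = 1, MSS = 500 bytes, RTT = 200 msec) are chosen for illustration.

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_sec):
    """throughput = w * MSS / RTT, in bytes per second."""
    return w * mss_bytes / rtt_sec


rate = throughput_bytes_per_sec(w=1, mss_bytes=500, rtt_sec=0.2)
# Convert bytes/sec to kilobits/sec for comparison with link rates.
print(rate, "bytes/sec =", rate * 8 / 1000, "kbps")
```

With these inputs the window carries one 500-byte segment per 200 msec, which works out to 2500 bytes/sec, or 20 kbps.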

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 and then grows exponentially
- conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (ticks at 8, 16, 24 Kbytes) vs. time for a long-lived TCP connection]
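The AIMD rule can be sketched as a per-RTT simulation in Python. This is an illustrative model, not an implementation of TCP: the window is counted in units of MSS, and the RTT indices at which a loss event occurs are supplied by hand (the `loss_rtts` parameter is a hypothetical input).

```python
def aimd_trace(num_rtts, start_window, loss_rtts):
    """Window size (in MSS) at the start of each RTT under AIMD."""
    window = start_window
    trace = []
    for rtt in range(num_rtts):
        trace.append(window)
        if rtt in loss_rtts:
            # Multiplicative decrease: halve, but never below 1 MSS.
            window = max(1, window // 2)
        else:
            # Additive increase: one more MSS per RTT.
            window += 1
    return trace


print(aimd_trace(num_rtts=12, start_window=8, loss_rtts={8}))
```

The window climbs linearly from 8 to 16, is cut back to 8 by the loss, and starts climbing again.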

TCP Slow Start

When connection begins, CongWin = 1 MSS
- Example: MSS = 500 bytes & RTT = 200 msec
- initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
- desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
- double CongWin every RTT
- done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. Host A sends one segment in the first RTT, two segments in the second, four segments in the third.]
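A short Python sketch shows why incrementing CongWin once per ACK doubles it every RTT. It assumes no losses and one ACK per segment sent; the function name is illustrative and the window is counted in units of MSS.

```python
def slow_start_windows(num_rtts):
    """CongWin (in MSS) at the start of each RTT during loss-free slow start."""
    congwin = 1
    windows = []
    for _ in range(num_rtts):
        windows.append(congwin)
        acks_this_rtt = congwin      # one ACK returns per segment sent
        for _ in range(acks_this_rtt):
            congwin += 1             # CongWin grows by 1 MSS per ACK
    return windows


print(slow_start_windows(4))
```

Each RTT adds as many MSS as there were segments in flight, so the window goes 1, 2, 4, 8.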

So far
- Slow-Start ramps up exponentially
- followed by AIMD sawtooth pattern

Reality (TCP Reno)
- introduce new variable: threshold
- threshold initially very large
- Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
- two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss event the same, always going to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

- When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
- When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                      The Big Picture

TCP sender congestion control

Event | State | TCP Sender Action | Commentary
----- | ----- | ----------------- | ----------
ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start
Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
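The event/action table can be sketched as a small Python state machine. This is a minimal sketch of the table's rules only, not a TCP implementation. The class and method names are hypothetical. Two details are assumptions not stated in the table: the duplicate-ACK counter is reset when new data is acknowledged, and the third duplicate ACK is what triggers the loss event.

```python
class RenoSender:
    """Congestion-control state of a TCP Reno sender, following the table."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.congwin = mss          # connection starts with CongWin = 1 MSS
        self.threshold = threshold
        self.state = "SS"           # "SS" = Slow Start, "CA" = Congestion Avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0           # assumption: counter resets on new data
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.congwin += self.mss * (self.mss / self.congwin)

    def on_duplicate_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            # CongWin will not drop below 1 MSS.
            self.congwin = max(self.threshold, self.mss)
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and re-enter slow start."""
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"


sender = RenoSender(mss=1.0, threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.congwin)
```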

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
- Ignore slow start
- Let W be the window size when loss occurs
- When window is W, throughput is W/RTT
- Just after loss, window drops to W/2, throughput to W/(2 RTT)
- Average throughput: 0.75 W/RTT
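The 0.75 factor is the midpoint of a window that climbs linearly from W/2 to W. A short Python check, assuming the window grows by one segment per RTT and that W is even (the function name is illustrative):

```python
def average_window(w_max):
    """Average window over one sawtooth cycle, from w_max/2 up to w_max."""
    samples = range(w_max // 2, w_max + 1)   # one sample per RTT
    return sum(samples) / len(samples)


W = 16
print(average_window(W) / W)
```

Dividing the average window by RTT gives the average throughput of 0.75 W/RTT.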

TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

  $$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

- This gives $L = 2 \cdot 10^{-10}$. Wow
- New versions of TCP for high-speed needed
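Both numbers in this example can be reproduced by inverting the two throughput formulas. A Python sketch, with hypothetical function names and MSS converted from bytes to bits:

```python
def required_window(throughput_bps, rtt_sec, mss_bytes):
    """Segments in flight needed so that W * MSS / RTT equals the target rate."""
    return throughput_bps * rtt_sec / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_sec, mss_bytes):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to solve for L."""
    return (1.22 * mss_bytes * 8 / (rtt_sec * throughput_bps)) ** 2


W = required_window(10e9, 0.1, 1500)
L = required_loss_rate(10e9, 0.1, 1500)
print(int(W), L)
```

The loss rate comes out near 2.1e-10, which the slide rounds to 2·10^-10.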

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
- additive increase gives slope of 1, as throughput increases
- multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput vs. Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
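The convergence argument can be checked with a small Python simulation of two AIMD sessions. It is an idealized sketch under stated assumptions: both sessions have the same RTT, both see a loss whenever their combined rate exceeds R, and the function name and starting rates are hypothetical.

```python
def two_session_aimd(x1, x2, capacity, num_rounds):
    """Throughputs of two AIMD sessions sharing one bottleneck link."""
    for _ in range(num_rounds):
        if x1 + x2 > capacity:
            # Loss: both halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow by 1, so the gap is unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = two_session_aimd(1.0, 80.0, capacity=100.0, num_rounds=500)
print(abs(a - b))
```

Additive increase leaves the difference between the two rates unchanged and every loss halves it, so the sessions drift toward the equal-share line.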

Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- nothing prevents app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
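The parallel-connection arithmetic can be written out in Python. The sketch assumes every TCP connection on the link gets an equal share of R, and the function name is illustrative.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))
print(app_share(9, 11))
```

With 9 existing connections, one new connection gets 1/10 of R, while 11 new connections get 11/20 of R, slightly more than half.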

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases (K is the number of windows that cover the object):

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
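Both fixed-window cases fit in one Python function if the per-window stall is clamped at zero. This is a sketch under one assumption not spelled out on these slides: K is computed as O/(WS) rounded up, meaning the number of windows that cover the object. The function name and the numeric inputs are hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec).
    """
    K = math.ceil(O / (W * S))            # assumed: windows covering the object
    stall = S / R + RTT - W * S / R       # server idle time after each window
    # Case 1 (stall < 0): the ACK returns before the window is sent, no idling.
    # Case 2 (stall > 0): the server idles after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * max(0.0, stall)


# S/R = 1 sec and RTT = 2 sec, so the boundary between the cases is W = 3.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))  # case 1
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))  # case 2
```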

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. The client initiates the TCP connection and requests the object (one RTT each); the server sends a first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R), completing transmission, and the object is delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit the object
• time the server idles due to slow start

Server idles P = min{K-1, Q} times.
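The example can be reproduced by listing the idle time after each window. This is a sketch under stated assumptions: the slide does not give S/R or RTT, so S/R = 1 and RTT = 2 (arbitrary time units) are hypothetical values chosen so that Q = 2, and the function name is illustrative.

```python
def idle_times(seg_time, rtt, K):
    """Idle time after each of the first K-1 windows, clipped at zero.

    seg_time is S/R. The k-th window takes 2^(k-1) * seg_time to transmit, and
    the first ACK for it returns seg_time + rtt after the window starts.
    """
    return [max(0.0, seg_time + rtt - 2 ** (k - 1) * seg_time) for k in range(1, K)]

# Hypothetical values: S/R = 1, RTT = 2, and K = 4 windows as in the example.
idles = idle_times(seg_time=1.0, rtt=2.0, K=4)
P = sum(1 for t in idles if t > 0)   # number of times the server actually idles
print(idles, P)
```

Under these assumed values the server idles after the first and second windows only, which matches P = 2 in the example.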

TCP Latency Modeling (3)

S/R + RTT = time from when the server starts to send a segment until the server receives the acknowledgement

2^{k-1} S/R = time to transmit the kth window

[S/R + RTT - 2^{k-1} S/R]^+ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server slow-start timeline. Initiate TCP connection, request object, then windows of S/R, 2S/R, 4S/R and 8S/R until transmission completes and the object is delivered. Axes: time at client, time at server.]
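A quick numerical check of the last step of this derivation: summing the per-window idle times should agree with the closed form. The function names and the sample values are hypothetical, and P is taken as given.

```python
import math

def latency_by_summation(O, S, R, RTT, P):
    """delay = O/R + 2 RTT + sum over k = 1..P of [S/R + RTT - 2^(k-1) S/R]."""
    seg = S / R
    idle = sum(seg + RTT - 2 ** (k - 1) * seg for k in range(1, P + 1))
    return O / R + 2 * RTT + idle

def latency_closed_form(O, S, R, RTT, P):
    """delay = O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R."""
    seg = S / R
    return O / R + 2 * RTT + P * (RTT + seg) - (2 ** P - 1) * seg

# Hypothetical values: 15 segments of 1000 bits, 1000 bps link, RTT = 2 sec.
for P in range(0, 5):
    a = latency_by_summation(15000, 1000, 1000, 2.0, P)
    b = latency_closed_form(15000, 1000, 1000, 2.0, P)
    print(P, a, b, math.isclose(a, b))
```

The two agree for any P because the powers of two sum to 2^P - 1. The clipping at zero is not needed inside the sum, since the first P idle times are positive by the definition of P.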

TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

The calculation of Q, the number of idles for an infinite-size object, is similar.
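The closed form for K can be compared with a direct count of slow-start windows. The sketch below uses hypothetical function names and assumes the object is a whole number of segments, so that O/S is exact.

```python
import math

def k_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))

def k_by_counting(O, S):
    """Add windows of S, 2S, 4S, ... bits until they cover the O-bit object."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S
        k += 1
    return k

print(k_closed_form(15000, 1000), k_by_counting(15000, 1000))
print(k_closed_form(16000, 1000), k_by_counting(16000, 1000))
```

For O/S = 15 both give 4 windows (1 + 2 + 4 + 8 segments), as in the earlier example; one extra segment pushes K to 5.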

HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive the base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for the base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
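The three response-time formulas can be compared side by side. This sketch deliberately leaves out the "sum of idle times" term, which depends on the slow-start behaviour of each scheme, so it gives only the transmission and round-trip parts. The function name is hypothetical, and 5 Kbytes is assumed to mean 40,000 bits.

```python
def http_response_times_without_idle(O, R, RTT, M, X):
    """Response time in seconds for each scheme, excluding slow-start idle times."""
    transmit = (M + 1) * O / R          # (M+1) objects of O bits each
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }

# Assumed values: O = 5 Kbytes = 40,000 bits, RTT = 100 msec, M = 10, X = 5, R = 1 Mbps.
for scheme, t in http_response_times_without_idle(40_000, 1e6, 0.1, 10, 5).items():
    print(f"{scheme}: {t:.2f} s")
```

Even without the idle-time term, the round-trip component is what separates the three schemes, since the transmission term (M+1)O/R is the same for all of them.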

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For low-bandwidth connections, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay·bandwidth product.

Chapter 3: Summary

Principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

Instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

rdt3.0: channels with errors and loss

New assumption: the underlying channel can also lose packets (data or ACKs).
• checksum, seq. #, ACKs and retransmissions will be of help, but not enough

Q: how to deal with loss?
• sender waits until certain that data or ACK is lost, then retransmits
• yuck: drawbacks?

Approach: sender waits a "reasonable" amount of time for an ACK
• retransmits if no ACK is received in this time (retransmissions are triggered only by timeouts)
• if pkt (or ACK) is just delayed (not lost):
  - retransmission will be a duplicate, but use of seq. #'s already handles this
  - receiver must specify seq # of pkt being ACKed
• requires countdown timer

rdt3.0 sender

[FSM diagram, transcribed as event -> action per state]

State: Wait for call 0 from above
  rdt_send(data)
    -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to Wait for ACK0
  rdt_rcv(rcvpkt)
    -> Λ (no action)

State: Wait for ACK0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    -> Λ (no action)
  timeout
    -> udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    -> stop_timer; go to Wait for call 1 from above

State: Wait for call 1 from above
  rdt_send(data)
    -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to Wait for ACK1
  rdt_rcv(rcvpkt)
    -> Λ (no action)

State: Wait for ACK1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
    -> Λ (no action)
  timeout
    -> udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    -> stop_timer; go to Wait for call 0 from above
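The sender FSM can be sketched as a small event-driven class. This is an illustrative simplification, not the slide's code: the class and attribute names are hypothetical, the checksum is replaced by a `corrupt` flag supplied by the caller, and the countdown timer is modeled as a boolean that some outside event loop is assumed to turn into `timeout()` calls.

```python
class Rdt30Sender:
    """Alternating-bit sender: one packet outstanding, retransmit on timeout."""

    def __init__(self, udt_send):
        self.udt_send = udt_send      # callable that puts a packet on the unreliable channel
        self.seq = 0                  # sequence number for the next packet (0 or 1)
        self.sndpkt = None            # outstanding packet; None means "wait for call from above"
        self.timer_running = False

    def rdt_send(self, data):
        """Call from above. Returns False while an earlier packet is still unacknowledged."""
        if self.sndpkt is not None:
            return False
        self.sndpkt = (self.seq, data)
        self.udt_send(self.sndpkt)
        self.timer_running = True     # start_timer
        return True

    def rdt_rcv(self, ack_seq, corrupt=False):
        """ACK arrival. Corrupt, unexpected or wrong-numbered ACKs are ignored (action Λ)."""
        if self.sndpkt is None or corrupt or ack_seq != self.seq:
            return
        self.timer_running = False    # stop_timer
        self.sndpkt = None
        self.seq ^= 1                 # alternate between 0 and 1

    def timeout(self):
        """Timer expiry: resend the outstanding packet and restart the timer."""
        if self.sndpkt is not None:
            self.udt_send(self.sndpkt)
            self.timer_running = True

# Usage: the "channel" is just a list that records what was sent.
sent = []
sender = Rdt30Sender(sent.append)
sender.rdt_send("a")      # sends (0, "a")
sender.timeout()          # no ACK yet: resends (0, "a")
sender.rdt_rcv(0)         # ACK0 arrives: ready for the next call
sender.rdt_send("b")      # sends (1, "b")
print(sent)
```

The four FSM states collapse into two variables here: `seq` says which sequence number is current, and whether `sndpkt` is set says whether the sender is waiting for a call from above or for an ACK.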

rdt3.0 in action

Performance of rdt3.0

rdt3.0 works, but performance stinks.
Example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet:

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^9\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$$

• U_sender: utilization, the fraction of time the sender is busy sending
• 1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
• network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline. First packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet at t = RTT + L/R.]

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$$
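The utilization figure can be reproduced directly from the slide's numbers (L = 8000 bits, R = 1 Gbps, RTT = 30 msec). The function name below is hypothetical.

```python
def stop_and_wait_utilization(L, R, RTT):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = L / R
    return t_transmit / (RTT + t_transmit)

L, R, RTT = 8000, 1e9, 0.030          # bits, bits/sec, seconds
u = stop_and_wait_utilization(L, R, RTT)
throughput_bytes = (L / 8) / (RTT + L / R)   # one 1 KB packet per RTT + L/R
print(f"U_sender = {u:.5f}")
print(f"throughput = {throughput_bytes / 1000:.1f} kB/sec")
```

The sender spends 8 microseconds transmitting and then about 30 milliseconds waiting, which is why a 1 Gbps link delivers only about 33 kB/sec.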

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data

Two generic approaches to solving this:
• go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three pipelined packets. First packet bit transmitted at t = 0; last bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L/R.]

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3
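As a rough check of the utilization figures, the sketch below evaluates U_sender = n (L/R) / (RTT + L/R) for a window of n packets. The function and parameter names are illustrative only; the values L/R = 0.008 ms and RTT = 30 ms are inferred from the slide's arithmetic (0.008 / 30.008).

```python
def sender_utilization(window_pkts: int, transmit_ms: float, rtt_ms: float) -> float:
    """Fraction of each (RTT + L/R) cycle that the sender spends transmitting.

    window_pkts: packets sent back-to-back before waiting for an ACK
    transmit_ms: L/R, the time needed to push one packet onto the link
    rtt_ms:      round-trip time
    """
    busy = window_pkts * transmit_ms
    cycle = rtt_ms + transmit_ms
    return min(1.0, busy / cycle)  # utilization cannot exceed 100%


stop_and_wait = sender_utilization(1, 0.008, 30.0)
pipelined = sender_utilization(3, 0.008, 30.0)
print(f"stop-and-wait:     {stop_and_wait:.5f}")
print(f"3-packet pipeline: {pipelined:.5f}")
```

While the window is small compared with the bandwidth-delay product, utilization grows linearly with the number of in-flight packets, which is why three packets give three times the stop-and-wait figure.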


Go-Back-N

Sender:
    k-bit seq # in pkt header
    "window" of up to N consecutive unack'ed pkts allowed

    ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
        may receive duplicate ACKs (see receiver)

    Only one timer, for oldest unacknowledged pkt
    timeout(n): retransmit pkt n and all higher seq # pkts in window
    Called a sliding-window protocol

GBN: Sender

rdt_send() called: checks to see if window is full
    No: send out packet
    Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.

This is the only event that triggers a resend.

GBN: sender extended FSM

Initial transition (Λ):
    base = 1
    nextseqnum = 1

State: Wait

rdt_send(data):
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ
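The sender FSM can be sketched in Python roughly as follows. This is a minimal illustration, not a full implementation: the class name `GBNSender`, the `udt_send` callback, and the boolean `timer_running` (standing in for a real timer) are assumptions made for the example, and sequence numbers are unbounded integers rather than k-bit values.

```python
class GBNSender:
    """Go-Back-N sender state machine (no real network or clock)."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send      # callback standing in for the unreliable channel
        self.base = 1                 # oldest unacknowledged seq #
        self.nextseqnum = 1           # next seq # to assign
        self.sndpkt = {}              # seq # -> packet, kept for retransmission
        self.timer_running = False    # hypothetical stand-in for start/stop_timer

    def rdt_send(self, data):
        """Return True if data was accepted, False if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False              # refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True     # start_timer for the oldest unACKed pkt
        self.nextseqnum += 1
        return True

    def timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True         # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])

    def rdt_rcv_ack(self, acknum, corrupt=False):
        """Process a cumulative ACK; corrupt ACKs are ignored."""
        if corrupt:
            return
        new_base = acknum + 1
        for seq in range(self.base, new_base):
            self.sndpkt.pop(seq, None)    # acknowledged, no longer needed
        # Assumption: max() guards against a stale ACK moving the window backwards.
        self.base = max(self.base, new_base)
        # stop_timer if nothing is outstanding, otherwise (re)start it
        self.timer_running = self.base != self.nextseqnum
```

A single timeout resends the whole outstanding window, which is the behavior that makes GBN simple but potentially wasteful on long, fast links.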

GBN: receiver extended FSM

Initial transition (Λ):
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

State: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

default:
    udt_send(sndpkt)

If expected packet received:
    Send ACK and deliver packet upstairs

If out-of-order packet received:
    discard (don't buffer) -> no receiver buffering!
    Re-ACK pkt with highest in-order seq #
    may generate duplicate ACKs

More on receiver

The receiver always sends ACK for last correctly received packet with highest in-order seq #
Receiver only sends ACKs (no NAKs)
Can generate duplicate ACKs
Need only remember expectedseqnum
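The receiver side is small enough to sketch in a few lines. The class below is illustrative only (the name `GBNReceiver`, the `corrupt` flag, and the `delivered` list are assumptions made for the example); it keeps `expectedseqnum` as its only protocol state and returns the ACK number it would send.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, cumulative ACKs only."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []           # data handed to the upper layer, in order

    def rdt_rcv(self, seqnum, data, corrupt=False):
        """Return the ACK number sent in response to this packet."""
        if not corrupt and seqnum == self.expectedseqnum:
            self.delivered.append(data)       # deliver_data(data)
            self.expectedseqnum += 1
        # In every case, (re-)ACK the highest in-order seq # received so far.
        # Before any packet has arrived this is 0, matching the initial
        # make_pkt(0, ACK, chksum) in the FSM.
        return self.expectedseqnum - 1
```

Out-of-order and corrupt packets fall through to the same return statement, so they produce a duplicate ACK for the last in-order packet.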

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

Selective Repeat protocol allows the receiver to buffer data and forces retransmission of only the required packets.


                                                                        Selective Repeat

                                                                        receiver individually acknowledges all correctly received pkts

                                                                        buffers pkts as needed for eventual in-order delivery to upper layer

                                                                        sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
    Compare to GBN, which only had timer for base packet

sender window
    N consecutive seq #'s
    again limits seq #s of sent, unACKed pkts
    Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: sender and receiver windows over the sequence number space]

Selective repeat

sender

data from above:
    if next available seq # in window, send pkt

timeout(n):
    resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)

otherwise:
    ignore
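The three receiver cases can be sketched as below. This is a simplified illustration: `SRReceiver`, `on_packet`, and the `delivered` list are names invented for the example, and sequence numbers are plain unbounded integers, so wraparound of a finite sequence space is deliberately not modeled.

```python
class SRReceiver:
    """Selective Repeat receiver: per-packet ACKs, out-of-order buffering."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}      # seq # -> data for packets received out of order
        self.delivered = []   # data passed to the upper layer, in order

    def on_packet(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer.setdefault(n, data)
            # Deliver the in-order run starting at rcvbase, then slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            # Already delivered: re-ACK so the sender's window can advance.
            return n
        return None           # outside both ranges: ignore
```

The re-ACK branch matters because a lost ACK would otherwise leave the sender retransmitting a packet the receiver has already delivered.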

Selective repeat in action

[Figure: selective repeat operation]

Selective repeat: dilemma

Example:
    seq #'s: 0, 1, 2, 3
    window size = 3

receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
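One way to explore the question is to check the worst case directly: the receiver has accepted a full window and advanced, but every ACK was lost, so the sender retransmits the old window. The hypothetical helper below (name and approach are assumptions for illustration) reports whether the old and new windows share any sequence number after wrapping modulo the sequence space.

```python
def sr_window_is_ambiguous(seq_space: int, window: int) -> bool:
    """True if a retransmitted old packet can be mistaken for a new one."""
    # Seq #s the sender may still retransmit (all of its ACKs were lost).
    old_window = {n % seq_space for n in range(0, window)}
    # Seq #s the receiver now accepts as new data after advancing a full window.
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


for window in (1, 2, 3):
    print(window, sr_window_is_ambiguous(seq_space=4, window=window))
```

With 4 sequence numbers, a window of 3 is ambiguous while a window of 2 is not, which suggests the usual rule that the window size should be at most half the sequence number space.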


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
    one sender, one receiver
reliable, in-order byte stream:
    no "message boundaries"
pipelined:
    TCP congestion and flow control set window size
send & receive buffers
full duplex data:
    bi-directional data flow in same connection
    MSS: maximum segment size
connection-oriented:
    handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
    sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
    Depends upon implementation (can often be set)
    The max amount of application-layer data in a segment
    Application data + TCP header = TCP segment

Three-way handshake
    Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
    Server responds with second special TCP segment (again no payload)
    Client responds with third special segment. This can contain payload.

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
    (i) buffers
    (ii) variables, and
    (iii) a socket connection to process

TCP only exists in the two end machines
    No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Header layout (each row is 32 bits wide):
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments!)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
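To make the layout concrete, the sketch below decodes the fixed 20-byte part of a header with Python's standard `struct` module. The function name `parse_tcp_header` and the returned dictionary keys are choices made for this example; the field widths and flag bit positions follow the standard TCP header.

```python
import struct

# Flag bits in the low 6 bits of the flags byte, from bit 0 upward.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4   # "head len" counts 32-bit words
    flags = {name for bit, name in enumerate(FLAG_NAMES) if flag_byte & (1 << bit)}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "rcv_window": rcv_window, "checksum": checksum, "urg_ptr": urg_ptr,
        "options": segment[20:header_len],   # empty when header_len == 20
        "data": segment[header_len:],        # application data
    }


# Build a header by hand: ports 1234 -> 80, Seq=42, ACK=79, PSH+ACK set.
example = struct.pack("!HHIIBBHHH", 1234, 80, 42, 79, 5 << 4, 0x18, 65535, 0, 0) + b"C"
print(parse_tcp_header(example))
```

The checksum is left as zero in the hand-built example; a real stack would compute the Internet checksum over the segment and a pseudo-header.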

TCP seq. #'s and ACKs

Seq. #'s:
    byte stream "number" of first byte in segment's data

ACKs:
    seq # of next byte expected from other side
    cumulative ACK

Q: how does receiver handle out-of-order segments?
A: TCP spec doesn't say - up to implementer

[Figure: simple telnet scenario, time running downward]
    Host A -> Host B: Seq=42, ACK=79, data = 'C'    (user types 'C')
    Host B -> Host A: Seq=79, ACK=43, data = 'C'    (host ACKs receipt of 'C', echoes back 'C')
    Host A -> Host B: Seq=43, ACK=80                (host ACKs receipt of echoed 'C')
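The numbers in the telnet exchange follow a simple rule, sketched here with a hypothetical helper (`reply_numbers` is not part of any TCP API). It assumes the replying host has no other unacknowledged data in flight, as in the example.

```python
def reply_numbers(seg_seq: int, seg_ack: int, seg_len: int) -> tuple:
    """Seq # and ACK # for the segment sent in response to a received one.

    The reply's seq # is the byte the peer said it expects next (seg_ack);
    the reply's ACK # is the first byte after the received data.
    """
    return seg_ack, seg_seq + seg_len


# Host B answers A's Seq=42, ACK=79 carrying 1 byte ('C').
print(reply_numbers(42, 79, 1))
# Host A answers B's echo, Seq=79, ACK=43 carrying 1 byte.
print(reply_numbers(79, 43, 1))
```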

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
    longer than RTT
        but RTT varies
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss

Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
    SampleRTT will vary, want estimated RTT "smoother"
        average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125

Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

[Figure: SampleRTT and EstimatedRTT plotted as RTT (milliseconds, axis from 100 to 350) against time (seconds, axis from 1 to 106)]

TCP Round Trip Time and Timeout

Setting the timeout
    EstimatedRTT plus "safety margin"
        large variation in EstimatedRTT -> larger safety margin
    first estimate of how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 * DevRTT
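The three formulas can be combined into a small estimator, sketched below. The class and attribute names are made up for the example. Two details are assumptions the slides do not fix: the state is seeded from the first sample (EstimatedRTT = sample, DevRTT = sample / 2, in the style of RFC 6298), and DevRTT is updated against the estimate from before the current sample.

```python
class RttEstimator:
    """EWMA-based RTT estimator with typical alpha = 0.125, beta = 0.25."""

    def __init__(self, first_sample_ms, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed both values from the first measurement.
        self.estimated_rtt = first_sample_ms
        self.dev_rtt = first_sample_ms / 2

    def update(self, sample_rtt_ms):
        """Fold one SampleRTT (from a non-retransmitted segment) into the state."""
        # Assumption: deviation is measured against the previous estimate.
        deviation = abs(sample_rtt_ms - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt_ms)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample_ms=100.0)
for sample in (140.0, 90.0, 110.0):
    est.update(sample)
    print(f"EstimatedRTT={est.estimated_rtt:.2f}  DevRTT={est.dev_rtt:.2f}  "
          f"TimeoutInterval={est.timeout_interval:.2f}")
```

A single outlier moves EstimatedRTT by only one eighth of its distance from the estimate, while DevRTT widens the safety margin more quickly.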


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate acks

Initially consider simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control

TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

Ack rcvd:
    If acknowledges previously unacked segments
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }

} /* end of loop forever */

Comment:
    • SendBase-1: last cumulatively ack'ed byte
Example:
    • SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
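The event loop translates naturally into a class with one method per event. The sketch below is illustrative only: `SimpleTcpSender`, the `send_to_ip` callback, and the boolean `timer_running` are assumptions standing in for a real stack, and duplicate ACKs, flow control, and congestion control are ignored as in the pseudocode.

```python
class SimpleTcpSender:
    """Simplified TCP sender: cumulative ACKs, single retransmission timer."""

    def __init__(self, initial_seq, send_to_ip):
        self.next_seq_num = initial_seq
        self.send_base = initial_seq
        self.send_to_ip = send_to_ip   # callback standing in for the IP layer
        self.unacked = {}              # seq # of first byte -> segment payload
        self.timer_running = False     # hypothetical stand-in for a real timer

    def on_app_data(self, data: bytes):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True          # start timer
        self.send_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)                # smallest not-yet-ACKed seq #
        self.send_to_ip(seq, self.unacked[seq])
        self.timer_running = True              # restart timer

    def on_ack(self, y: int):
        if y > self.send_base:                 # ACK covers new data
            self.send_base = y
            # Drop every segment whose last byte lies below y.
            fully_acked = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in fully_acked:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
```

Because ACKs are cumulative, a single ACK with a large enough value removes several outstanding segments at once, and only the oldest unacknowledged segment is ever resent on a timeout.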


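TCP: retransmission scenarios

[Figure: two Host A / Host B timelines, time running downward]

Lost ACK scenario
    Host A -> Host B: Seq=92, 8 bytes data
    Host B -> Host A: ACK=100 (lost, X)
    Seq=92 timeout at Host A
    Host A -> Host B: Seq=92, 8 bytes data (retransmission)
    Host B -> Host A: ACK=100
    SendBase = 100

Premature timeout scenario
    Host A -> Host B: Seq=92, 8 bytes data
    Host A -> Host B: Seq=100, 20 bytes data
    Host B -> Host A: ACK=100
    Host B -> Host A: ACK=120
    Seq=92 timeout at Host A, before the ACKs arrive
    Host A -> Host B: Seq=92, 8 bytes data (retransmission)
    SendBase = 100, then SendBase = 120 as the two ACKs arrive
    Host B -> Host A: ACK=120
    SendBase = 120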

TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline, time running downward]

Cumulative ACK scenario
    Host A -> Host B: Seq=92, 8 bytes data
    Host A -> Host B: Seq=100, 20 bytes data
    Host B -> Host A: ACK=100 (lost, X)
    Host B -> Host A: ACK=120 (arrives before the timeout)
    SendBase = 120

                                                                        3 Transport Layer 73Comp 361 Spring 2005

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver, followed by the TCP receiver action:

1. Event: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed.
   Action: delayed ACK. Wait up to 500 ms for the next segment; if no next segment arrives, send ACK.

2. Event: arrival of in-order segment with expected seq #; one other segment has ACK pending.
   Action: immediately send a single cumulative ACK, ACKing both in-order segments.

3. Event: arrival of out-of-order segment with higher-than-expected seq #; gap detected.
   Action: immediately send duplicate ACK, indicating seq # of next expected byte.

4. Event: arrival of segment that partially or completely fills gap.
   Action: immediately send ACK, provided that segment starts at lower end of gap.
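
The four receiver rules can be read as a small decision function. The following is a minimal Python sketch, not how any particular TCP stack implements it. The function name `receiver_ack_action` and its parameters are hypothetical, and the handling of a segment below the expected sequence number (re-ACK it) is an assumption the table does not cover.

```python
def receiver_ack_action(seg_seq, expected_seq, ack_pending=False, gap_exists=False):
    """Pick the receiver's ACK action for one arriving segment.

    seg_seq      -- sequence number of the first byte in the arriving segment
    expected_seq -- next in-order byte the receiver is waiting for
    ack_pending  -- True if an earlier in-order segment still has a delayed ACK
    gap_exists   -- True if out-of-order data is buffered above expected_seq
    """
    if seg_seq > expected_seq:
        # Rule 3: out-of-order arrival, gap detected.
        return "immediate duplicate ACK"
    if seg_seq < expected_seq:
        # Assumption (not in the table): old data is simply re-ACKed.
        return "immediate duplicate ACK"
    if gap_exists:
        # Rule 4: segment starts at the lower end of the gap.
        return "immediate ACK"
    if ack_pending:
        # Rule 2: one cumulative ACK covers both in-order segments.
        return "immediate cumulative ACK"
    # Rule 1: nothing pending, so wait up to 500 ms for the next segment.
    return "delayed ACK"
```

For example, `receiver_ack_action(120, 100)` models a segment arriving beyond the expected byte 100, which triggers a duplicate ACK.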

More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after the retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• A limited form of congestion control
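
The doubling rule is easy to trace in code. Below is a minimal Python sketch under stated assumptions: the class name `RetransmitTimer` and its methods are hypothetical, and the reset formula EstimatedRTT + 4 * DevRTT is assumed to be the one "described previously" (it is the commonly taught form).

```python
class RetransmitTimer:
    """Tracks the timeout interval with doubling on consecutive timeouts."""

    def __init__(self, estimated_rtt_ms, dev_rtt_ms):
        self.estimated_rtt_ms = estimated_rtt_ms
        self.dev_rtt_ms = dev_rtt_ms
        self.interval_ms = self._fresh_interval()

    def _fresh_interval(self):
        # Assumed formula: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt_ms + 4 * self.dev_rtt_ms

    def on_timeout(self):
        # After the retransmission, the interval is doubled.
        self.interval_ms *= 2
        return self.interval_ms

    def on_restart(self):
        # New data from above or an ACK received: recompute from the estimates.
        self.interval_ms = self._fresh_interval()
        return self.interval_ms
```

With EstimatedRTT = 100 ms and DevRTT = 25 ms the interval starts at 200 ms, grows to 400 ms and 800 ms after two consecutive timeouts, and drops back to 200 ms once an ACK restarts the timer.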

Fast Retransmit

• Time-out period is often relatively long
  • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
  • Sender often sends many segments back-to-back
  • If a segment is lost, there will likely be many duplicate ACKs
• If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
  • fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            /* fast retransmit */
            resend segment with sequence number y
        }
    }
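
The ACK-handling pseudocode can be made runnable as follows. This is a minimal Python sketch only: the class `FastRetransmitSender`, the `has_unacked` flag, and the returned action strings are hypothetical names chosen for illustration, and no real timer or network is involved.

```python
class FastRetransmitSender:
    """Models only the ACK-handling event of the sender."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_counts = {}   # duplicate-ACK count per ACK value y
        self.resent = []       # sequence numbers resent by fast retransmit

    def on_ack(self, y, has_unacked=True):
        if y > self.send_base:
            # New data acknowledged: advance the window base.
            self.send_base = y
            return "start timer" if has_unacked else "new ACK"
        # Otherwise this is a duplicate ACK for already ACKed data.
        self.dup_counts[y] = self.dup_counts.get(y, 0) + 1
        if self.dup_counts[y] == 3:
            # Third duplicate: resend the segment starting at y before the timer expires.
            self.resent.append(y)
            return "fast retransmit"
        return "duplicate ACK"
```

Feeding one new ACK for 120 and then three more ACKs for 120 into a sender whose base is 100 ends with the segment at sequence number 120 being resent.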

TCP: GBN or Selective Repeat?

                                                                        Basic TCP looks a lot like GBN

                                                                        Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                        This looks a lot like Selective Repeat

                                                                        TCP is a hybrid

TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down its transmission rate to accommodate the receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

• flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout. Each row is 32 bits wide.]
• source port #, dest port #
• sequence number
• acknowledgement number
• head len, not used, flag bits U A P R S F, Receive window
• checksum, Urg data pointer
• Options (variable length)
• application data (variable length)

Annotations:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
• sequence and acknowledgement numbers: counting by bytes of data (not segments)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn't overflow
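
The receive-window arithmetic can be checked with a few numbers. This is a minimal Python sketch: the function names `rcv_window` and `sender_may_send` are hypothetical, and the sender-side variables LastByteSent and LastByteAcked are assumed to mean what they mean elsewhere in these notes.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, nbytes, window):
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window


# Receiver: 4096-byte buffer, 3000 bytes received, 1000 of them already read by the app.
window = rcv_window(4096, 3000, 1000)             # 2096 bytes of spare room
# Sender: 1000 bytes currently unACKed.
ok_small = sender_may_send(5000, 4000, 1000, window)   # 2000 <= 2096
ok_large = sender_may_send(5000, 4000, 1200, window)   # 2200 >  2096
```

Here the sender may add another 1000 bytes but not another 1200 bytes.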

Technical Issue

• Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from receiver B. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
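
A toy model shows why the one-byte probes break the deadlock. This is a simplified Python sketch, not TCP itself: the function `probe_until_open` and its `reads` parameter (bytes the receiving application reads before each probe) are hypothetical, and the probe byte itself is not counted against the buffer.

```python
def probe_until_open(rcv_buffer, buffered, reads):
    """Return (probe_round, window) for the first probe whose ACK reports RcvWindow > 0.

    rcv_buffer -- size of the receive buffer in bytes
    buffered   -- bytes currently waiting in the buffer (equal to rcv_buffer when full)
    reads      -- bytes the application reads before each successive probe
    Returns None if the window never opens within the given rounds.
    """
    for round_no, nread in enumerate(reads, start=1):
        buffered = max(0, buffered - nread)   # application drains the buffer
        window = rcv_buffer - buffered        # value carried in the ACK of the probe
        if window > 0:
            return round_no, window
    return None
```

With a full 1000-byte buffer and an application that reads nothing for two rounds and then 250 bytes, the third probe's ACK is the first to advertise a non-zero window.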

                                                                        Note on UDP

                                                                        UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.

TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
• initialize TCP variables:
  • seq #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
• server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals that the client is synchronized

[Figure: three-way handshake]
• client to server: Connection request (SYN=1, seq=client_isn)
• server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
• client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
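
The sequence and acknowledgement numbers in the three handshake segments follow mechanically from the two initial sequence numbers. The Python sketch below is illustrative only: the function `three_way_handshake` and the dictionary representation of a segment are hypothetical, and the only extra fact used is that TCP sequence numbers are 32-bit values, so the arithmetic wraps modulo 2^32.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32 bits wide


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as dictionaries of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn, "ack": (client_isn + 1) % SEQ_MOD}
    ack = {"SYN": 0, "seq": (client_isn + 1) % SEQ_MOD, "ack": (server_isn + 1) % SEQ_MOD}
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
```

With client_isn = 1000 and server_isn = 5000, the SYNACK carries ack=1001 and the final ACK carries seq=1001, ack=5001.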

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection teardown. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait; connection closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
• enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
• closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: the same FIN / ACK / FIN / ACK exchange with states labeled: client closing, server closing, client in timed wait, server closed, client closed.]

                                                                        TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]
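
The lifecycle figures did not survive extraction, so the following Python sketch walks through a typical client lifecycle instead. It is an assumption about what the figure showed: the state names come from the standard TCP state diagram, while the event strings, the table `CLIENT_TRANSITIONS`, and the helper `run_lifecycle` are hypothetical.

```python
# (current state, event) -> next state, for a client that opens and later closes.
CLIENT_TRANSITIONS = {
    ("CLOSED", "send SYN"): "SYN_SENT",
    ("SYN_SENT", "receive SYNACK, send ACK"): "ESTABLISHED",
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "receive ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "receive FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "timed wait expires"): "CLOSED",
}


def run_lifecycle(transitions, start, events):
    """Apply events in order and return the list of states visited."""
    states = [start]
    for event in events:
        states.append(transitions[(states[-1], event)])
    return states


client_states = run_lifecycle(
    CLIENT_TRANSITIONS,
    "CLOSED",
    [
        "send SYN",
        "receive SYNACK, send ACK",
        "send FIN",
        "receive ACK",
        "receive FIN, send ACK",
        "timed wait expires",
    ],
)
```

The walk starts and ends in CLOSED, passing through ESTABLISHED after the handshake and through TIME_WAIT (the "timed wait") during teardown.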

                                                                        A few special cases

                                                                        Have not discussed what happens if both client and server decide to close down connection at same time

                                                                        It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control
• manifestations:
  • lost packets (buffer overflow at routers)
  • long delays (queuing in router buffers)
• a top-10 problem

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2

Observations:
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

Notation: λin is the original data sent by the application, λ'in is the offered load (original data plus retransmissions), and λout is the goodput.

• (a), (b) & (c): always λin = λout (goodput)
• (a) "magic" transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: λ'in > λout
• (c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3 (cont)

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted.

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
• "elastic service"
• if sender's path is "underloaded": sender should use available bandwidth
• if sender's path is congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next page

Case study: ATM ABR congestion control (cont)

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
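
The effect of the ER field and the EFCI bit on a returned RM cell can be sketched in a few lines. This Python example is a simplification under stated assumptions: the function `returned_rm_cell`, the per-switch `supportable_rates` list, and the dictionary form of the cell are hypothetical and do not reflect the actual ATM cell format.

```python
def returned_rm_cell(initial_er, supportable_rates, efci_seen=False):
    """Model an RM cell travelling through switches and back to the source.

    initial_er        -- ER value the sender writes into the RM cell
    supportable_rates -- rate each switch on the path can support (same units)
    efci_seen         -- True if the data cell before the RM cell had EFCI=1
    """
    er = initial_er
    for rate in supportable_rates:
        # A switch may lower ER but never raises it.
        er = min(er, rate)
    # The destination sets CI=1 if the preceding data cell signalled congestion.
    return {"ER": er, "CI": 1 if efci_seen else 0}


cell = returned_rm_cell(100, [80, 50, 90])
```

The sender learns ER = 50 here, the minimum supportable rate along the path.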

TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion

[Figure: Congwin drawn as a window over the sequence of segments]

• w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT  bytes/sec
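
The throughput formula is simple enough to check numerically. The Python sketch below uses a hypothetical function name and illustrative numbers; RTT is passed in milliseconds so the arithmetic stays exact.

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_ms):
    """throughput = w * MSS / RTT, with RTT given in milliseconds."""
    return w * mss_bytes * 1000 / rtt_ms


one_segment = throughput_bytes_per_sec(w=1, mss_bytes=500, rtt_ms=200)
ten_segments = throughput_bytes_per_sec(w=10, mss_bytes=500, rtt_ms=200)
one_segment_kbps = one_segment * 8 / 1000
```

A window of one 500-byte segment per 200 ms RTT gives 2500 bytes/sec, that is 20 kbps; a window of ten such segments gives ten times as much.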

• To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow
• Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

• How does sender perceive congestion?
  • loss event = timeout or 3 duplicate ACKs
  • TCP sender reduces rate (CongWin) after loss event
• Three mechanisms:
  • AIMD = Additive Increase, Multiplicative Decrease
  • slow start = CongWin set to 1 and then grows exponentially
  • conservative after timeout events

TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (axis marks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
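
The AIMD rule can be traced round by round. This is a minimal Python sketch, assuming one update per RTT; the function `aimd`, the `loss_rounds` set, and the floor of one MSS after a decrease are assumptions made for illustration.

```python
def aimd(cong_win, mss, loss_rounds, rounds):
    """Return CongWin (in bytes) after each of `rounds` round trips.

    loss_rounds -- set of round indices (0-based) in which a loss event occurs
    """
    history = []
    for r in range(rounds):
        if r in loss_rounds:
            # Multiplicative decrease: halve the window (never below one MSS).
            cong_win = max(mss, cong_win // 2)
        else:
            # Additive increase: one MSS per loss-free RTT.
            cong_win += mss
        history.append(cong_win)
    return history


trace = aimd(cong_win=8000, mss=1000, loss_rounds={3}, rounds=5)
```

Starting from 8000 bytes with a loss in the fourth round, the window climbs to 11000 bytes, is cut to 5500 bytes, and then resumes its linear climb.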

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment to Host B; one RTT later it sends two segments; one RTT after that, four segments.]
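
Incrementing CongWin by one MSS per ACK is what produces the doubling per RTT. The Python sketch below shows this under simplifying assumptions: every segment is ACKed individually (no delayed ACKs), there is no loss, and the function name `slow_start_windows` is hypothetical.

```python
def slow_start_windows(mss, rtts):
    """Return CongWin (in bytes) at the start and after each of `rtts` round trips."""
    cong_win = mss
    windows = [cong_win]
    for _ in range(rtts):
        acks = cong_win // mss        # one ACK per segment sent in this RTT
        for _ in range(acks):
            cong_win += mss           # increment CongWin for every ACK received
        windows.append(cong_win)
    return windows


windows = slow_start_windows(mss=500, rtts=3)
segments_per_rtt = [w // 500 for w in windows]
```

The window goes 500, 1000, 2000, 4000 bytes, that is one, two, four and then eight segments per RTT.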

So far
• Slow start ramps up exponentially
• Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
• Introduce new variable, threshold
• threshold initially very large
• Slow-start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to slow start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
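The four rules above can be traced round by round. This is a rough sketch under stated assumptions, not the real protocol: CongWin is counted in whole MSS, one update happens per RTT, slow-start doubling is capped at Threshold, and the name `trace_cwnd`, its parameters and the starting Threshold of 8 are hypothetical choices for illustration.

```python
def trace_cwnd(loss_events, rounds, threshold=8, reno=True):
    """Trace CongWin (in MSS) at the start of each RTT round.

    loss_events maps a round index to "dupack" or "timeout".
    """
    cwnd = 1
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        event = loss_events.get(r)
        if event is not None:
            threshold = max(cwnd // 2, 1)        # Threshold = CongWin/2
            if event == "dupack" and reno:
                cwnd = threshold                 # Reno: skip slow start
            else:
                cwnd = 1                         # timeout, or any loss in Tahoe
        elif cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)      # slow start: exponential
        else:
            cwnd += 1                            # congestion avoidance: linear
    return trace


print(trace_cwnd({6: "dupack"}, 9, reno=True))   # [1, 2, 4, 8, 9, 10, 11, 5, 6]
print(trace_cwnd({6: "dupack"}, 9, reno=False))  # [1, 2, 4, 8, 9, 10, 11, 1, 2]
```

The two traces differ only after the triple duplicate ACK: Reno resumes linear growth from half the window, while Tahoe restarts from 1 MSS.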


                                                                        The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
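The event table can be read as a small state machine. The sketch below is one possible rendering, not a real TCP stack: the class name `RenoSender`, the MSS of 1460 bytes, the initial Threshold of 64 KB, and the choice to reset the duplicate-ACK counter on every new ACK are all assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed value for illustration


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64 * 1024   # "initially very large"; value assumed
    state: str = "SS"              # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

A caller would feed the sender one event at a time, for example `s = RenoSender(); s.on_new_ack(); s.on_timeout()`, and inspect `s.cong_win`, `s.threshold` and `s.state` after each step.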


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2·RTT).
Average throughput: 0.75 W/RTT
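The 0.75 factor is the mean of a window that climbs linearly from W/2 to W. A quick numeric check, assuming W is even and the window grows by exactly 1 segment per RTT (the function name `average_sawtooth_window` is hypothetical):

```python
def average_sawtooth_window(w):
    """Average window over one AIMD cycle climbing from w/2 to w, 1 per RTT.

    Assumes w is an even number of segments.
    """
    low = w // 2
    samples = list(range(low, w + 1))   # one sample per RTT in the cycle
    return sum(samples) / len(samples)


W = 100
print(average_sawtooth_window(W) / W)  # 0.75
```

Dividing the average window by RTT gives the average throughput of 0.75 W/RTT.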


                                                                        TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed.
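The two numbers in this example can be reproduced directly. This is a small sketch, with hypothetical helper names, that inverts the loss-rate throughput formula for L; the only inputs are the slide's own figures.

```python
def required_window(rate_bps, rtt_s, mss_bits):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / mss_bits


def tolerable_loss_rate(rate_bps, rtt_s, mss_bits):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


MSS_BITS = 1500 * 8   # 1500-byte segments
print(int(required_window(10e9, 0.100, MSS_BITS)))           # 83333
print(f"{tolerable_loss_rate(10e9, 0.100, MSS_BITS):.1e}")   # 2.1e-10
```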


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes marked up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
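The convergence argument can be checked with a toy simulation. It is only a sketch under strong assumptions: both flows have the same RTT, both detect every loss at the same moment, and rates are in arbitrary units. The function name `aimd_two_flows` is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Two AIMD flows on one bottleneck; both see every loss (an assumption)."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2     # multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1     # additive increase, slope 1
    return x1, x2


a, b = aimd_two_flows(80.0, 10.0, capacity=100.0, rounds=500)
print(abs(a - b) < 1.0)  # True
```

Additive increase leaves the gap between the two rates unchanged, while each multiplicative decrease halves it, so the gap shrinks toward zero and the flows approach the equal-share line.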


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
• new app asks for 1 TCP, gets rate R/10
• new app asks for 11 TCPs, gets about R/2
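The arithmetic behind the example, assuming every TCP connection on the link ends up with an equal share (the helper name `app_share` is hypothetical):

```python
from fractions import Fraction


def app_share(existing, new):
    """Fraction of link rate R a new app gets with `new` connections,
    assuming all connections share the link equally."""
    return Fraction(new, existing + new)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 connections out of 20 the new app gets 11/20 of R, which is slightly more than R/2.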


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R.
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption).

Window size:
First assume a fixed congestion window of W segments.
Then a dynamic window, modeling slow start.

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent.
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent.
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
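Both cases fit in one function. This is a sketch, assuming K is the number of windows covering the object, computed here as the ceiling of O/(W·S); that formula for K and the name `static_window_latency` are assumptions, since this slide does not spell K out.

```python
import math


def static_window_latency(O, S, R, RTT, W):
    """Latency (seconds) for a fixed window of W segments, no loss.

    O, S in bits; R in bits per second; RTT in seconds.
    """
    base = 2 * RTT + O / R
    stall = S / R + RTT - W * S / R     # idle time after each window
    if stall <= 0:
        return base                     # case 1: ACK arrives before window runs out
    K = math.ceil(O / (W * S))          # assumed: windows covering the object
    return base + (K - 1) * stall       # case 2: server stalls K-1 times


# Example with S/R = 1 s and RTT = 3 s (illustrative values only)
print(static_window_latency(O=8000, S=1000, R=1000, RTT=3, W=2))  # 20.0
print(static_window_latency(O=8000, S=1000, R=1000, RTT=3, W=8))  # 14.0
```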


Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Server waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. Client initiates TCP connection and requests object (one RTT each); server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and object is delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.
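The example can be reproduced numerically. The sketch below finds K and Q by direct search from their definitions rather than from closed forms, and assumes S/R = 1 s and RTT = 2 s, values picked only so that Q comes out as 2. The function name `slow_start_latency` is hypothetical.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for slow start with no threshold and no loss."""
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object
    K, covered = 0, 0
    while covered < O:
        covered += (2 ** K) * S
        K += 1
    # Q: number of windows after which the server would idle, infinite object
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# O/S = 15 segments, S/R = 1 s, RTT = 2 s (assumed values)
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0))  # (22.0, 4, 2, 2)
```

Here the 22 s break down as 4 s for connection setup and request, 15 s of transmission, and 3 s of idle time (2 s after the first window, 1 s after the second).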


                                                                        TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timeline showing TCP connection initiation, object request, and windows of S/R, 2S/R, 4S/R and 8S/R until the object is delivered. Axes: time at client, time at server.]


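TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.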
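As a sanity check on the derivation of K, the closed form can be compared with a direct search over window sizes. Both function names are hypothetical, and the closed form relies on floating-point `log2`, which may need care for very large O/S.

```python
import math


def windows_by_search(O, S):
    """Smallest k such that S * (2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


for segments in (1, 2, 3, 15, 16, 100):
    assert windows_by_search(segments, 1) == windows_closed_form(segments, 1)

print(windows_closed_form(15, 1))  # 4
```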


HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X is an integer.
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
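The three response-time formulas side by side. This sketch treats the "sum of idle times" term as a single caller-supplied value defaulting to 0, which is a simplification because that sum generally differs between the three schemes. The function name `http_response_times` and the 1 Kbyte = 1000 bytes conversion are assumptions.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Return (non_persistent, persistent, parallel) response times in seconds.

    O in bits, R in bits per second, RTT in seconds; M/X must be an integer.
    `idle` stands in for the slow-start idle-time sum (assumed 0 by default).
    """
    transmit = (M + 1) * O / R
    non_persistent = transmit + (M + 1) * 2 * RTT + idle
    persistent = transmit + 3 * RTT + idle
    parallel = transmit + (M / X + 1) * 2 * RTT + idle
    return non_persistent, persistent, parallel


# RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, on a 1 Mbps link
O_BITS = 5 * 1000 * 8
for name, t in zip(("non-persistent", "persistent", "parallel"),
                   http_response_times(O_BITS, 1e6, 0.1, M=10, X=5)):
    print(f"{name}: {t:.2f} s")
```

With idle times ignored, only the RTT term separates the three schemes, which is why the gap between them grows with RTT and shrinks on slow links where (M+1)O/R dominates.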


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

Principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

Instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

                                                                        • Chapter 3 Transport Layer last revised 160305
                                                                        • Chapter 3 outline
                                                                        • Transport services and protocols
                                                                        • Transport vs network layer
                                                                        • Transport-layer protocols
                                                                        • Chapter 3 outline
                                                                        • Multiplexing/demultiplexing
                                                                        • Multiplexing/demultiplexing
                                                                        • How demultiplexing works
                                                                        • Connectionless demultiplexing
                                                                        • Connectionless demux (cont)
                                                                        • Connection-oriented demux
                                                                        • Connection-oriented demux (cont)
                                                                        • Connection-oriented demux Threaded Web Server
                                                                        • Chapter 3 outline
                                                                        • UDP User Datagram Protocol [RFC 768]
                                                                        • UDP more
                                                                        • UDP checksum
                                                                        • Chapter 3 outline
                                                                        • Principles of Reliable data transfer
                                                                        • Reliable data transfer getting started
                                                                        • Reliable data transfer getting started
                                                                        • Incremental Improvements
                                                                        • rdt1.0: reliable transfer over a reliable channel
                                                                        • rdt2.0: channel with bit errors
                                                                        • rdt2.0: FSM specification
                                                                        • rdt2.0: operation with no errors
                                                                        • rdt2.0: error scenario
                                                                        • rdt2.0 has a fatal flaw
                                                                        • rdt2.1: sender, handles garbled ACK/NAKs
                                                                        • rdt2.1: receiver, handles garbled ACK/NAKs
                                                                        • rdt2.1: discussion
                                                                        • rdt2.2: a NAK-free protocol
                                                                        • rdt2.2: sender, receiver fragments
                                                                        • rdt3.0: channels with errors and loss
                                                                        • rdt3.0 sender
                                                                        • rdt3.0 in action
                                                                        • rdt3.0 in action
                                                                        • Performance of rdt3.0
                                                                        • rdt3.0: stop-and-wait operation
                                                                        • Pipelined protocols
                                                                        • Pipelined protocols
                                                                        • Pipelining increased utilization
                                                                        • Go-Back-N
                                                                        • GBN Sender
                                                                        • GBN sender extended FSM
                                                                        • GBN receiver extended FSM
                                                                        • More on receiver
                                                                        • GBN in action
                                                                        • Selective Repeat
                                                                        • Selective repeat sender receiver windows
                                                                        • Selective repeat
                                                                        • Selective repeat in action
                                                                        • Selective repeat dilemma
                                                                        • Chapter 3 outline
                                                                        • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                        • More TCP Details
                                                                        • Even More TCP Details
                                                                        • TCP segment structure
                                                                        • TCP seq. #'s and ACKs
                                                                        • TCP Round Trip Time and Timeout
                                                                        • TCP Round Trip Time and Timeout
                                                                        • Example RTT estimation
                                                                        • TCP Round Trip Time and Timeout
                                                                        • Chapter 3 outline
                                                                        • TCP reliable data transfer
                                                                        • TCP sender events
                                                                        • TCP sender(simplified)
                                                                        • TCP retransmission scenarios
                                                                        • TCP retransmission scenarios (more)
                                                                        • TCP ACK generation [RFC 1122 RFC 2581]
                                                                        • More on Sender Policies
                                                                        • Fast Retransmit
                                                                        • Fast retransmit algorithm
                                                                        • TCP GBN or Selective Repeat
                                                                        • Chapter 3 outline
                                                                        • TCP Flow Control
                                                                        • TCP Flow Control
                                                                        • TCP segment structure
                                                                        • TCP Flow control how it works
                                                                        • Technical Issue
                                                                        • Chapter 3 outline
                                                                        • TCP Connection Management
                                                                        • TCP Connection Management (cont)
                                                                        • TCP Connection Management (cont)
                                                                        • TCP Connection Management (cont)
                                                                        • TCP Connection Management (cont)
                                                                        • A few special cases
                                                                        • Chapter 3 outline
                                                                        • Principles of Congestion Control
                                                                        • Causes/costs of congestion: scenario 1
                                                                        • Causes/costs of congestion: scenario 2
                                                                        • Causes/costs of congestion: scenario 3
                                                                        • Causes/costs of congestion: scenario 3
                                                                        • Approaches towards congestion control
                                                                        • Case study ATM ABR congestion control
                                                                        • Case study ATM ABR congestion control
                                                                        • Chapter 3 outline
                                                                        • TCP Congestion Control
                                                                        • TCP AIMD
                                                                        • TCP Slow Start
                                                                        • TCP Slow Start (more)
                                                                        • Summary TCP Congestion Control
                                                                        • The Big Picture
                                                                        • TCP sender congestion control
                                                                        • TCP throughput
                                                                        • TCP Futures
                                                                        • TCP Fairness
                                                                        • Why is TCP fair
                                                                        • Fairness (more)
                                                                        • TCP Latency Modeling
                                                                        • Fixed Congestion Window (W)
                                                                        • Fixed congestion window (1)
                                                                        • Fixed congestion window (2)
                                                                        • TCP Latency Modeling Slow Start (1)
                                                                        • TCP Latency Modeling Slow Start (2)
                                                                        • TCP Latency Modeling (3)
                                                                        • TCP Latency Modeling (4)
                                                                        • HTTP Modeling
                                                                        • Chapter 3 Summary

rdt3.0 sender

State: Wait for call 0 from above
  rdt_send(data):
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    (go to Wait for ACK0)
  rdt_rcv(rcvpkt):
    Λ

State: Wait for ACK0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ):
    Λ
  timeout:
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0):
    stop_timer
    (go to Wait for call 1 from above)

State: Wait for call 1 from above
  rdt_send(data):
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    (go to Wait for ACK1)
  rdt_rcv(rcvpkt):
    Λ

State: Wait for ACK1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) ):
    Λ
  timeout:
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1):
    stop_timer
    (go to Wait for call 0 from above)
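
The rdt3.0 sender FSM can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the class name `Rdt30Sender`, the `udt_send` callback, and the boolean `timer_running` flag are hypothetical stand-ins, the checksum is omitted, and a corrupt ACK is signalled by a flag instead of being detected.

```python
class Rdt30Sender:
    """Alternating-bit (stop-and-wait) sender; a sketch of the four FSM states."""

    def __init__(self, udt_send):
        self.udt_send = udt_send        # hypothetical unreliable-channel callback
        self.seq = 0                    # sequence bit of the next/current packet
        self.waiting_for_ack = False    # False: "Wait for call", True: "Wait for ACK"
        self.sndpkt = None
        self.timer_running = False      # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        """Call from above; accepted only in a 'Wait for call' state."""
        if self.waiting_for_ack:
            return False
        self.sndpkt = (self.seq, data)  # make_pkt without the checksum
        self.udt_send(self.sndpkt)
        self.timer_running = True
        self.waiting_for_ack = True
        return True

    def rdt_rcv(self, ack_seq, corrupt=False):
        """ACK arrival; corrupt or wrong-numbered ACKs trigger no action."""
        if not self.waiting_for_ack or corrupt or ack_seq != self.seq:
            return
        self.timer_running = False
        self.waiting_for_ack = False
        self.seq ^= 1                   # alternate between 0 and 1

    def timeout(self):
        """Timer expiry: retransmit the outstanding packet and restart the timer."""
        if self.waiting_for_ack:
            self.udt_send(self.sndpkt)
            self.timer_running = True


sent = []
sender = Rdt30Sender(sent.append)
sender.rdt_send("hello")   # packet 0 goes out
sender.rdt_rcv(1)          # ACK with the wrong number: ignored
sender.timeout()           # packet 0 is retransmitted
sender.rdt_rcv(0)          # correct ACK: sender moves on to bit 1
sender.rdt_send("world")   # packet 1 goes out
print(sent)
```

Note that the retransmission is triggered only by the timer; a wrong or corrupt ACK by itself changes nothing.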

rdt3.0 in action

rdt3.0 in action (cont)

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\,\text{kb/pkt}}{10^{9}\,\text{b/sec}} = 8\ \text{microsec}$

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$

U sender: utilization - fraction of time sender busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram]
  first packet bit transmitted, t = 0
  last packet bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R
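
The arithmetic on this slide can be checked with a few lines of Python. This is a sketch under the slide's own numbers (1 KB taken as 8000 bits, RTT taken as 2 x 15 ms); the function name `sender_utilization` is made up for illustration.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s):
    """Stop-and-wait utilization: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return t_transmit / (rtt_s + t_transmit)


L = 8000       # packet length in bits (1 KB packet, as on the slide)
R = 1e9        # transmission rate: 1 Gbps
RTT = 0.030    # round trip time: 2 x 15 ms propagation delay

t_transmit = L / R
u = sender_utilization(L, R, RTT)
throughput = (L / 8) / (RTT + t_transmit)   # bytes delivered per second

print(f"T_transmit = {t_transmit * 1e6:.0f} microsec")
print(f"U_sender   = {u:.5f}")
print(f"throughput = {throughput / 1e3:.1f} kB/sec")
```

The sender is idle for almost the entire RTT, which is why the throughput is a tiny fraction of the 1 Gbps link rate.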

Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  • Go-Back-N protocols
  • selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram]
  first packet bit transmitted, t = 0
  last bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R

$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$

Increase utilization by a factor of 3!
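
A short Python sketch of the same calculation for an arbitrary number of in-flight packets. It reuses the earlier example's numbers (8000-bit packets, 1 Gbps, 30 ms RTT); the function name and the use of exact fractions are choices made here for illustration, not part of the slides.

```python
from fractions import Fraction
from math import ceil


def pipelined_utilization(n_pkts, packet_bits, rate_bps, rtt_s):
    """U_sender = n * (L/R) / (RTT + L/R), capped at 1 (link fully busy)."""
    t_transmit = Fraction(packet_bits, rate_bps)
    return min(Fraction(1), n_pkts * t_transmit / (rtt_s + t_transmit))


L, R, RTT = 8000, 10**9, Fraction(30, 1000)

u1 = pipelined_utilization(1, L, R, RTT)   # stop-and-wait
u3 = pipelined_utilization(3, L, R, RTT)   # three packets in flight

# Smallest number of in-flight packets that keeps the sender busy
# for the whole RTT + L/R cycle.
fill = ceil((RTT + Fraction(L, R)) / Fraction(L, R))

print(float(u1), float(u3), u3 / u1, fill)
```

With these numbers a window of a few packets still leaves the link almost idle; utilization grows linearly with the window until the pipe is full.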

Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed
  ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
    may receive duplicate ACKs (see receiver)
  Only one timer: for oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol

GBN Sender

rdt_send() called: checks to see if window is full
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer if unacknowledged packets remain (otherwise stops it)

Timeout: resends ALL packets that have been sent but not yet acknowledged
  This is the only event that triggers a resend

GBN sender extended FSM

Initialization (Λ):
  base = 1
  nextseqnum = 1

State: Wait

rdt_send(data):
  if (nextseqnum < base+N)
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  ...
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  Λ
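
The GBN sender FSM translates fairly directly into Python. The following is a minimal sketch, not a complete implementation: `GBNSender`, the `udt_send` callback, and the `timer_running` flag are illustrative names, checksums are omitted, and `max()` is used so that a stale ACK cannot move `base` backwards (the FSM itself assigns `base` unconditionally).

```python
class GBNSender:
    """Go-Back-N sender with a window of N packets and a single timer."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send      # hypothetical unreliable-channel callback
        self.base = 1                 # oldest unacknowledged seq #
        self.nextseqnum = 1           # next seq # to be used
        self.sndpkt = {}              # seq # -> packet, kept until ACKed
        self.timer_running = False    # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False              # refuse_data: window is full
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])

    def rdt_rcv(self, acknum, corrupt=False):
        if corrupt:
            return                    # corrupt ACK: no action
        self.base = max(self.base, acknum + 1)   # cumulative ACK
        for seq in [s for s in self.sndpkt if s < self.base]:
            del self.sndpkt[seq]      # acknowledged packets need no buffering
        self.timer_running = self.base != self.nextseqnum


sent = []
s = GBNSender(window_size=3, udt_send=sent.append)
accepted = [s.rdt_send(d) for d in "abcd"]   # the 4th call finds the window full
s.rdt_rcv(1)                                 # ACK(1): window slides, base = 2
s.timeout()                                  # packets 2 and 3 are resent
print(accepted, sent)
```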

GBN receiver extended FSM

Initialization (Λ):
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

State: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #
Receiver only sends ACKs (no NAKs)
Can generate duplicate ACKs
Need only remember expectedseqnum
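
The receiver behaviour described above fits in a few lines of Python. A sketch, with the class name `GBNReceiver` and the `last_ack` field chosen for illustration; corruption is passed in as a flag and checksums are left out.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, only expectedseqnum is remembered."""

    def __init__(self):
        self.expectedseqnum = 1
        self.last_ack = 0        # matches the initial make_pkt(0, ACK, chksum)
        self.delivered = []      # data passed up to the application

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number sent in response."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)          # deliver_data
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # Default case: anything else is discarded and the last ACK is re-sent.
        return self.last_ack


r = GBNReceiver()
acks = [r.rdt_rcv(1, "a"), r.rdt_rcv(3, "c"), r.rdt_rcv(4, "d"), r.rdt_rcv(2, "b")]
print(acks, r.delivered)
```

Packets 3 and 4 arrive while 2 is still missing, so they are dropped and ACK 1 is repeated; the sender will have to resend them after its timeout.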

GBN in action

                                                                          GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data!

                                                                          Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets

Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: Window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender

data from above:
  if next available seq # in window, send pkt
timeout(n):
  resend pkt n, restart timer
ACK(n) in [sendbase, sendbase+N]:
  mark pkt n as received
  if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)
otherwise:
  ignore
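
The receiver rules can be sketched in Python as below. Assumptions made for illustration: sequence numbers are plain integers that never wrap around, `rcvbase` starts at 0, and the names `SRReceiver`, `buffer`, and `delivered` are hypothetical.

```python
class SRReceiver:
    """Selective repeat receiver with window [rcvbase, rcvbase+N-1]."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}       # out-of-order packets waiting for delivery
        self.delivered = []    # data passed up in order

    def rdt_rcv(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer.setdefault(n, data)
            # Deliver the in-order run starting at rcvbase and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n           # already delivered: re-ACK so the sender can move on
        return None            # otherwise: ignore


r = SRReceiver(window_size=4)
acks = [r.rdt_rcv(n, f"p{n}") for n in (0, 2, 3, 1, 1, 9)]
print(acks, r.delivered)
```

Packets 2 and 3 are buffered until 1 arrives, the duplicate of 1 is re-ACKed, and 9 lies outside both ranges and is ignored.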

                                                                          Selective repeat in action

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
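
One way to explore the question is to model the worst case directly: the sender transmits a full window, the receiver gets everything and advances, but every ACK is lost, so the sender retransmits old packets. The Python sketch below checks whether such a retransmission can land inside the receiver's new window. The function name `ambiguous` and this particular scenario are illustrative choices, not taken from the slides.

```python
def ambiguous(seq_space, window):
    """True if a retransmitted old packet can fall inside the receiver's new window."""
    # Sender's window: seq #'s 0 .. window-1 were sent, all ACKs were lost.
    old_window = {n % seq_space for n in range(0, window)}
    # Receiver got them all and now expects the next `window` seq #'s.
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


print(ambiguous(4, 3))   # the slide's example: 4 seq #'s, window size 3
print(ambiguous(4, 2))
safe = [n for n in range(1, 5) if not ambiguous(4, n)]
print(safe)
```

In this model the two windows stay disjoint only while the window size is at most half the sequence number space, which is the usual answer to the question.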

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver
point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way Handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment. This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables and
  (iii) a socket connection to the process

TCP only exists in the two end machines
  No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

[Figure: segment layout, each row 32 bits wide]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
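
To make the layout concrete, the fixed 20-byte part of the header can be decoded with Python's standard `struct` module. This is a sketch for illustration only: the function name `parse_tcp_header` and the example segment values are made up, options are not decoded, and the checksum is not verified.

```python
import struct

# Flag bits in the low-order end of the 16-bit "head len / flags" word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src, dst, seq, ack, off_flags, window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4   # head len field counts 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if off_flags & bit}
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],    # everything after header and options
    }


# Hand-built example segment; all field values are invented for illustration.
raw = struct.pack("!HHIIHHHH", 1234, 23, 42, 79, (5 << 12) | 0x18, 4096, 0, 0) + b"C"
hdr = parse_tcp_header(raw)
print(hdr)
```

A head len of 5 words means a 20-byte header with no options, so the application data starts at byte 20.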

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say - up to implementer

simple telnet scenario (Host A, Host B; time runs downward)
  User types 'C'
  A -> B: Seq=42, ACK=79, data = 'C'
  host ACKs receipt of 'C', echoes back 'C'
  B -> A: Seq=79, ACK=43, data = 'C'
  host ACKs receipt of echoed 'C'
  A -> B: Seq=43, ACK=80
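
The numbers in the telnet scenario follow from two counters per host. The Python sketch below reproduces them; the `Endpoint` class and its field names are hypothetical, and only in-order arrival is modelled (out-of-order handling is left open, as the slide notes).

```python
class Endpoint:
    """Tracks the two byte counters that produce Seq and ACK numbers."""

    def __init__(self, next_seq, next_expected):
        self.next_seq = next_seq            # seq # of the next byte we will send
        self.next_expected = next_expected  # seq # of the next byte we expect

    def send(self, data=b""):
        seg = {"seq": self.next_seq, "ack": self.next_expected, "data": data}
        self.next_seq += len(data)          # seq #'s count bytes, not segments
        return seg

    def receive(self, seg):
        if seg["seq"] == self.next_expected:     # in-order data only
            self.next_expected += len(seg["data"])


a = Endpoint(next_seq=42, next_expected=79)   # Host A
b = Endpoint(next_seq=79, next_expected=42)   # Host B

s1 = a.send(b"C")   # user types 'C'
b.receive(s1)
s2 = b.send(b"C")   # B ACKs 'C' and echoes it back
a.receive(s2)
s3 = a.send()       # A ACKs the echo, no data
b.receive(s3)

for seg in (s1, s2, s3):
    print(seg)
```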

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125
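To see why the influence of old samples decays exponentially, unroll the update: a sample that is k updates old carries weight α(1 - α)^k. A minimal Python sketch, where the starting estimate and the samples are made-up values:

```python
ALPHA = 0.125  # typical value from the slide


def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=ALPHA):
    """One EWMA step: EstimatedRTT = (1 - alpha)*EstimatedRTT + alpha*SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def sample_weights(n, alpha=ALPHA):
    """Weights of the last n samples, newest first: alpha * (1 - alpha)**k."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


estimate = 100.0  # ms, hypothetical starting estimate
for sample in [120.0, 80.0, 200.0]:  # ms, made-up samples
    estimate = update_estimated_rtt(estimate, sample)

weights = sample_weights(3)
```

With α = 0.125, the newest sample has weight 0.125 and each older sample counts 0.875 times as much as the one after it.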

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; SampleRTT and EstimatedRTT plotted as RTT (milliseconds) against time (seconds)]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate of how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

  (typically, β = 0.25)

Then set timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
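The three formulas can be combined into one small estimator. The Python sketch below makes two assumptions the slides do not state: the estimate is seeded with the first sample with DevRTT starting at 0, and DevRTT is updated before EstimatedRTT, so the deviation is measured against the previous estimate. The class name `RttEstimator` is illustrative.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample  # assumption: seed with first sample
        self.dev_rtt = 0.0                 # assumption: no deviation known yet

    def update(self, sample_rtt):
        # Assumption: deviation is measured against the previous estimate.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (
            (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        )

    @property
    def timeout_interval(self):
        # EstimatedRTT plus a safety margin of four deviations.
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)  # ms, made-up value
est.update(120.0)                       # ms, made-up value
```

After the single update, EstimatedRTT is 102.5 ms, DevRTT is 5 ms and TimeoutInterval is 122.5 ms.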


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum

  loop (forever) {
    switch(event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
  } /* end of loop forever */

Comment: SendBase-1 is the last cumulatively ACKed byte
Example: SendBase-1 = 71, y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
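The pseudocode above can be exercised as a small Python model. This is a sketch under stated assumptions, not a real TCP implementation: the timer is a boolean flag, "pass segment to IP" appends to a list, and the timer is stopped when nothing is left unacknowledged. The class name `SimplifiedTcpSender` and its attributes are illustrative.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified sender: one timer, cumulative ACKs."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq -> data, sent but not yet ACKed
        self.timer_running = False  # stand-in for the retransmission timer
        self.sent = []              # log of (seq, data) "passed to IP"

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)     # oldest unacknowledged segment
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True   # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Keep only segments not fully covered by the cumulative ACK.
            self.unacked = {
                s: d for s, d in self.unacked.items() if s + len(d) > y
            }
            # Assumption: timer runs only while data is outstanding.
            self.timer_running = bool(self.unacked)


# Lost-ACK style run: send 8 bytes at Seq=92, time out, then receive ACK=100.
sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"x" * 8)
sender.on_timeout()
sender.on_ack(100)
```

The segment with Seq=92 is logged twice (original plus retransmission), and SendBase ends at 100.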

TCP retransmission scenarios

[Figure: two timelines between Host A and Host B, time running downward]

Lost ACK scenario:
  Host A sends Seq=92, 8 bytes data
  Host B replies ACK=100, which is lost (X)
  Seq=92 timeout at Host A: A retransmits Seq=92, 8 bytes data
  Host B replies ACK=100
  SendBase = 100

Premature timeout scenario:
  Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  Host B replies ACK=100, then ACK=120
  Seq=92 timeout at Host A before the ACKs arrive: A retransmits Seq=92, 8 bytes data
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  Host B replies ACK=120 to the retransmission; SendBase stays at 120

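TCP retransmission scenarios (more)

[Figure: timeline between Host A and Host B, time running downward]

Cumulative ACK scenario:
  Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  Host B replies ACK=100, which is lost (X), then ACK=120
  ACK=120 arrives before the timeout, so nothing is retransmitted
  SendBase = 120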

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK; wait up to 500 ms for next segment; if no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
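The four rows of the table can be read as a decision function. The Python sketch below only classifies an arriving segment; it does not buffer data or run the 500 ms timer. The function name, parameter names and return strings are illustrative, and segments below the expected seq # are outside the table, so the sketch rejects them.

```python
def ack_action(seg_seq, expected_seq, ack_pending, gap_exists):
    """Pick the receiver action for an arriving segment.

    seg_seq:      seq # of the first byte in the arriving segment
    expected_seq: seq # of the next in-order byte the receiver expects
    ack_pending:  True if an earlier in-order segment still awaits its ACK
    gap_exists:   True if out-of-order data is already buffered
    """
    if seg_seq > expected_seq:
        # Out of order: tell the sender which byte is still missing.
        return "send duplicate ACK immediately"
    if seg_seq == expected_seq:
        if gap_exists:
            # Segment starts at the lower end of the gap.
            return "send ACK immediately"
        if ack_pending:
            return "send cumulative ACK immediately"
        return "delay ACK up to 500 ms"
    raise ValueError("segment below expected seq # is not covered by the table")
```

For example, `ack_action(100, 92, False, False)` describes a segment arriving past a hole at byte 92 and yields the duplicate-ACK action.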

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
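A minimal Python sketch of this policy, with event names (`"timeout"`, `"new_data"`, `"ack"`) chosen for illustration and all times in seconds:

```python
def next_timeout(current_timeout, event, estimated_rtt, dev_rtt):
    """Return the timeout interval to use after the given timer event."""
    if event == "timeout":
        # Back off: each consecutive timeout doubles the interval.
        return 2 * current_timeout
    if event in ("new_data", "ack"):
        # Timer restarted for another reason: recompute from the estimates.
        return estimated_rtt + 4 * dev_rtt
    raise ValueError(f"unknown event: {event}")


timeout = 1.0
history = []
for event in ["timeout", "timeout", "timeout", "ack"]:
    timeout = next_timeout(timeout, event, estimated_rtt=0.5, dev_rtt=0.125)
    history.append(timeout)
```

Three consecutive timeouts take the interval from 1 s to 2 s, 4 s and 8 s; the ACK then resets it to EstimatedRTT + 4 * DevRTT = 1 s.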

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
    else {   /* a duplicate ACK for already ACKed segment */
      increment count of dup ACKs received for y
      if (count of dup ACKs received for y = 3) {
        resend segment with sequence number y   /* fast retransmit */
      }
    }
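The duplicate-ACK branch can be sketched in Python as follows. Timer handling is left out, and the class name `FastRetransmitSender` and the `resent` log are illustrative rather than part of the slides.

```python
from collections import Counter


class FastRetransmitSender:
    """Counts duplicate ACKs and records fast retransmissions."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()  # ACK value y -> duplicates seen so far
        self.resent = []           # sequence numbers fast-retransmitted

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y     # new data ACKed (timer handling omitted)
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3:
                self.resent.append(y)  # fast retransmit of segment y


sender = FastRetransmitSender(send_base=92)
for y in [100, 100, 100, 100, 100]:
    sender.on_ack(y)
```

The first ACK=100 advances SendBase; the next three are duplicates and trigger one retransmission of the segment starting at byte 100. The fifth ACK does not trigger another, because the count is then 4.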

TCP: GBN or Selective Repeat?

                                                                          Basic TCP looks a lot like GBN

                                                                          Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                          This looks a lot like Selective Repeat

                                                                          TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, rows listed top to bottom]
  source port # | dest port #
  sequence number (counting by bytes of data, not segments)
  acknowledgement number (counting by bytes of data, not segments)
  head len | not used | flags U A P R S F | Receive window (# bytes rcvr willing to accept)
  checksum (Internet checksum, as in UDP) | Urg data pointer
  Options (variable length)
  application data (variable length)

Flags:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
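A short Python sketch of both sides of this rule. The receiver formula is the one above; the sender-side names `last_byte_sent` and `last_byte_acked` are assumptions introduced here to express "unACKed data", and the byte counts are made up.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, n_bytes, window):
    """True if sending n_bytes more keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_bytes <= window


# Receiver: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender: 1000 bytes already in flight.
ok_small = sender_may_send(5000, 4000, n_bytes=1000, window=window)
ok_large = sender_may_send(5000, 4000, n_bytes=1200, window=window)
```

The advertised window is 2096 bytes, so another 1000 bytes fit but another 1200 would exceed the window.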

Technical Issue

Suppose RcvWindow = 0 and that receiver has already ACKed ALL packets in buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to sender
DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender

Note on UDP

UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, port number)
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept()

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that it is synchronized

[Figure: three-way handshake between client and server]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
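The field values exchanged in the three steps can be written out as a minimal Python sketch. The dictionary layout and the function name `three_way_handshake` are illustrative, and the initial sequence numbers in the usage line are made up.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dicts of header fields."""
    # Step 1: client -> server, connection request
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server -> client, connection granted; ACKs the client's SYN
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client -> server; ACKs the server's SYN, SYN flag cleared
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


syn, synack, ack = three_way_handshake(client_isn=1000, server_isn=5000)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm they are synchronized.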

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close()

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client-server timeline. client (close) -> server: FIN; server -> client: ACK; server (close) -> client: FIN; client -> server: ACK; client enters timed wait, then closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: client-server timeline. client (closing) -> server: FIN; server -> client: ACK; server (closing) -> client: FIN; client -> server: ACK; client enters timed wait, then closed; server closed]
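The client side of the four closing steps can be sketched as a small state table in Python. The state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) are the standard TCP state names rather than labels from these slides, and the event names and the `run` helper are illustrative.

```python
# (state, event) -> (next state, action to take or None)
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "send FIN"),
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "send ACK"),
    # A repeated FIN means our ACK was lost: ACK again, stay in timed wait.
    ("TIME_WAIT", "recv_FIN"): ("TIME_WAIT", "send ACK"),
    ("TIME_WAIT", "timer_expired"): ("CLOSED", None),
}


def run(transitions, state, events):
    """Feed events through the table; return final state and actions taken."""
    actions = []
    for event in events:
        state, action = transitions[(state, event)]
        if action is not None:
            actions.append(action)
    return state, actions


final_state, actions = run(
    CLIENT_TRANSITIONS,
    "ESTABLISHED",
    ["app_close", "recv_ACK", "recv_FIN", "recv_FIN", "timer_expired"],
)
```

The second `recv_FIN` models a retransmitted FIN during timed wait; the client ACKs it again before the timer expires and the connection closes.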

TCP Connection Management (cont)

[Figure: example TCP client lifecycle]

[Figure: example TCP server lifecycle]


                                                                          A few special cases

                                                                          Have not discussed what happens if both client and server decide to close down connection at same time

                                                                          It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

                                                                          3 Transport Layer 92Comp 361 Spring 2005

                                                                          Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queueing in router buffers)
  a top-10 problem

Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput

Causes/costs of congestion: scenario 2
  one router, finite buffers
  sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "Magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.

"costs" of congestion:
  (b) and (c): more work (retransmissions) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of a packet

[Figure: panels (a), (b), (c)]

Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted.

                                                                          Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    (small exception: see next page)

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI = 1, destination sets CI bit = 1 before returning RM cell to source

TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

$$\text{throughput} = \frac{w \cdot MSS}{RTT}\ \text{bytes/sec}$$
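
As a quick numeric check of the throughput relation above, here is a minimal Python sketch. The function name and the sample values (10 segments, 1460-byte MSS, 100 ms RTT) are illustrative assumptions, not values from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w_segments of mss_bytes each are sent once per RTT."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return w_segments * mss_bytes / rtt_seconds

# Hypothetical values chosen only for illustration.
rate = window_throughput(w_segments=10, mss_bytes=1460, rtt_seconds=0.1)
print(f"{rate:.0f} bytes/sec")
```

Doubling the window or halving the RTT doubles the rate, which is why the congestion window is the knob TCP turns to control its sending rate.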

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
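
The sending constraint LastByteSent - LastByteAcked ≤ CongWin can be sketched as a simple check made before each transmission. This is only an illustration: the function name and the `nbytes` parameter are assumptions, and the receive window is ignored, as the slide assumes.

```python
def can_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """True if sending nbytes more keeps unacknowledged data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked   # bytes sent but not yet ACKed
    return in_flight + nbytes <= cong_win

# With 1000 bytes in flight and CongWin = 2000, one more 1000-byte segment fits.
print(can_send(last_byte_sent=1000, last_byte_acked=0, cong_win=2000, nbytes=1000))
```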

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
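
The AIMD rule can be traced with a short simulation. This is a sketch under simplifying assumptions: the window is measured in units of MSS, loss events are supplied by hand as a set of RTT indices, and the function name and starting window are hypothetical.

```python
def aimd_trace(rtts, loss_rtts, mss=1, start=8):
    """CongWin (in units of MSS) at the start of each RTT under pure AIMD."""
    cong_win = start
    trace = []
    for t in range(rtts):
        trace.append(cong_win)
        if t in loss_rtts:
            cong_win = max(mss, cong_win / 2)   # multiplicative decrease: halve
        else:
            cong_win += mss                      # additive increase: +1 MSS per RTT
    return trace

# One loss event during RTT 3 produces a single tooth of the sawtooth.
print(aimd_trace(rtts=8, loss_rtts={3}))   # [8, 9, 10, 11, 5.5, 6.5, 7.5, 8.5]
```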

                                                                          TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
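
The 20 kbps figure follows from 500 bytes × 8 bits / 0.2 s. The sketch below reproduces it and, as an illustration of why exponential ramp-up matters, counts how many RTTs of doubling are needed to reach a target rate. The 1 Mbps target and both function names are assumptions made for this example.

```python
def initial_rate_bps(mss_bytes, rtt_s):
    """Rate in bits/sec when exactly one MSS is sent per RTT."""
    return mss_bytes * 8 / rtt_s

def rtts_to_reach(target_bps, mss_bytes, rtt_s):
    """RTTs of doubling needed before the per-RTT rate reaches target_bps."""
    segments, rtts = 1, 0
    while segments * mss_bytes * 8 / rtt_s < target_bps:
        segments *= 2      # window doubles every RTT during slow start
        rtts += 1
    return rtts

print(initial_rate_bps(500, 0.2))            # about 20000 bits/sec, i.e. 20 kbps
print(rtts_to_reach(1_000_000, 500, 0.2))    # 6
```

With additive increase alone, the same target would need a window of 50 segments, i.e. 49 RTTs of growth instead of 6.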

                                                                          TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline: one segment sent in the first RTT, two segments in the second, four segments in the third]
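
To see why a per-ACK increment doubles the window each RTT, consider this small sketch. It assumes one ACK per segment (no delayed ACKs) and no loss; the function name is hypothetical.

```python
def slow_start_rounds(rounds, mss=1):
    """CongWin (in MSS) at the start of each RTT, adding 1 MSS per ACK received."""
    cong_win = mss
    history = []
    for _ in range(rounds):
        history.append(cong_win)
        acks = cong_win // mss        # one ACK per segment sent this RTT (assumed)
        for _ in range(acks):
            cong_win += mss           # per-ACK increment
    return history

print(slow_start_rounds(4))   # [1, 2, 4, 8]
```

Each RTT delivers as many ACKs as there were segments in the window, so the window grows by its own size, matching the one, two, four segments pattern in the figure.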

So far:
  Slow-Start ramps up exponentially
  Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
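
The difference between Tahoe and Reno can be summarized as one small decision function. This is a sketch of the rules as described on these slides, not of any real TCP stack: the function name, the string labels, and the floor of 1 MSS on the threshold are assumptions for illustration.

```python
def on_loss(variant, event, cong_win, mss=1):
    """Return (new_cong_win, new_threshold) after a loss event.

    variant: "tahoe" or "reno"; event: "timeout" or "3dupack".
    """
    threshold = max(cong_win / 2, mss)       # both variants halve the threshold
    if variant == "reno" and event == "3dupack":
        return threshold, threshold          # fast recovery: skip slow start
    return mss, threshold                    # restart from 1 MSS in slow start

print(on_loss("reno", "3dupack", 16))    # (8.0, 8.0)
print(on_loss("tahoe", "3dupack", 16))   # (1, 8.0)
```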

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                          The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
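
The event/state/action table can be read as a small state machine. The sketch below is a toy model of that table only: the class name, the byte-based units, the default threshold, and the 1 MSS floor are assumptions, and real TCP implementations track considerably more state.

```python
class RenoSender:
    """Toy model of the table: tracks state, CongWin and Threshold in bytes."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.cong_win = mss          # connection starts in slow start with 1 MSS
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        self.dup_acks = 0
        self.threshold = max(self.cong_win / 2, self.mss)
        self.cong_win = self.mss
        self.state = "SS"

s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win)   # CA 5000
```

In the usage lines, the fourth ACK pushes CongWin past the threshold, so the sender leaves slow start; from then on each ACK adds only MSS²/CongWin bytes.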

                                                                          TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2 RTT)
  Average throughput: 0.75 W/RTT
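
The 0.75 factor is the mean of a window that climbs linearly from W/2 to W. A minimal check, assuming an even W and one window sample per RTT (the function name is hypothetical):

```python
def average_window(w_max):
    """Mean window over one sawtooth cycle that climbs from W/2 to W by 1 per RTT."""
    low = w_max // 2
    samples = list(range(low, w_max + 1))   # W/2, W/2 + 1, ..., W
    return sum(samples) / len(samples)

w = 100
print(average_window(w) / w)   # 0.75
```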

                                                                          TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT\,\sqrt{L}}$$

→ $L = 2 \cdot 10^{-10}$   Wow
New versions of TCP for high-speed needed
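
Both numbers on this slide can be reproduced from the formulas already given: W = throughput × RTT / MSS, and L obtained by inverting throughput = 1.22·MSS/(RTT·√L). The function names below are illustrative.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain throughput_bps over one RTT."""
    return throughput_bps * rtt_s / (mss_bytes * 8)

def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2

print(round(required_window(10e9, 0.1, 1500)))        # 83333
print(f"{required_loss_rate(10e9, 0.1, 1500):.1e}")   # 2.1e-10
```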

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
  additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) versus Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
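
The convergence argument can be checked numerically. This sketch makes strong simplifying assumptions that are not in the slides: both flows have the same RTT, both see every loss at the same time, and a loss occurs exactly when the combined rate exceeds the link capacity. The function name and starting rates are hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Throughputs of two flows sharing one link under synchronized AIMD."""
    for _ in range(rounds):
        if x1 + x2 > capacity:          # link overloaded: both flows see a loss
            x1, x2 = x1 / 2, x2 / 2     # multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1     # additive increase, slope 1
    return x1, x2

x1, x2 = two_flow_aimd(x1=80, x2=10, capacity=100, rounds=500)
print(abs(x1 - x2))   # gap between the flows shrinks towards 0
```

Additive increase leaves the gap between the two flows unchanged, while each halving cuts it in half, so repeated cycles drive the flows toward the equal-share line.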

Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
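
The arithmetic behind the parallel-connection example, assuming the link is divided equally among all connections (the function name is illustrative):

```python
from fractions import Fraction

def app_share(app_connections, other_connections):
    """Fraction of link rate R an app gets if every connection gets an equal share."""
    return Fraction(app_connections, app_connections + other_connections)

print(app_share(1, 9))    # 1/10
print(app_share(11, 9))   # 11/20
```

An app that opens 11 connections next to 9 existing ones holds 11 of 20 equal shares, slightly more than half of R.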

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

                                                                          Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

                                                                          Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
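
Both fixed-window cases fit in one function. This is a sketch that uses the slide symbols O, S, R, RTT and W as parameter names; taking K as ceil(O / (W·S)), the number of windows that cover the object, is an assumption, as are the sample numbers.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits, MSS of S bits, link rate R bps, window of W segments."""
    stall = S / R + RTT - W * S / R     # idle time per window when the ACK arrives late
    base = 2 * RTT + O / R
    if stall <= 0:                      # case 1: ACK returns before the window is exhausted
        return base
    K = math.ceil(O / (W * S))          # assumed: number of windows covering the object
    return base + (K - 1) * stall       # case 2: server stalls after each window but the last

# Hypothetical numbers: 100 kbit object, 1 kbit segments, 1 Mbps link, 100 ms RTT.
print(f"{fixed_window_latency(100_000, 1000, 1_000_000, 0.1, W=4):.3f}")    # 2.628
print(f"{fixed_window_latency(100_000, 1000, 1_000_000, 0.1, W=200):.3f}")  # 0.300
```

With W = 4 the server idles after each of the first 24 windows, so stalls dominate the latency; with W = 200 the pipe stays full and only the 2 RTT setup plus the transmission time remain.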

                                                                          TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K - 1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

                                                                          TCP Latency Modeling Slow Start (2)

[Figure: timeline at client and at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times
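
The three delay components can be added up directly by walking through the windows, which avoids needing Q in closed form. This is a sketch: the function name is hypothetical, and the sample values (S/R = 1 s, RTT = 2 s) are unrealistic numbers chosen only so that Q = 2 as in the example.

```python
import math

def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, P) under slow start with no loss and no threshold."""
    segments = math.ceil(O / S)
    K, covered = 0, 0
    while covered < segments:            # K = number of windows covering the object
        covered += 2 ** K
        K += 1
    idle, P = 0.0, 0
    for k in range(1, K):                # server can only idle after windows 1..K-1
        gap = S / R + RTT - (2 ** (k - 1)) * S / R
        if gap > 0:                      # ACK arrives after the kth window is sent
            idle += gap
            P += 1
    return 2 * RTT + O / R + idle, K, P

# O/S = 15 segments; S/R = 1 s and RTT = 2 s are assumed so that the server idles twice.
print(slow_start_latency(O=15_000, S=1000, R=1000, RTT=2))   # (22.0, 4, 2)
```

The same total follows from the closed form with P = 2: 2·2 + 15 + 2·(2 + 1) - (2² - 1)·1 = 22.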

                                                                          TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timeline with windows of S/R, 2S/R, 4S/R and 8S/R, from connection initiation to object delivered]


TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
\]

Calculation of Q, number of idles for infinite-size object, is similar.
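The calculation of K, and the slow-start latency it feeds into, can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, O and S are in bits, R is in bits per second, and the latency sums the idle time after each of the first K-1 windows, clipped at zero to match the [.]^+ notation.

```python
import math


def windows_to_cover(O: int, S: int) -> int:
    """Smallest k with S * (2**k - 1) >= O, found by direct search."""
    k = 0
    covered = 0  # bits covered by the first k windows
    while covered < O:
        covered += S * 2 ** k  # window k+1 carries 2**k segments
        k += 1
    return k


def windows_closed_form(O: int, S: int) -> int:
    """Closed form: K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


def slow_start_latency(O: int, S: int, R: float, rtt: float) -> float:
    """2*RTT + O/R + idle time after each of the first K-1 windows."""
    K = windows_to_cover(O, S)
    idle = sum(
        max(0.0, S / R + rtt - 2 ** (k - 1) * S / R)  # [.]^+ : never negative
        for k in range(1, K)
    )
    return 2 * rtt + O / R + idle
```

With O = 15 segments, S/R = 1 sec and RTT = 2 sec, the server idles after the first two windows only, and the result agrees with the closed form O/R + 2RTT + P[RTT + S/R] - (2^P - 1)S/R for P = 2.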


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
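The three response-time formulas can be compared side by side. The sketch below is a simplification that leaves out the "sum of idle times" term, so it gives only the transmission and round-trip parts; the function names are hypothetical, O is in bits, R in bits per second and RTT in seconds.

```python
def transmission_time(M: int, O: float, R: float) -> float:
    """Time to push the base page plus M images onto the link."""
    return (M + 1) * O / R


def non_persistent(M: int, O: float, R: float, rtt: float) -> float:
    """M+1 connections in series, 2 RTT each (idle times omitted)."""
    return transmission_time(M, O, R) + (M + 1) * 2 * rtt


def persistent(M: int, O: float, R: float, rtt: float) -> float:
    """2 RTT for the base file, 1 RTT for all M images (idle times omitted)."""
    return transmission_time(M, O, R) + 3 * rtt


def parallel_non_persistent(M: int, X: int, O: float, R: float, rtt: float) -> float:
    """1 connection for the base file, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("this model assumes M/X is an integer")
    return transmission_time(M, O, R) + (M // X + 1) * 2 * rtt


if __name__ == "__main__":
    # Illustrative values: M = 10, X = 5, RTT = 100 msec, O/R = 1 sec.
    for name, t in [
        ("non-persistent", non_persistent(10, 1000, 1000, 0.1)),
        ("persistent", persistent(10, 1000, 1000, 0.1)),
        ("parallel non-persistent", parallel_non_persistent(10, 5, 1000, 1000, 0.1)),
    ]:
        print(f"{name}: {t:.1f} s")
```

The transmission term is identical in all three cases; the variants differ only in how many round trips they pay for.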


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN: Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary

rdt3.0 in action

[Figure: two slides titled "rdt3.0 in action"; the diagrams are not recoverable from the transcript.]

Performance of rdt3.0

rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet

\[ T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec} \]

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \]

U_sender: utilization, the fraction of time the sender is busy sending
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
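The arithmetic behind these figures is easy to reproduce. The sketch below assumes, as the example does, a 1KB packet of 8000 bits and an RTT of 30 msec (twice the 15 ms propagation delay); the function names are hypothetical.

```python
def stop_and_wait_utilization(L_bits: float, R_bps: float, rtt_s: float) -> float:
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = L_bits / R_bps
    return t_transmit / (rtt_s + t_transmit)


def stop_and_wait_throughput(L_bits: float, R_bps: float, rtt_s: float) -> float:
    """Bits per second delivered when one packet is sent per RTT + L/R."""
    return L_bits / (rtt_s + L_bits / R_bps)


if __name__ == "__main__":
    L_bits, R_bps, rtt_s = 8000, 1e9, 0.030
    u = stop_and_wait_utilization(L_bits, R_bps, rtt_s)
    kbytes_per_sec = stop_and_wait_throughput(L_bits, R_bps, rtt_s) / 8 / 1000
    print(f"U_sender = {u:.5f}, throughput = {kbytes_per_sec:.1f} kB/sec")
```

The throughput depends almost entirely on the RTT here: raising the link rate R barely changes it, because the sender spends nearly the whole cycle waiting for the ACK.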

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram. First packet bit transmitted, t = 0; last packet bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R.]

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \]


Pipelined protocols

Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver


Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
  go-Back-N protocols
  selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight. First packet bit transmitted, t = 0; last bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R.]

\[ U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008 \]

Increase utilization by a factor of 3!
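The same formula generalizes to any number of packets in flight. The sketch below is an illustration under the example's numbers (8000-bit packets, 1 Gbps, RTT = 30 msec). The helper `window_to_fill_pipe` is a hypothetical addition that asks how many in-flight packets would keep the sender busy all the time; it uses integer arithmetic to avoid rounding surprises.

```python
def pipelined_utilization(n_packets: int, L_bits: float, R_bps: float, rtt_s: float) -> float:
    """Utilization with n packets in flight per cycle, capped at 1.0."""
    t_transmit = L_bits / R_bps
    return min(1.0, n_packets * t_transmit / (rtt_s + t_transmit))


def window_to_fill_pipe(L_bits: int, R_bps: int, rtt_ms: int) -> int:
    """Smallest n with n * L/R >= RTT + L/R, computed on whole bits."""
    bits_per_rtt = R_bps * rtt_ms // 1000      # bits the link carries in one RTT
    total = bits_per_rtt + L_bits              # bits in one full RTT + L/R cycle
    return -(-total // L_bits)                 # ceiling division


if __name__ == "__main__":
    print(f"{pipelined_utilization(3, 8000, 1e9, 0.030):.4f}")
    print(window_to_fill_pipe(8000, 10**9, 30))
```

Three packets in flight triple the utilization, but it remains far below 1 on this link; thousands of packets would have to be in flight to fill the pipe.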


Go-Back-N

Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed

  ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
    may receive duplicate ACKs (see receiver)

  Only one timer: for oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol


GBN: Sender

rdt_send() called: checks to see if window is full.
  No: send out packet
  Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.

This is the only event that triggers a resend.


GBN: sender extended FSM

Initial transition:
  Λ
    base=1
    nextseqnum=1

State: Wait

rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ
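The sender FSM translates fairly directly into code. The sketch below is a minimal illustration, not a full implementation. The class name, the `send` callback (standing in for udt_send) and the boolean `timer_running` (standing in for a real timer) are assumptions, and checksums are omitted. Unlike the FSM, it ignores an ACK that would move base backwards.

```python
class GBNSender:
    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send            # callable standing in for udt_send
        self.base = 1               # oldest unacknowledged seq #
        self.nextseqnum = 1         # next seq # to be assigned
        self.sndpkt = {}            # seq # -> packet, kept until ACKed
        self.timer_running = False  # stand-in for start_timer / stop_timer

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum >= self.base + self.N:
            return False                      # window full: refuse_data
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True         # timer covers the oldest packet
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is received."""
        if acknum + 1 <= self.base:
            return                            # duplicate ACK: nothing new
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        # Stop the timer if nothing is outstanding, otherwise restart it.
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(self.sndpkt[seq])
```

A single timeout resends the whole outstanding window, which is why one loss can be expensive when many packets are in flight.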


GBN: receiver extended FSM

Initial transition:
  Λ
    expectedseqnum=1
    sndpkt = make_pkt(0,ACK,chksum)

State: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++

default
    udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs


More on receiver

The receiver always sends ACK for last correctly received packet with highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  need only remember expectedseqnum
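Because the receiver only needs to remember expectedseqnum, it fits in a few lines. The sketch below is an illustration under simplifying assumptions: the class and method names are hypothetical, corruption is passed in as a flag instead of being detected by a checksum, and the method returns the ACK number instead of sending a packet.

```python
class GBNReceiver:
    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []   # data handed to the upper layer, in order

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one packet and return the ACK number to send back."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)       # deliver_data
            self.expectedseqnum += 1
        # Otherwise the packet is discarded: no receiver buffering.
        # Always ACK the highest in-order seq # seen so far (0 if none yet).
        return self.expectedseqnum - 1


if __name__ == "__main__":
    r = GBNReceiver()
    # Packet 2 is delayed: 3 and 4 are discarded and trigger duplicate ACKs.
    for seq, data in [(1, "a"), (3, "c"), (4, "d"), (2, "b")]:
        print(f"pkt {seq} -> ACK {r.rdt_rcv(seq, data)}")
```

In this run packets 3 and 4 are thrown away even though they arrived intact, so the sender has to retransmit them after its timeout.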


GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data!

Selective Repeat protocol allows receiver to buffer data and only forces retransmission of required packets.


                                                                            Selective Repeat

                                                                            receiver individually acknowledges all correctly received pkts

                                                                            buffers pkts as needed for eventual in-order delivery to upper layer

                                                                            sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
Compare to GBN, which only had a timer for the base packet

sender window
N consecutive seq #s
again limits seq #s of sent, unACKed pkts
Important: window size < seq # range


Selective repeat: sender, receiver windows


                                                                            Selective repeat

sender

data from above:
if next available seq # in window, send pkt

timeout(n):
resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
mark pkt n as received
if n is smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
send ACK(n)
out-of-order: buffer
in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
ACK(n) (note: this is a re-ACK)

otherwise:
ignore
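
The receiver rules above can be sketched as a small class. This is a toy model under stated assumptions: sequence numbers are plain integers that never wrap, and the names `SRReceiver`, `on_packet`, `rcv_base` and `delivered` are hypothetical, not taken from the slides.

```python
class SRReceiver:
    """Toy Selective Repeat receiver (assumes sequence numbers never wrap)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcv_base = 0      # smallest not-yet-received sequence number
        self.buffer = {}       # out-of-order packets, keyed by seq number
        self.delivered = []    # data handed to the upper layer, in order

    def on_packet(self, n, data):
        """Return the seq number to ACK, or None if the packet is ignored."""
        if self.rcv_base <= n < self.rcv_base + self.N:
            self.buffer.setdefault(n, data)
            # Deliver the in-order run and slide the window forward.
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1
            return n
        if self.rcv_base - self.N <= n < self.rcv_base:
            return n           # already delivered: re-ACK
        return None            # otherwise: ignore


rx = SRReceiver(window_size=4)
# Packet 2 arrives before packet 1; packet 0 is later duplicated; 9 is out of range.
acks = [rx.on_packet(n, f"pkt{n}") for n in (0, 2, 1, 0, 9)]
```

Here packet 2 is buffered until packet 1 arrives, the duplicate of packet 0 is re-ACKed without being delivered twice, and packet 9 is ignored.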


                                                                            Selective repeat in action


Selective repeat: dilemma

Example: seq #s 0, 1, 2, 3; window size = 3

receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
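
One way to explore the question is to check the worst case directly: the receiver has accepted a full window and advanced, but every ACK was lost, so the sender may still resend the old window. The sketch below is a hedged illustration with a hypothetical helper name, `sr_window_is_safe`. It is consistent with the usual textbook answer that the window size must be at most half the sequence number space.

```python
def sr_window_is_safe(seq_space, window):
    """True if a resent old packet can never look like a new one."""
    # Seq numbers the sender may still retransmit (its ACKs were all lost).
    old = {n % seq_space for n in range(0, window)}
    # Seq numbers the receiver now accepts as new data.
    new = {n % seq_space for n in range(window, 2 * window)}
    return old.isdisjoint(new)


slide_example = sr_window_is_safe(seq_space=4, window=3)   # the example above
half_window = sr_window_is_safe(seq_space=4, window=2)
```

With 4 sequence numbers, a window of 3 lets old numbers 0 and 1 reappear in the receiver's new window, while a window of 2 does not.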


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point: one sender, one receiver

reliable, in-order byte stream: no "message boundaries"

pipelined: TCP congestion and flow control set window size

send & receive buffers

full duplex data: bi-directional data flow in same connection
MSS: maximum segment size

connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange

flow controlled: sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]


More TCP Details

Maximum Segment Size (MSS)
Depends upon implementation (can often be set)
The max amount of application-layer data in a segment
Application Data + TCP Header = TCP Segment

Three-way handshake
Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
Server responds with second special TCP segment (again no payload)
Client responds with third special segment. This can contain payload


                                                                            Even More TCP Details

A TCP connection between client and server creates, in both client and server,
(i) buffers,
(ii) variables, and
(iii) a socket connection to process

TCP only exists in the two end machines
No buffers and variables are allocated to the connection in any of the network elements between the host and server


                                                                            TCP segment structure

[Figure: TCP segment layout, 32 bits wide; fields from top to bottom]

source port #, dest port #
sequence number
acknowledgement number
head len, not used, flag bits U A P R S F, Receive window
checksum, Urg data pnter
Options (variable length)
application data (variable length)

Field notes:
sequence number, acknowledgement number: counting by bytes of data (not segments)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
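
As a rough illustration of the layout above, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. This is a sketch, not a full parser: it skips option parsing and checksum verification. `parse_tcp_header`, `FLAG_BITS` and the sample values are hypothetical names and data chosen for this example.

```python
import struct

# Fixed 20-byte TCP header in network byte order:
# ports (2+2), seq (4), ack (4), offset byte, flag byte, window, checksum, urgent ptr.
TCP_HEADER = struct.Struct("!HHIIBBHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment):
    """Split a raw TCP segment into header fields and application data."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     rcv_window, checksum, urg_ptr) = TCP_HEADER.unpack_from(segment)
    header_len = (offset_byte >> 4) * 4   # head len is counted in 32-bit words
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "rcv_window": rcv_window,
        "data": segment[header_len:],     # also skips any options
    }


# Made-up segment: checksum left at 0 (not computed), flags PSH+ACK, payload 'C'.
sample = TCP_HEADER.pack(12345, 23, 42, 79, 5 << 4, 0x18, 4096, 0, 0) + b"C"
hdr = parse_tcp_header(sample)
```

The header length comes from the upper four bits of the offset byte, so a segment with options would simply report a larger `header_len`.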


TCP seq #s and ACKs

Seq #s:
byte stream "number" of first byte in segment's data

ACKs:
seq # of next byte expected from other side
cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (Host A and Host B, time runs downward):
A -> B: Seq=42, ACK=79, data = 'C' (user types 'C')
B -> A: Seq=79, ACK=43, data = 'C' (host ACKs receipt of 'C', echoes back 'C')
A -> B: Seq=43, ACK=80 (host ACKs receipt of echoed 'C')
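
The numbers in the telnet scenario follow from one rule: the ACK field carries the sequence number of the next byte expected. A minimal sketch, with `ack_for` as a hypothetical helper name:

```python
def ack_for(seq, data):
    """Cumulative ACK value: seq number of the next byte expected."""
    return seq + len(data)


# Host A sends 'C' with Seq=42; Host B echoes it with Seq=79.
ack_sent_by_b = ack_for(42, b"C")   # B acknowledges A's byte 42
ack_sent_by_a = ack_for(79, b"C")   # A acknowledges B's echoed byte 79
```

Each side's next segment also starts at the value the other side just ACKed, which is why A's final segment carries Seq=43.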


                                                                            TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
longer than RTT, but RTT varies
too short: premature timeout, unnecessary retransmissions
too long: slow reaction to segment loss

Q: how to estimate RTT?
SampleRTT: measured time from segment transmission until ACK receipt
ignore retransmissions
SampleRTT will vary; want estimated RTT "smoother"
average several recent measurements, not just current SampleRTT


                                                                            TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
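
The update rule and the "exponentially fast" decay can be checked numerically. In this sketch the estimate is seeded with the first sample, which is an assumption (the slides do not say how the estimate is initialized), and `ewma` and `sample_weight` are hypothetical helper names.

```python
ALPHA = 0.125


def ewma(samples, alpha=ALPHA):
    """Apply EstimatedRTT = (1 - alpha)*EstimatedRTT + alpha*SampleRTT in turn."""
    estimated = samples[0]            # assumption: seed with the first sample
    for sample in samples[1:]:
        estimated = (1 - alpha) * estimated + alpha * sample
    return estimated


def sample_weight(age, alpha=ALPHA):
    """Weight in the current estimate of the sample taken `age` updates ago."""
    return alpha * (1 - alpha) ** age


after_spike = ewma([100.0, 200.0])    # one 200 ms sample after a 100 ms estimate
weights = [sample_weight(k) for k in range(3)]
```

A single outlier moves the estimate by only one eighth of the gap, and each older sample counts 0.875 times as much as the one after it.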


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; two curves: SampleRTT and EstimatedRTT]


                                                                            TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
large variation in EstimatedRTT -> larger safety margin
first, estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 * DevRTT
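
The two estimators and the timeout formula combine into a small stateful object. The slides do not specify initial values or update order, so this sketch borrows both from RFC 6298 as labeled assumptions: the first sample seeds EstimatedRTT, DevRTT starts at half that sample, and DevRTT is updated using the old EstimatedRTT. `RttEstimator` is a hypothetical class name.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample      # assumption (RFC 6298 style seed)
        self.dev_rtt = first_sample / 2        # assumption (RFC 6298 style seed)

    def update(self, sample_rtt):
        # Assumption: deviation is measured against the estimate before it moves.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
est.update(200.0)                        # one slow sample
timeout_ms = est.timeout_interval
```

After the slow sample the safety margin grows much faster than the average itself, which is the intended effect of the 4 * DevRTT term.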


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
timeout events
duplicate acks

Initially consider simplified TCP sender:
ignore duplicate acks
ignore flow control, congestion control


TCP sender events

data rcvd from app:
Create segment with seq #
seq # is byte-stream number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval: TimeOutInterval

timeout:
retransmit segment that caused timeout
restart timer

Ack rcvd:
If it acknowledges previously unacked segments:
update what is known to be acked
start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
      switch(event)

      event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
          start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

      event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

      event: ACK received, with ACK field value of y
        if (y > SendBase) {
          SendBase = y
          if (there are currently not-yet-acknowledged segments)
            start timer
        }

    } /* end of loop forever */

Comment:
SendBase-1: last cumulatively ack'ed byte
Example:
SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
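
The simplified sender above translates fairly directly into event handlers. This is a minimal sketch, not a real TCP: the timer is only a boolean flag, `send_to_ip` is a hypothetical callback standing in for the network layer, and the `unacked` dictionary is an assumed bookkeeping structure that the pseudocode leaves implicit.

```python
class SimplifiedTcpSender:
    """Event handlers mirroring the simplified sender pseudocode."""

    def __init__(self, initial_seq_num, send_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}             # seq number -> data, not yet acknowledged
        self.timer_running = False    # stand-in for a real retransmission timer
        self.send_to_ip = send_to_ip  # hypothetical callback: (seq, data) -> None

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.send_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)   # smallest not-yet-acknowledged seq number
            self.send_to_ip(seq, self.unacked[seq])
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment that ends at or before y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sent = []
sender = SimplifiedTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
sender.on_app_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)   # Seq=100, 20 bytes
sender.on_ack(120)              # ACK=100 never arrived; ACK=120 covers both
```

In this run a single cumulative ACK of 120 clears both segments and stops the timer, so nothing is retransmitted.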


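TCP: retransmission scenarios

[Figure: two timelines between Host A (sender) and Host B (receiver), time runs downward]

Lost ACK scenario:
A sends Seq=92, 8 bytes data
B replies ACK=100, which is lost (X)
Seq=92 timeout at A: A resends Seq=92, 8 bytes data
B replies ACK=100 again
SendBase = 100

Premature timeout scenario:
A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
B replies ACK=100, then ACK=120
Seq=92 timeout at A before the ACKs arrive: A resends Seq=92, 8 bytes data
ACKs arrive at A: SendBase = 100, then SendBase = 120
B replies to the resent segment with ACK=120; SendBase = 120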


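TCP retransmission scenarios (more)

[Figure: timeline between Host A (sender) and Host B (receiver), time runs downward]

Cumulative ACK scenario:
A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
B replies ACK=100, which is lost (X), then ACK=120
ACK=120 reaches A before the timeout
SendBase = 120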


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap


                                                                            More on Sender Policies

Doubling the Timeout Interval
Used by most TCP implementations
If a timeout occurs, then after retransmission the Timeout Interval is doubled
Intervals grow exponentially with each consecutive timeout
When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
Limited form of congestion control
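
The doubling policy can be sketched as a small timer object. This is an illustration under assumptions: the computed interval (EstimatedRTT + 4 * DevRTT) is passed in as a number, no upper bound on the interval is modeled, and `BackoffTimer` and its method names are hypothetical.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, computed_interval):
        # computed_interval stands for EstimatedRTT + 4 * DevRTT.
        self.computed_interval = computed_interval
        self.interval = computed_interval

    def on_timeout(self):
        """Called after a retransmission caused by a timeout."""
        self.interval *= 2
        return self.interval

    def on_restart(self, computed_interval=None):
        """Called when new data arrives from above or an ACK is received."""
        if computed_interval is not None:
            self.computed_interval = computed_interval
        self.interval = self.computed_interval
        return self.interval


timer = BackoffTimer(computed_interval=1.0)              # seconds, assumed value
backoff = [timer.on_timeout() for _ in range(3)]         # three timeouts in a row
after_ack = timer.on_restart()                           # an ACK finally arrives
```

Three consecutive timeouts stretch a 1 s interval to 8 s, and the first ACK brings it back to the computed value.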


                                                                            Fast Retransmit

Time-out period often relatively long:
long delay before resending lost packet

Detect lost segments via duplicate ACKs
Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
fast retransmit: resend segment before timer expires


                                                                            Fast retransmit algorithm

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
      else {   /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
          resend segment with sequence number y   /* fast retransmit */
        }
      }
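
The ACK handler above can be exercised with a short sketch. It follows the pseudocode's per-value duplicate count literally. The timer is omitted, and `FastRetransmitSender` and its `retransmitted` log are hypothetical names used only for this illustration.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with duplicate-ACK counting (timer handling omitted)."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()   # ACK value y -> number of duplicates seen
        self.retransmitted = []     # seq numbers resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y      # new data acknowledged
        else:
            self.dup_acks[y] += 1   # duplicate ACK for already ACKed data
            if self.dup_acks[y] == 3:
                self.retransmitted.append(y)   # resend segment starting at y


tx = FastRetransmitSender(send_base=92)
tx.on_ack(100)                      # first ACK for byte 100: new data
tx.on_ack(100)
tx.on_ack(100)
before_third_dup = list(tx.retransmitted)
tx.on_ack(100)                      # third duplicate triggers the resend
```

Note that the resend happens on the third duplicate, that is, the fourth ACK carrying the same value.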


TCP: GBN or Selective Repeat?

                                                                            Basic TCP looks a lot like GBN

                                                                            Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                            This looks a lot like Selective Repeat

                                                                            TCP is a hybrid


                                                                            TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

                                                                            3 Transport Layer 81Comp 361 Spring 2005

TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]

source port | dest port
sequence number
acknowledgement number
head len | not used | U A P R S F | Receive window
checksum | Urg data pointer
Options (variable length)
application data (variable length)

Field notes:
sequence number, acknowledgement number: counting by bytes of data (not segments)
URG: urgent data (generally not used)
ACK: ACK valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
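
The field layout above can be decoded mechanically. The following is a minimal sketch, assuming the standard 20-byte fixed header with the flag bits in the order URG, ACK, PSH, RST, SYN, FIN; the function name and the hand-built example segment are illustrative, not part of the slides.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are not parsed)."""
    (src_port, dst_port, seq, ack,
     offset_flags, rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    head_len = (offset_flags >> 12) * 4   # head len field counts 32-bit words
    flags = offset_flags & 0x3F           # low six bits hold U A P R S F
    names = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 up to bit 5
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack, "head_len": head_len,
        "flags": [n for i, n in enumerate(names) if flags & (1 << i)],
        "rcv_window": rcv_window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[head_len:],
    }

# Hypothetical SYNACK segment built by hand (checksum left at 0 for illustration).
example = struct.pack("!HHIIHHHH", 80, 12345, 1000, 501, (5 << 12) | 0x12, 4096, 0, 0)
parsed = parse_tcp_header(example)
```

Here `parsed["flags"]` lists the SYN and ACK bits and `parsed["rcv_window"]` is the number of bytes the receiver is willing to accept.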

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)
spare room in buffer
= RcvWindow
= RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
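
The two rules on this slide can be written down directly. This is a small sketch with assumed helper names and made-up byte counts; it only restates the RcvWindow formula and the sender-side limit.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within RcvWindow."""
    unacked_after_send = (last_byte_sent + nbytes) - last_byte_acked
    return unacked_after_send <= window

# Illustrative numbers: 4096-byte buffer, 2000 bytes received but not yet read.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
```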

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
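
A toy model of the probing idea, under simplifying assumptions: time advances in ticks, the sender sends one probe per tick, and every probe is answered by an ACK carrying the current RcvWindow. The function name and the read schedule are hypothetical.

```python
def probe_until_open(rcv_buffer, buffered, reads_per_tick):
    """Return (tick, window) for the first probe ACK that advertises RcvWindow > 0."""
    for tick, nread in enumerate(reads_per_tick, start=1):
        buffered -= min(nread, buffered)   # receiving app drains some bytes
        window = rcv_buffer - buffered     # spare room reported in the ACK
        if window > 0:
            return tick, window
        # window is still 0: the probe elicits an ACK, so the sender keeps listening
    return None

# Buffer starts full; the app reads nothing for two ticks, then 512 bytes.
result = probe_until_open(rcv_buffer=4096, buffered=4096, reads_per_tick=[0, 0, 512, 1024])
```

Without the probes there would be no ACK at tick 3 to carry the new window value, which is the deadlock described above.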

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.

TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. numbers
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq number
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake between client and server]
client to server: Connection request (SYN=1, seq=client_isn)
server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
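
The sequence and acknowledgement numbers in the three segments follow mechanically from the two initial sequence numbers. A minimal sketch, with the segments modelled as plain dictionaries and the ISN values chosen arbitrarily:

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dicts of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]

# Arbitrary initial sequence numbers for illustration.
syn, synack, ack = three_way_handshake(client_isn=100, server_isn=300)
```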

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close timeline. Client closes and sends FIN; server replies with ACK, then closes and sends FIN; client replies with ACK and enters timed wait; closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if its ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: connection close timeline. Client (closing) sends FIN; server replies with ACK, then (closing) sends FIN; client replies with ACK and enters timed wait; both sides closed.]
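
The client side of the four closing steps can be traced as a small state table. This is a sketch only: the state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) follow the standard TCP state diagram rather than anything spelled out on these slides, and the event strings are made up for illustration.

```python
# (state, event) -> (next state, segment sent in response or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "FIN"),
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "ACK"),
    ("TIME_WAIT", "recv_FIN"): ("TIME_WAIT", "ACK"),   # re-ACK a retransmitted FIN
    ("TIME_WAIT", "timer_expired"): ("CLOSED", None),
}

def run_client_close(events, state="ESTABLISHED"):
    """Apply events in order; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, segment = CLIENT_CLOSE[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent

# The second recv_FIN models a server FIN retransmitted because the first ACK was lost.
final_state, segments = run_client_close(
    ["app_close", "recv_ACK", "recv_FIN", "recv_FIN", "timer_expired"])
```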

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                            A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.

                                                                            Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

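Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ the offered load including retransmissions, and $\lambda_{out}$ the goodput.

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, (a), (b) and (c)]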

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

                                                                            Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
  single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
"elastic service"
if sender's path "underloaded":
  sender should use available bandwidth
if sender's path congested:
  sender throttled to minimum guaranteed rate

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
  NI bit: no increase in rate (mild congestion)
  CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
  small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
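
A sketch of how the ER field and the EFCI rule behave along a path, under the assumption that each switch simply lowers ER to the rate it can support. The function names and rate values are illustrative only.

```python
def forward_rm_cell(er, supportable_rates):
    """Carry an RM cell past each switch; a switch may lower ER but never raise it."""
    for rate in supportable_rates:
        er = min(er, rate)
    return er

def destination_ci_bit(ci, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    return 1 if preceding_data_cell_efci == 1 else ci

# Sender asks for 100 (arbitrary units); the switches support 80, 120 and 60.
er_at_sender = forward_rm_cell(100, [80, 120, 60])
```

The returned ER is the minimum supportable rate on the path, which is the rate the sender then uses.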

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

$$\text{throughput} = \frac{w \cdot MSS}{RTT} \ \text{bytes/sec}$$
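
As a quick numeric check of the throughput expression, here is a sketch with assumed values (10 segments of 1460 bytes and a 100 ms RTT); the RTT is passed in milliseconds purely to keep the arithmetic exact.

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_ms):
    """w segments of mss_bytes each, delivered once per RTT."""
    return w * mss_bytes * 1000 / rtt_ms

# Assumed example values, not taken from the slides.
rate = throughput_bytes_per_sec(w=10, mss_bytes=1460, rtt_ms=100)
```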

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase, Multiplicative Decrease
slow start = CongWin set to 1 and then grows exponentially
conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
  also known as congestion avoidance

[Figure: sawtooth plot of congestion window (ticks at 8, 16, 24 Kbytes) against time for a long-lived TCP connection]
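
The sawtooth comes straight out of the two AIMD rules. A minimal sketch, assuming CongWin is measured in units of MSS and that the caller supplies the RTT indices at which a loss event occurs:

```python
def aimd(cong_win, rtts, loss_rtts, mss=1):
    """Return CongWin after each RTT: halve on a loss event, else add one MSS."""
    trace = []
    for rtt in range(rtts):
        if rtt in loss_rtts:
            cong_win = max(mss, cong_win / 2)   # multiplicative decrease
        else:
            cong_win += mss                     # additive increase
        trace.append(cong_win)
    return trace

# Start at 8 MSS, run five RTTs, with a loss event in the fourth RTT (index 3).
trace = aimd(cong_win=8, rtts=5, loss_rtts={3})
```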

                                                                            TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                            TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. A sends one segment in the first RTT, two segments in the second, four segments in the third.]
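
Incrementing CongWin by one MSS per ACK is what produces the doubling per RTT. The sketch below assumes no loss and that every segment sent in an RTT is ACKed within that RTT; it also rechecks the 20 kbps initial rate from the MSS = 500 bytes, RTT = 200 msec example.

```python
def slow_start(rtts, mss=500):
    """Return CongWin in bytes at the start of each RTT, assuming no loss."""
    cong_win = mss
    trace = [cong_win]
    for _ in range(rtts):
        segments_in_flight = cong_win // mss
        for _ack in range(segments_in_flight):
            cong_win += mss            # one MSS per ACK received
        trace.append(cong_win)
    return trace

trace = slow_start(rtts=3)

# Initial rate: one 500-byte segment per 200 msec, in bits per second.
initial_rate_bps = 500 * 8 * 1000 // 200
```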

So Far:
Slow-Start ramps up exponentially
Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

                                                                            Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS. (TCP Tahoe does this for 3 dup ACKs as well.)

                                                                            The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
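
The table translates almost line for line into an event-driven sender. This is a simplified sketch, not an implementation of any real TCP stack: the class and method names are made up, CongWin is a float in units of MSS, and the `variant="tahoe"` switch models the note that Tahoe treats 3 dup ACKs like a timeout.

```python
class TcpSender:
    def __init__(self, mss=1.0, threshold=float("inf"), variant="reno"):
        self.mss, self.threshold, self.variant = mss, threshold, variant
        self.cong_win = mss
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Count duplicates; the third one is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            if self.variant == "tahoe":
                self.on_timeout()
            else:
                self.threshold = self.cong_win / 2
                self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
                self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0

# Walk a Reno sender through slow start, a triple duplicate ACK, then a timeout.
reno = TcpSender(threshold=4.0)
for _ in range(4):
    reno.on_new_ack()
after_slow_start = (reno.cong_win, reno.state)
for _ in range(3):
    reno.on_dup_ack()
after_triple_dup = (reno.cong_win, reno.threshold, reno.state)
reno.on_timeout()
after_timeout = (reno.cong_win, reno.threshold, reno.state)
```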

                                                                            TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2·RTT)
Average throughput: 0.75 W/RTT
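
The 0.75 factor is the mean of a window that climbs linearly from W/2 to W. A small numeric check, assuming an idealised sawtooth that grows by one segment per RTT and an even W:

```python
def average_sawtooth_window(w_max):
    """Mean window over one AIMD cycle climbing from w_max/2 to w_max, 1 per RTT."""
    windows = list(range(w_max // 2, w_max + 1))
    return sum(windows) / len(windows)

# With W = 100 the window takes the values 50, 51, ..., 100.
ratio = average_sawtooth_window(100) / 100
```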

                                                                            TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed
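
Both numbers on this slide can be rechecked from the example values, using throughput = W · MSS / RTT for the window and the loss-rate relation 1.22 · MSS / (RTT · sqrt(L)) solved for L. The variable names are illustrative.

```python
mss_bits = 1500 * 8        # 1500 byte segments
rtt_sec = 0.1              # 100 ms RTT
target_bps = 10e9          # 10 Gbps

# Window needed so that W * MSS / RTT reaches the target throughput.
window_segments = target_bps * rtt_sec / mss_bits

# Solve target = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L.
loss_rate = (1.22 * mss_bits / (rtt_sec * target_bps)) ** 2
```

The window comes out at about 83,333 segments and the loss rate at roughly 2.1e-10, matching the rounded figures quoted above.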

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
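The convergence argument can be illustrated with a toy simulation. This is a sketch under simplifying assumptions that are not in the slides: both sessions see every loss at the same time, a loss occurs whenever the combined rate exceeds the link capacity, and each round adds one unit to both rates. The function name and parameters are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds=200, increase=1.0):
    """Simulate two synchronized AIMD flows sharing one bottleneck."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve (multiplicative decrease).
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both flows grow by the same amount (slope 1).
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


if __name__ == "__main__":
    # Start far from the fair share: 80 versus 10 on a link of capacity 100.
    x1, x2 = aimd_two_flows(80.0, 10.0, capacity=100.0)
    print(x1, x2, abs(x1 - x2))
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the rates drift toward the equal-share line.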


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
• Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
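The shares in the parallel-connection example come from assuming the link is split equally per TCP connection, so an app's share is its connection count over the total. A small sketch of that arithmetic (the helper name is illustrative):

```python
from fractions import Fraction


def app_share(existing_connections, opened_by_app):
    """Fraction of link rate R the new app gets under per-connection fairness."""
    total = existing_connections + opened_by_app
    return Fraction(opened_by_app, total)


if __name__ == "__main__":
    print(app_share(9, 1))   # 1/10
    print(app_share(9, 11))  # 11/20
```

With 11 connections the app holds 11/20 of the link, slightly more than the R/2 quoted on the slide.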


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
     Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
     Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
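The two cases can be folded into one small function. This is a sketch, not the slides' own code: it assumes K is the number of windows needed to cover the object, taken here as ceil(O / (W·S)), which the slide does not state explicitly; the function and variable names are illustrative.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec),
    RTT: round-trip time (sec).
    """
    # Assumption: K = number of windows that cover the object.
    K = math.ceil(O / (W * S))
    # Time the server waits per window for the first ACK to come back.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        # Case 1: the ACK returns before the window is exhausted.
        return 2 * RTT + O / R
    # Case 2: the server idles after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * stall


if __name__ == "__main__":
    # Illustrative numbers: S/R = 1 s, RTT = 2 s, object of 16 segments.
    print(fixed_window_latency(16000, 1000, 1000, 2.0, W=4))  # case 1
    print(fixed_window_latency(16000, 1000, 1000, 2.0, W=2))  # case 2
```

With W = 4 the window outlasts the round trip, so only the 2 RTT setup and O/R transmission remain; with W = 2 the server stalls for one second after each of the first seven windows.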


Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Server waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: client/server timeline ("time at client", "time at server"): initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]


TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: same client/server slow-start timeline as on the previous slide (first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R)]
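The slow-start latency formula can be evaluated with a short function. This is a sketch under the model's assumptions (no loss, no threshold): K and Q are found by direct counting rather than with logarithms to sidestep floating-point rounding, and the names are illustrative rather than taken from the slides.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for the slow-start latency model.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec),
    RTT: round-trip time (sec).
    """
    # K: smallest k such that windows 1, 2, 4, ..., 2^(k-1) cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows k with S/R + RTT - 2^(k-1) * S/R >= 0,
    # i.e. how often the server would idle for an infinite object.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R >= 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


if __name__ == "__main__":
    # Slide example: O/S = 15 segments; S/R = 1 s and RTT = 2 s are
    # illustrative values chosen here so that Q = 2.
    print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0))
```

For these inputs the function reproduces K = 4, Q = 2 and P = 2 from the slide example: the server idles 2 s after the first window and 1 s after the second, on top of 2 RTT + O/R = 19 s.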


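TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.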
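For reference, a sketch of the analogous calculation for Q, assuming the server counts as idling after window k whenever the idle-time expression S/R + RTT - 2^{k-1} S/R is non-negative (this convention is an assumption; the slide leaves the derivation out):

```latex
\begin{aligned}
Q &= \max\left\{k : RTT + \frac{S}{R} - 2^{k-1}\frac{S}{R} \ge 0\right\} \\
  &= \max\left\{k : 2^{k-1} \le 1 + \frac{RTT}{S/R}\right\} \\
  &= \max\left\{k : k \le \log_2\!\left(1 + \frac{RTT}{S/R}\right) + 1\right\} \\
  &= \left\lfloor \log_2\!\left(1 + \frac{RTT}{S/R}\right) \right\rfloor + 1
\end{aligned}
```

For example, with RTT = 2 S/R this gives Q = floor(log2 3) + 1 = 2.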


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
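The three response-time expressions can be compared side by side. The sketch below deliberately leaves out the "sum of idle times" term, so it gives only the transmission and round-trip parts of each formula; the function name is illustrative, and 1 Kbyte is taken as 1000 bytes for the example.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times (sec) for the three HTTP models, excluding idle times.

    O: object size (bits), R: link rate (bits/sec), RTT: seconds,
    M: number of images, X: parallel connections (M/X assumed integer).
    """
    transmit = (M + 1) * O / R
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT,
    }


if __name__ == "__main__":
    # RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, on a 1 Mbps link.
    times = http_response_times(O=5 * 1000 * 8, R=1e6, RTT=0.1, M=10, X=5)
    for name, seconds in times.items():
        print(f"{name}: {seconds:.2f} s")
```

Because the idle-time term is omitted, these numbers are lower bounds on the model's response times rather than a reproduction of the charts.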


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

• principles behind transport layer services:
    multiplexing, demultiplexing
    reliable data transfer
    flow control
    congestion control
• instantiation and implementation in the Internet
    UDP
    TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



[Figure: rdt3.0 in action]

Performance of rdt3.0

• rdt3.0 works, but performance stinks
• example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

• U_sender: utilization, the fraction of time the sender is busy sending
• 1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
• network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline]
• t = 0: first packet bit transmitted
• t = L/R: last packet bit transmitted
• first packet bit arrives at receiver; last packet bit arrives, send ACK
• t = RTT + L/R: ACK arrives, send next packet

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data
• Two generic approaches to solving this:
  • Go-Back-N protocols
  • selective repeat protocols
• Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three packets in flight]
• t = 0: first packet bit transmitted
• t = L/R: last bit transmitted
• first packet bit arrives; last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• t = RTT + L/R: ACK arrives, send next packet

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
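The utilization arithmetic for stop-and-wait and for a three-packet pipeline can be checked with a few lines of Python. This is a minimal sketch: the function name `sender_utilization` and its `window` parameter are illustrative, and RTT is taken as 30 ms (twice the 15 ms propagation delay from the example).

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy when `window` packets go out per RTT."""
    t_transmit = packet_bits / rate_bps              # L / R, in seconds
    u = window * t_transmit / (rtt_s + t_transmit)
    return min(u, 1.0)                               # a link cannot be more than fully busy

L_BITS = 8_000      # 1 KB packet = 8 kb
R_BPS = 1e9         # 1 Gbps link
RTT_S = 0.030       # assumed: 2 x 15 ms propagation delay

stop_and_wait = sender_utilization(L_BITS, R_BPS, RTT_S)
pipelined_3 = sender_utilization(L_BITS, R_BPS, RTT_S, window=3)
print(f"stop-and-wait:     {stop_and_wait:.5f}")
print(f"3-packet pipeline: {pipelined_3:.5f}")
```

Raising `window` further keeps increasing utilization until the sender is busy for the whole round trip, which is why the result is capped at 1.0.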

Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
  • may receive duplicate ACKs (see receiver)
• only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• called a sliding-window protocol

GBN: Sender

• rdt_send() called: checks to see if window is full
  • No: send out packet
  • Yes: return data to application level
• Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.
• Timeout: resends ALL packets that have been sent but not yet acknowledged
  • This is the only event that triggers a resend

GBN: sender extended FSM

Initial transition:
    base = 1
    nextseqnum = 1

State: Wait

Event: rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

Event: timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

Event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
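The GBN sender events can be sketched in Python as follows. This is an illustrative sketch, not the course's code: the class name `GBNSender`, the `udt_send` callback, and the boolean `timer_running` (standing in for a real timer) are assumptions, and checksums are omitted.

```python
class GBNSender:
    """Go-Back-N sender events; `udt_send` is a caller-supplied callable (assumed)."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}              # seq # -> packet, kept until ACKed
        self.timer_running = False    # stand-in for a real timer

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False              # window full: refuse_data
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True     # start_timer for the oldest unACKed pkt
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is confirmed.
        self.base = max(self.base, acknum + 1)
        for seq in [s for s in self.sndpkt if s < self.base]:
            del self.sndpkt[seq]
        # stop_timer if nothing is outstanding, otherwise (re)start it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])   # resend every unACKed pkt


sent = []
sender = GBNSender(window_size=4, udt_send=sent.append)
accepted = [sender.rdt_send(ch) for ch in "abcde"]   # fifth call hits a full window
sender.on_ack(2)        # pkts 1 and 2 confirmed
sender.on_timeout()     # pkts 3 and 4 go out again
```

The `max` in `on_ack` is a defensive addition so that a stale ACK can never move `base` backwards.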

GBN: receiver extended FSM

Initial transition:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

State: Wait

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

Event: default
    udt_send(sndpkt)

• If expected packet received:
  • send ACK and deliver packet upstairs
• If out-of-order packet received:
  • discard (don't buffer) -> no receiver buffering!
  • re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

• The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
• Receiver only sends ACKs (no NAKs)
• Can generate duplicate ACKs
• Need only remember expectedseqnum
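A GBN receiver needs very little state, which a short sketch makes concrete. The class name `GBNReceiver`, the callbacks, and the `corrupt` flag (standing in for a checksum test) are illustrative assumptions.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, only expectedseqnum is remembered."""

    def __init__(self, udt_send, deliver_data):
        self.udt_send = udt_send
        self.deliver_data = deliver_data
        self.expectedseqnum = 1
        self.last_ack = 0     # ACK 0 is sent until the first in-order pkt arrives

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.deliver_data(data)           # in order: pass upstairs
            self.last_ack = seq
            self.expectedseqnum += 1
        # Out-of-order or corrupt pkts are discarded; in every case the ACK
        # for the highest in-order pkt is (re)sent.
        self.udt_send(("ACK", self.last_ack))


acks, delivered = [], []
receiver = GBNReceiver(udt_send=acks.append, deliver_data=delivered.append)
# pkt 2 is lost, so 3 and 4 arrive out of order; then 2, 3, 4 are retransmitted
for seq, data in [(1, "a"), (3, "c"), (4, "d"), (2, "b"), (3, "c"), (4, "d")]:
    receiver.rdt_rcv(seq, data)
```

In this run the receiver emits ACK 1 three times, and the sender sees the second and third copies as duplicate ACKs.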

GBN in action

• GBN is easy to code but might have performance problems
• In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data
• The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of required packets

Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
  • compare to GBN, which only had a timer for the base packet
• sender window
  • N consecutive seq #'s
  • again limits seq #s of sent, unACKed pkts
  • Important: window size < seq # range

[Figure: Selective repeat: sender, receiver windows]

Selective repeat

sender

data from above:
• if next available seq # in window, send pkt

timeout(n):
• resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N]:
• mark pkt n as received
• if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
• ACK(n) (note: this is a re-ACK)

otherwise:
• ignore
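The selective repeat receiver rules translate almost line for line into code. This is a minimal sketch under stated assumptions: sequence numbers are unbounded integers (no wraparound), and the names `SRReceiver`, `send_ack`, and `deliver` are illustrative.

```python
class SRReceiver:
    """Selective repeat receiver; assumes seq #s never wrap around."""

    def __init__(self, window_size, send_ack, deliver):
        self.N = window_size
        self.send_ack = send_ack
        self.deliver = deliver
        self.rcvbase = 0
        self.buffer = {}      # out-of-order pkts waiting for the gap to fill

    def on_packet(self, n, data):
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.send_ack(n)                      # individual ACK
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase and advance the
            # window to the next not-yet-received pkt.
            while self.rcvbase in self.buffer:
                self.deliver(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
        elif self.rcvbase - self.N <= n < self.rcvbase:
            self.send_ack(n)                      # re-ACK: earlier ACK may be lost
        # otherwise: ignore


acks, delivered = [], []
receiver = SRReceiver(window_size=4, send_ack=acks.append, deliver=delivered.append)
# pkt 1 arrives late; pkt 0 is then retransmitted because its ACK was lost
for n, data in [(0, "a"), (2, "c"), (3, "d"), (1, "b"), (0, "a")]:
    receiver.on_packet(n, data)
```

Packets 2 and 3 stay in the buffer until packet 1 arrives, at which point all three are delivered in order; the late duplicate of packet 0 is only re-ACKed, not delivered again.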

[Figure: Selective repeat in action]

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3
• receiver sees no difference in two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
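The dilemma can be explored numerically. The sketch below models only the worst case from the example (every packet in the window arrives, every ACK is lost) and checks whether a retransmitted old packet can land inside the receiver's advanced window. The function name is illustrative. The check agrees with the standard textbook answer for selective repeat, which is that the window size must be at most half the sequence-number space.

```python
def receiver_confuses_retransmission(seq_space, window):
    """True if an old pkt's seq # can fall inside the receiver's new window."""
    # Sender's window while every ACK is lost: seq #s 0 .. window-1.
    old = {n % seq_space for n in range(window)}
    # The receiver got them all, so its window has advanced by `window`.
    new = {n % seq_space for n in range(window, 2 * window)}
    return bool(old & new)


slide_example = receiver_confuses_retransmission(seq_space=4, window=3)
smaller_window = receiver_confuses_retransmission(seq_space=4, window=2)
print(slide_example, smaller_window)
```

With seq #'s 0 to 3 and window size 3, the receiver's new window is {3, 0, 1}, so a retransmitted pkt 0 is accepted as new data; with window size 2 the two windows no longer overlap.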

Chapter 3 outline

• 3.1 Transport-layer services
• 3.2 Multiplexing and demultiplexing
• 3.3 Connectionless transport: UDP
• 3.4 Principles of reliable data transfer
• 3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
• 3.6 Principles of congestion control
• 3.7 TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point:
  • one sender, one receiver
• reliable, in-order byte stream:
  • no "message boundaries"
• pipelined:
  • TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented:
  • handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled:
  • sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
• Depends upon implementation (can often be set)
• The max amount of application-layer data in a segment
• Application Data + TCP Header = TCP Segment

Three-way Handshake
• Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
• Server responds with second special TCP segment (again no payload)
• Client responds with third special segment
  • This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP only exists in the two end machines
• No buffers or variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Header layout (each row is 32 bits wide):

  source port #                         | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F     | Receive window
  checksum                              | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
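The fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. This is a sketch for illustration: the function name `parse_tcp_header`, the dictionary keys, and the sample values are assumptions, and the checksum is not verified.

```python
import struct

# Flag bits, from least significant upward
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options, and data."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4        # head len counts 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES) if off_flags & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# Hypothetical segment: 20-byte header (5 words), PSH+ACK set, one data byte
example = struct.pack("!HHIIHHHH", 1234, 80, 42, 79,
                      (5 << 12) | 0x18, 65535, 0, 0) + b"C"
parsed = parse_tcp_header(example)
```

The `!` in the format string selects network (big-endian) byte order, which is how the fields travel on the wire.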

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data

ACKs:
• seq # of next byte expected from other side
• cumulative ACK

Q: how does the receiver handle out-of-order segments?
• A: TCP spec doesn't say; it is up to the implementer

[Figure: simple telnet scenario between Host A and Host B, time running downward]
• User types 'C'. Host A -> Host B: Seq=42, ACK=79, data = 'C'
• Host B ACKs receipt of 'C', echoes back 'C'. Host B -> Host A: Seq=79, ACK=43, data = 'C'
• Host A ACKs receipt of echoed 'C'. Host A -> Host B: Seq=43, ACK=80
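The numbers in the telnet scenario follow from one rule: the ACK field is the sequence number of the first byte plus the number of data bytes received. A small sketch (the helper name `ack_for` and the variable names are illustrative) reproduces the exchange:

```python
def ack_for(seq, data):
    """Cumulative ACK: sequence number of the next byte expected."""
    return seq + len(data)


# Host A -> Host B: user types 'C'
a_seq, a_ack = 42, 79
# Host B -> Host A: ACKs 'C' and echoes it back
b_seq, b_ack = a_ack, ack_for(a_seq, b"C")
# Host A -> Host B: ACKs the echoed 'C', carries no data
a2_seq, a2_ack = b_ack, ack_for(b_seq, b"C")

print(f"A->B Seq={a_seq} ACK={a_ack}")
print(f"B->A Seq={b_seq} ACK={b_ack}")
print(f"A->B Seq={a2_seq} ACK={a2_ack}")
```

Each side's next sequence number is whatever the other side last asked for in its ACK field, which is why the two columns of numbers interleave.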

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
• longer than RTT
  • but RTT varies
• too short: premature timeout
  • unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions
• SampleRTT will vary, want estimated RTT "smoother"
  • average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

$$\text{EstimatedRTT} = (1 - \alpha) \cdot \text{EstimatedRTT} + \alpha \cdot \text{SampleRTT}$$

• Exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125
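The claim that the influence of a past sample decreases exponentially fast can be seen by unrolling the update: a sample that is k updates old carries weight α(1-α)^k. The sketch below applies the update to made-up RTT samples (in milliseconds) and computes those weights. The function names and sample values are illustrative assumptions.

```python
ALPHA = 0.125


def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=ALPHA):
    """One EWMA step: EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def sample_weight(age, alpha=ALPHA):
    """Weight carried by a sample that is `age` updates old."""
    return alpha * (1 - alpha) ** age


estimate = 100.0                                  # hypothetical starting estimate, ms
for sample in [100.0, 180.0, 100.0, 100.0]:       # one 180 ms spike
    estimate = update_estimated_rtt(estimate, sample)

weights = [sample_weight(age) for age in range(4)]
print(estimate, weights)
```

The single 180 ms spike moves the estimate by only 10 ms at first, and its effect shrinks by a factor of 0.875 with every later sample.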

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT, EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
  • large variation in EstimatedRTT -> larger safety margin
• first, estimate how much SampleRTT deviates from EstimatedRTT:

$$\text{DevRTT} = (1 - \beta) \cdot \text{DevRTT} + \beta \cdot |\text{SampleRTT} - \text{EstimatedRTT}|$$

(typically, β = 0.25)

Then set the timeout interval:

$$\text{TimeoutInterval} = \text{EstimatedRTT} + 4 \cdot \text{DevRTT}$$
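The EstimatedRTT, DevRTT, and TimeoutInterval formulas combine into a single update step. In this sketch the function name and the starting values are hypothetical, and the order of the two updates (DevRTT first, using the old EstimatedRTT) is an assumption borrowed from RFC 6298, since the formulas alone do not fix an order.

```python
def update_timeout(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Return the new (EstimatedRTT, DevRTT, TimeoutInterval) after one sample."""
    # Assumption: DevRTT is updated first, against the previous EstimatedRTT.
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt     # estimate plus safety margin
    return estimated_rtt, dev_rtt, timeout_interval


# Hypothetical state (ms): EstimatedRTT = 100, DevRTT = 10, then a 140 ms sample
est, dev, timeout = update_timeout(100.0, 10.0, 140.0)
print(est, dev, timeout)
```

One slow sample raises the estimate only slightly (100 to 105 ms) but widens the safety margin sharply, so the timeout grows from 140 ms to 175 ms.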


TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses a single retransmission timer
• Retransmissions are triggered by:
  • timeout events
  • duplicate acks
• Initially consider simplified TCP sender:
  • ignore duplicate acks
  • ignore flow control, congestion control

TCP sender events

data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

Ack rcvd:
• If it acknowledges previously unacked segments:
  • update what is known to be acked
  • start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
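The three sender events can be tried out in a few lines of Python. This is only a sketch under stated assumptions: the class name, the `unacked` dictionary, and the `passed_to_ip` log are hypothetical stand-ins for real segments, a real timer, and the network layer, and the timer is modeled as a simple running/not-running flag.

```python
class SimplifiedTcpSender:
    """Sketch of the simplified sender: no duplicate-ACK handling,
    no flow control, no congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # -> payload of not-yet-ACKed segments
        self.timer_running = False   # stand-in for the single retransmission timer
        self.passed_to_ip = []       # log of (seq #, payload) handed to IP

    def on_app_data(self, data):
        # event: data received from application above
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.passed_to_ip.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        # event: timer timeout; resend the oldest unACKed segment
        if self.unacked:
            seq = min(self.unacked)
            self.passed_to_ip.append((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, y):
        # event: ACK received with ACK field value y (cumulative)
        if y > self.send_base:
            self.send_base = y
            # keep only segments that still contain bytes at or beyond y
            self.unacked = {s: d for s, d in self.unacked.items()
                            if s + len(d) > y}
            self.timer_running = bool(self.unacked)


# Two segments (Seq=92, 8 bytes and Seq=100, 20 bytes) covered by one ACK=120.
sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"a" * 8)
sender.on_app_data(b"b" * 20)
sender.on_ack(120)
```

A single cumulative ACK of 120 advances SendBase past both segments, so nothing is left to retransmit and the timer stops.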


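TCP retransmission scenarios

[Figure: two timing diagrams between Host A and Host B]

Lost ACK scenario:
  A sends Seq=92, 8 bytes data
  B replies ACK=100, but the ACK is lost (X)
  A's timer for Seq=92 times out; A retransmits Seq=92, 8 bytes data
  B replies ACK=100 again
  SendBase = 100

Premature timeout:
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100, then ACK=120
  A's timer for Seq=92 times out before the ACKs arrive; A retransmits Seq=92, 8 bytes data
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  B replies to the retransmission with ACK=120; SendBase = 120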


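TCP retransmission scenarios (more)

[Figure: timing diagram between Host A and Host B]

Cumulative ACK scenario:
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100, but the ACK is lost (X)
  B replies ACK=120, which arrives before A's timeout
  SendBase = 120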


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
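The four event/action pairs can be read as one decision function. The sketch below is an illustration only: the function name, its parameters, and the returned strings are hypothetical, and the last branch (a segment below the expected seq #) is not covered by the slide and is an assumption.

```python
def ack_action(seg_seq, expected_seq, ack_pending, gap_buffered):
    """Pick the receiver action for an arriving segment.

    seg_seq      -- seq # of the first byte in the arriving segment
    expected_seq -- seq # of the next in-order byte the receiver expects
    ack_pending  -- True if an earlier in-order segment awaits a delayed ACK
    gap_buffered -- True if out-of-order data beyond a gap is already buffered
    """
    if seg_seq > expected_seq:
        # out-of-order arrival (this also covers a segment that fills a gap
        # but does not start at its lower end)
        return "immediate duplicate ACK"
    if seg_seq == expected_seq and gap_buffered:
        # segment starts at the lower end of the gap
        return "immediate ACK"
    if seg_seq == expected_seq and ack_pending:
        return "immediate cumulative ACK"
    if seg_seq == expected_seq:
        return "delayed ACK"
    # ASSUMPTION (not in the slide): old data is simply re-ACKed
    return "immediate duplicate ACK"


first = ack_action(100, 100, ack_pending=False, gap_buffered=False)
second = ack_action(100, 100, ack_pending=True, gap_buffered=False)
out_of_order = ack_action(120, 100, ack_pending=False, gap_buffered=False)
fills_gap = ack_action(100, 100, ack_pending=False, gap_buffered=True)
```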


                                                                              More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
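A small sketch of the doubling rule. The class and method names are hypothetical, and the reset formula EstimatedRTT + 4 * DevRTT is assumed here to be the one "described previously"; times are in milliseconds.

```python
class RetransmitTimer:
    """Tracks the timeout interval (ms) with doubling on each timeout."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = 0
        self.reset()

    def reset(self):
        # timer restarted by new data from above or by a received ACK
        # ASSUMPTION: interval = EstimatedRTT + 4 * DevRTT
        self.interval = self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        # after the retransmission, the interval is doubled
        self.interval *= 2


timer = RetransmitTimer(estimated_rtt=100, dev_rtt=25)
history = [timer.interval]
for _ in range(3):               # three consecutive timeouts
    timer.on_timeout()
    history.append(timer.interval)
timer.reset()                    # an ACK arrives
```

With these numbers, `history` grows as 200, 400, 800, 1600 ms, and the reset brings the interval back to 200 ms.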


                                                                              Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires


Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {    /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3) {
      resend segment with sequence number y    /* fast retransmit */
    }
  }
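The ACK handler with duplicate-ACK counting can be sketched as follows. The names are hypothetical, the `resent` list stands in for actually retransmitting a segment, and as a simplifying assumption only ACKs equal to the current SendBase are counted as duplicates.

```python
class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK."""

    def __init__(self, send_base, unacked):
        self.send_base = send_base
        self.unacked = dict(unacked)   # seq # -> payload
        self.dup_ack_count = 0
        self.timer_running = bool(self.unacked)
        self.resent = []               # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # new data ACKed: advance SendBase, restart duplicate counting
            self.send_base = y
            self.dup_ack_count = 0
            self.unacked = {s: d for s, d in self.unacked.items()
                            if s + len(d) > y}
            self.timer_running = bool(self.unacked)
        elif y == self.send_base:
            # a duplicate ACK for already ACKed data
            self.dup_ack_count += 1
            if self.dup_ack_count == 3 and y in self.unacked:
                self.resent.append(y)  # fast retransmit of segment y


# Segment 100 is lost; segments 120, 140, 160 each trigger a duplicate ACK=100.
sender = FastRetransmitSender(
    send_base=100,
    unacked={100: b"w" * 20, 120: b"x" * 20, 140: b"y" * 20, 160: b"z" * 20},
)
for _ in range(3):
    sender.on_ack(100)
```

After the third duplicate ACK the sender resends segment 100 without waiting for the timer.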


TCP: GBN or Selective Repeat?

                                                                              Basic TCP looks a lot like GBN

                                                                              Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                              This looks a lot like Selective Repeat

                                                                              TCP is a hybrid


                                                                              TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
speed-matching service: matching the send rate to the receiving app's drain rate
app process may be slow at reading from buffer


TCP segment structure

[Figure: TCP segment layout, 32 bits wide]

  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Annotations:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence number, acknowledgement number: counting by bytes of data (not segments)
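To make the layout concrete, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. This is a sketch: the function name, dictionary keys, and the example values are hypothetical, and options and payload are ignored.

```python
import struct

# Flag bit positions in the flags byte, lowest bit first (F S R P A U).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(raw):
    """Decode the first 20 bytes of a TCP segment (network byte order)."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # head len is the upper 4 bits, counted in 32-bit words
        "header_len": (offset_byte >> 4) * 4,
        "flags": sorted(n for n, bit in FLAG_BITS.items() if flag_byte & bit),
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


# A made-up SYN segment header: ports 12345 -> 80, seq 92, window 4096.
example = struct.pack("!HHIIBBHHH", 12345, 80, 92, 0, 5 << 4, 0x02, 4096, 0, 0)
header = parse_tcp_header(example)
```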


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
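A worked example of the window arithmetic. The function names and the sender-side variables `last_byte_sent` and `last_byte_acked` are hypothetical names introduced for illustration; the numbers are made up.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def may_send(n_bytes, last_byte_sent, last_byte_acked, window):
    """True if sending n_bytes keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_bytes <= window


# 4096-byte buffer, 3000 bytes received so far, the app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender already has 1000 unACKed bytes in flight.
ok = may_send(1000, last_byte_sent=5000, last_byte_acked=4000, window=window)
too_much = may_send(1100, last_byte_sent=5000, last_byte_acked=4000, window=window)
```

Here 2000 buffered bytes leave a window of 2096 bytes, so another 1000 bytes fit but another 1100 bytes do not.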


                                                                              Technical Issue

Suppose RcvWindow = 0 and that receiver has already ACKed ALL packets in buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to sender
DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to sender


                                                                              Note on UDP

                                                                              UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost


                                                                              TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data


                                                                              TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
  allocates buffers
  can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that the seq # is synchronized

[Figure: three-way handshake timeline between client and server]
  client to server: Connection request (SYN=1, seq=client_isn)
  server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
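The sequence and acknowledgement numbers in the three handshake segments can be written out directly. This is a sketch only: the function name and the dictionary representation of a segment are hypothetical, and the ISN values are made up.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three control segments exchanged during connection setup."""
    # Step 1: client sends SYN with its initial seq #
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
```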


                                                                              TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection-close timeline between client and server: client close, FIN; server ACK; server close, FIN; client ACK; client timed wait; closed]


                                                                              TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: connection-close timeline between client and server: client closing, FIN; server ACK; server closing, FIN; client ACK; client timed wait; client closed; server closed]


                                                                              TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]
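Since the lifecycle diagrams did not survive extraction, a client lifecycle can be sketched as a small state table built from the setup and teardown steps above. The state names follow the usual TCP specification names and are an assumption here, as are the event strings and the function name.

```python
# (current state, event) -> next state, for a client that opens and then
# closes the connection first. State names are assumed, not from the slide.
CLIENT_TRANSITIONS = {
    ("CLOSED", "send SYN"): "SYN_SENT",
    ("SYN_SENT", "rcv SYNACK, send ACK"): "ESTABLISHED",
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "rcv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "rcv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "timed wait expires"): "CLOSED",
}


def run_client(events, state="CLOSED"):
    """Apply events in order and return the final state."""
    for event in events:
        state = CLIENT_TRANSITIONS[(state, event)]
    return state


after_setup = run_client(["send SYN", "rcv SYNACK, send ACK"])
after_close = run_client(
    ["send FIN", "rcv ACK", "rcv FIN, send ACK", "timed wait expires"],
    state=after_setup,
)
```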


                                                                              A few special cases

                                                                              Have not discussed what happens if both client and server decide to close down connection at same time

                                                                              It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                              Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when packet dropped, any "upstream" transmission capacity used for that packet was wasted


                                                                              Approaches towards congestion control

                                                                              Two broad approaches towards congestion control

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate is thus the minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

w segments, each with MSS bytes, sent in one RTT:

\[ \text{throughput} = \frac{w \cdot MSS}{RTT} \ \text{bytes/sec} \]

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (ticks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
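
The AIMD rule can be traced with a few lines of Python. This is only a sketch: the function name `aimd_trace` and the set of rounds in which a loss is detected are assumptions made for illustration, and the window is counted in units of MSS rather than bytes.

```python
def aimd_trace(initial_mss, loss_rounds, rounds):
    """Return CongWin (in units of MSS) at the start of each RTT round.

    loss_rounds is a hypothetical set of round indices in which a loss
    event is detected; real losses depend on the network.
    """
    congwin = float(initial_mss)
    trace = []
    for rtt_round in range(rounds):
        trace.append(congwin)
        if rtt_round in loss_rounds:
            # Multiplicative decrease: cut CongWin in half (floor of 1 MSS).
            congwin = max(1.0, congwin / 2)
        else:
            # Additive increase: grow by 1 MSS per RTT.
            congwin += 1.0
    return trace


trace = aimd_trace(initial_mss=8, loss_rounds={4, 9}, rounds=12)
print(trace)
```

Plotting `trace` against the round index gives the sawtooth shape of a long-lived connection: slow linear climbs separated by sudden halvings.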

TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes & RTT = 200 msec
• initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
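
A small sketch of why incrementing CongWin once per ACK doubles it every RTT, using the slide's example values (MSS = 500 bytes, RTT = 200 msec). The function name `slow_start_rounds` is an assumption, and the model assumes no loss and one ACK per segment.

```python
def slow_start_rounds(rounds, mss_bytes=500, rtt_ms=200):
    """List (round, CongWin in MSS, rate in kbps) for loss-free slow start."""
    congwin = 1  # connection begins with CongWin = 1 MSS
    rows = []
    for rnd in range(rounds):
        # Bits per millisecond is numerically the same as kilobits per second.
        rate_kbps = congwin * mss_bytes * 8 / rtt_ms
        rows.append((rnd, congwin, rate_kbps))
        acks = congwin   # assume every segment in the window is ACKed
        congwin += acks  # +1 MSS per ACK, so the window doubles each RTT
    return rows


rows = slow_start_rounds(4)
for rnd, congwin, rate_kbps in rows:
    print(f"RTT {rnd}: CongWin = {congwin} MSS, rate = {rate_kbps} kbps")
```

The first row reproduces the 20 kbps initial rate from the example; later rows show the one, two, four, eight segment progression.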

So far
• Slow-Start ramps up exponentially
• Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
• Introduce new variable: threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                              The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
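
The event table can be read as a small state machine. The sketch below is one possible rendering, not an actual TCP implementation: the class name `RenoSender`, its method names, and the default values are hypothetical, windows are tracked in bytes, and retransmission itself is left out.

```python
class RenoSender:
    """Toy model of the sender-side congestion control table (TCP Reno)."""

    def __init__(self, mss=1000, threshold=64_000):
        self.mss = mss
        self.congwin = float(mss)          # start with CongWin = 1 MSS
        self.threshold = float(threshold)  # assumed initial Threshold
        self.state = "SS"                  # "SS" or "CA"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.congwin += self.mss * (self.mss / self.congwin)

    def on_duplicate_ack(self):
        """Count duplicate ACKs; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            # CongWin will not drop below 1 MSS.
            self.congwin = max(self.threshold, float(self.mss))
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = float(self.mss)
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.congwin)   # slow start has crossed Threshold
for _ in range(3):
    sender.on_duplicate_ack()
print(sender.state, sender.congwin, sender.threshold)
sender.on_timeout()
print(sender.state, sender.congwin, sender.threshold)
```

Keeping each table row as its own method makes it easy to compare Reno with Tahoe: a Tahoe variant would simply call the timeout logic on the third duplicate ACK as well.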

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT (throughput climbs linearly from W/(2 RTT) to W/RTT, so the average is the midpoint of the two)

TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

• L = 2·10^-10. Wow!
• New versions of TCP for high-speed needed!
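
The numbers in this example can be checked with a short calculation. The sketch assumes the loss-rate relation throughput = 1.22 · MSS / (RTT · √L) and solves it for L; the function names are made up for illustration.

```python
def window_segments(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps (W = rate * RTT / MSS)."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Solve rate = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


w = window_segments(rate_bps=10e9, rtt_s=0.1, mss_bytes=1500)
loss = required_loss_rate(rate_bps=10e9, rtt_s=0.1, mss_bytes=1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

The result is a window of roughly 83,333 segments and a tolerable loss rate on the order of 2·10^-10, which is why the slide calls for new high-speed versions of TCP.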

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
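
The convergence argument can be illustrated with a toy simulation of two flows. It rests on simplifying assumptions that are not in the slides: both flows have the same RTT, both detect loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units. The function name `two_flow_aimd` is hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows cut their rate by a factor of 2,
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both add 1 unit, gap is unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


x1, x2 = two_flow_aimd(x1=10.0, x2=90.0, capacity=100.0, rounds=500)
print(f"flow 1: {x1:.3f}, flow 2: {x2:.3f}, gap: {abs(x1 - x2):.5f}")
```

Additive increase moves the pair parallel to the equal-share line and each multiplicative decrease halves the distance to it, so the gap shrinks toward zero regardless of the starting point.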

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets R/2
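
The parallel-connection example is simple arithmetic once one assumes that every TCP connection on the link ends up with an equal share. A minimal sketch, with the hypothetical helper `app_share` returning the new app's fraction of R:

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming all connections on the link share R equally."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # share with a single connection
print(app_share(9, 11))   # share with 11 parallel connections
```

With 11 connections out of 20 the app gets 11/20 of R, which the slide rounds to R/2.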

TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
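
Both fixed-window cases fit in one function. This is a sketch under a stated assumption: K is taken to be the number of windows that cover the object, computed here as ceil(O / (W·S)), which the slides do not spell out at this point. The function name and the sample numbers are illustrative only.

```python
from math import ceil


def fixed_window_latency(O, S, R, W, rtt):
    """Latency for a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec),
    rtt: round-trip time (sec).
    """
    K = ceil(O / (W * S))               # assumed: windows covering the object
    stall = S / R + rtt - W * S / R     # server wait per window, if positive
    latency = 2 * rtt + O / R
    if stall > 0:
        # Second case: server stalls after each of the first K-1 windows.
        latency += (K - 1) * stall
    return latency


small = fixed_window_latency(O=100_000, S=1_000, R=1_000_000, W=4, rtt=0.1)
large = fixed_window_latency(O=100_000, S=1_000, R=1_000_000, W=200, rtt=0.1)
print(f"W=4: {small:.3f} s, W=200: {large:.3f} s")
```

With a small window the server spends most of its time waiting for ACKs; once W·S/R exceeds RTT + S/R the pipe stays full and only the 2 RTT setup cost and the transmission time O/R remain.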

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timing diagram, time at client versus time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

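TCP Latency Modeling (3)

\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the kth window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the kth window} \]

\begin{align*}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{align*}

[Figure: timing diagram, time at client versus time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]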

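TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

\begin{align*}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{align*}

Calculation of Q, number of idles for infinite-size object, is similar.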
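
The slow-start latency model can be evaluated numerically. In this sketch K is found by searching for the smallest k with (2^k - 1)·S ≥ O, and Q is found by counting windows whose idle time S/R + RTT - 2^(k-1)·S/R is non-negative; that way of computing Q is an assumption derived from the idle-time definition, since the slides only say the calculation is similar. The function name and the sample values (S/R = 1, RTT = 2, in arbitrary time units) are chosen for illustration so that O/S = 15, K = 4, and Q = 2.

```python
def slow_start_latency(O, S, R, rtt):
    """Return (latency, K, Q, P) for the loss-free slow-start model.

    O: object size (bits), S: MSS (bits), R: link rate, rtt: round-trip time.
    """
    # K: number of windows that cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1

    # Q: windows after which the server would idle for an infinite object,
    # i.e. k such that S/R + RTT - 2^(k-1) * S/R >= 0 (assumed definition).
    Q = 0
    while S / R + rtt - (2 ** Q) * S / R >= 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


latency, K, Q, P = slow_start_latency(O=15, S=1, R=1, rtt=2)
print(f"K={K}, Q={Q}, P={P}, latency={latency}")
```

For these values the server idles after the first and second windows (2 and 1 time units), so the latency is 2·RTT + O/R plus 3 units of idle time.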

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
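
The three response-time expressions can be compared side by side. The sketch below deliberately leaves out the "sum of idle times" term, which depends on the slow-start model, so it shows only the transmission and RTT components. The function name is hypothetical, and the sample call assumes 1 Kbyte = 1000 bytes and a 1 Mbps link.

```python
def http_response_times(O, R, rtt, M, X):
    """Response time (sec) excluding slow-start idle times.

    O: object size (bits), R: link rate (bits/sec), rtt: round-trip time (sec),
    M: number of images, X: number of parallel connections (M/X an integer).
    """
    transmit = (M + 1) * O / R
    return {
        "non_persistent": transmit + (M + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * rtt,
    }


times = http_response_times(O=5 * 1000 * 8, R=1_000_000, rtt=0.1, M=10, X=5)
for scheme, seconds in times.items():
    print(f"{scheme}: {seconds:.2f} s")
```

All three schemes share the same transmission term, so the differences come entirely from how many round trips each one spends on connection setup and requests.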

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment & slow start delays.
Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3: Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control
instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                              • Chapter 3 Transport Layer last revised 160305
                                                                              • Chapter 3 outline
                                                                              • Transport services and protocols
                                                                              • Transport vs network layer
                                                                              • Transport-layer protocols
                                                                              • Chapter 3 outline
                                                                              • Multiplexing/demultiplexing
                                                                              • Multiplexing/demultiplexing
                                                                              • How demultiplexing works
                                                                              • Connectionless demultiplexing
                                                                              • Connectionless demux (cont)
                                                                              • Connection-oriented demux
                                                                              • Connection-oriented demux (cont)
                                                                              • Connection-oriented demux Threaded Web Server
                                                                              • Chapter 3 outline
                                                                              • UDP User Datagram Protocol [RFC 768]
                                                                              • UDP more
                                                                              • UDP checksum
                                                                              • Chapter 3 outline
                                                                              • Principles of Reliable data transfer
                                                                              • Reliable data transfer getting started
                                                                              • Reliable data transfer getting started
                                                                              • Incremental Improvements
                                                                              • rdt1.0: reliable transfer over a reliable channel
                                                                              • rdt2.0: channel with bit errors
                                                                              • rdt2.0: FSM specification
                                                                              • rdt2.0: operation with no errors
                                                                              • rdt2.0: error scenario
                                                                              • rdt2.0 has a fatal flaw
                                                                              • rdt2.1: sender, handles garbled ACK/NAKs
                                                                              • rdt2.1: receiver, handles garbled ACK/NAKs
                                                                              • rdt2.1: discussion
                                                                              • rdt2.2: a NAK-free protocol
                                                                              • rdt2.2: sender, receiver fragments
                                                                              • rdt3.0: channels with errors and loss
                                                                              • rdt3.0 sender
                                                                              • rdt3.0 in action
                                                                              • rdt3.0 in action
                                                                              • Performance of rdt3.0
                                                                              • rdt3.0: stop-and-wait operation
                                                                              • Pipelined protocols
                                                                              • Pipelined protocols
                                                                              • Pipelining increased utilization
                                                                              • Go-Back-N
                                                                              • GBN Sender
                                                                              • GBN sender extended FSM
                                                                              • GBN receiver extended FSM
                                                                              • More on receiver
                                                                              • GBN in action
                                                                              • Selective Repeat
                                                                              • Selective repeat sender receiver windows
                                                                              • Selective repeat
                                                                              • Selective repeat in action
                                                                              • Selective repeat dilemma
                                                                              • Chapter 3 outline
                                                                              • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                              • More TCP Details
                                                                              • Even More TCP Details
                                                                              • TCP segment structure
                                                                              • TCP seq. #'s and ACKs
                                                                              • TCP Round Trip Time and Timeout
                                                                              • TCP Round Trip Time and Timeout
                                                                              • Example RTT estimation
                                                                              • TCP Round Trip Time and Timeout
                                                                              • Chapter 3 outline
                                                                              • TCP reliable data transfer
                                                                              • TCP sender events
                                                                              • TCP sender (simplified)
                                                                              • TCP retransmission scenarios
                                                                              • TCP retransmission scenarios (more)
                                                                              • TCP ACK generation [RFC 1122 RFC 2581]
                                                                              • More on Sender Policies
                                                                              • Fast Retransmit
                                                                              • Fast retransmit algorithm
                                                                              • TCP GBN or Selective Repeat
                                                                              • Chapter 3 outline
                                                                              • TCP Flow Control
                                                                              • TCP Flow Control
                                                                              • TCP segment structure
                                                                              • TCP Flow control how it works
                                                                              • Technical Issue
                                                                              • Chapter 3 outline
                                                                              • TCP Connection Management
                                                                              • TCP Connection Management (cont)
                                                                              • TCP Connection Management (cont)
                                                                              • TCP Connection Management (cont)
                                                                              • TCP Connection Management (cont)
                                                                              • A few special cases
                                                                              • Chapter 3 outline
                                                                              • Principles of Congestion Control
                                                                              • Causes/costs of congestion: scenario 1
                                                                              • Causes/costs of congestion: scenario 2
                                                                              • Causes/costs of congestion: scenario 3
                                                                              • Causes/costs of congestion: scenario 3
                                                                              • Approaches towards congestion control
                                                                              • Case study ATM ABR congestion control
                                                                              • Case study ATM ABR congestion control
                                                                              • Chapter 3 outline
                                                                              • TCP Congestion Control
                                                                              • TCP AIMD
                                                                              • TCP Slow Start
                                                                              • TCP Slow Start (more)
                                                                              • Summary TCP Congestion Control
                                                                              • The Big Picture
                                                                              • TCP sender congestion control
                                                                              • TCP throughput
                                                                              • TCP Futures
                                                                              • TCP Fairness
                                                                              • Why is TCP fair
                                                                              • Fairness (more)
                                                                              • TCP Latency Modeling
                                                                              • Fixed Congestion Window (W)
                                                                              • Fixed congestion window (1)
                                                                              • Fixed congestion window (2)
                                                                              • TCP Latency Modeling Slow Start (1)
                                                                              • TCP Latency Modeling Slow Start (2)
                                                                              • TCP Latency Modeling (3)
                                                                              • TCP Latency Modeling (4)
                                                                              • HTTP Modeling
                                                                              • Chapter 3 Summary


Performance of rdt3.0

rdt3.0 works, but performance stinks
Example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet:

$$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$

U_sender: utilization, the fraction of time the sender is busy sending
1 KB pkt every 30 msec -> 33 kB/sec throughput over a 1 Gbps link
network protocol limits use of physical resources!
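The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch using the slide's numbers (L = 8000 bits, R = 1 Gbps, RTT = 30 msec); the function name and its parameters are illustrative assumptions.

```python
def stop_and_wait(L_bits, R_bps, rtt_s):
    """Return (T_transmit in s, U_sender, throughput in bytes/sec) for stop-and-wait."""
    t_transmit = L_bits / R_bps            # time to put one packet on the link
    cycle = rtt_s + t_transmit             # exactly one packet is sent per cycle
    utilization = t_transmit / cycle       # fraction of time the sender is busy
    throughput = (L_bits / 8) / cycle      # bytes delivered per second
    return t_transmit, utilization, throughput


t, u, thr = stop_and_wait(L_bits=8000, R_bps=1e9, rtt_s=0.030)
print(f"T_transmit = {t * 1e6:.0f} microsec")
print(f"U_sender = {u:.5f}")
print(f"throughput = {thr / 1000:.0f} kB/sec")
```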

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram. First packet bit transmitted, t = 0; last packet bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R]

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$$


Pipelined protocols
Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver


                                                                                Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with three packets in flight. First packet bit transmitted, t = 0; last bit transmitted, t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet, t = RTT + L/R]

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increase utilization by a factor of 3!
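The factor-of-3 result generalizes to any number of packets in flight. The sketch below assumes a hypothetical helper name and caps the result at 1, since a sender cannot be busy more than all of the time; the cap is an addition, not something stated on the slide.

```python
def pipelined_utilization(n_packets, L_bits, R_bps, rtt_s):
    """Sender utilization when n_packets are sent back to back per RTT cycle."""
    t_transmit = L_bits / R_bps
    # n packets are transmitted per cycle of length RTT + L/R
    return min(1.0, n_packets * t_transmit / (rtt_s + t_transmit))


u1 = pipelined_utilization(1, 8000, 1e9, 0.030)   # stop-and-wait
u3 = pipelined_utilization(3, 8000, 1e9, 0.030)   # three packets in flight
print(f"U(1) = {u1:.5f}, U(3) = {u3:.4f}, ratio = {u3 / u1:.1f}")
```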


Go-Back-N
Sender:
  k-bit seq # in pkt header
  "window" of up to N consecutive unack'ed pkts allowed

  ACK(n): ACKs all pkts up to, including seq # n - "cumulative ACK"
    may receive duplicate ACKs (see receiver)

  Only one timer, for the oldest unacknowledged pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
  Called a sliding-window protocol


                                                                                GBN Sender

rdt_send() called: checks to see if the window is full.
  No: send out packet.
  Yes: return data to application level.

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.

This is the only event that triggers a resend.


GBN sender: extended FSM

Single state: Wait

Initially:
    base = 1
    nextseqnum = 1

Event: rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

Event: timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

Event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
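The sender FSM can be sketched as a small Python class. This is a simplified illustration under several assumptions: the channel is a caller-supplied `udt_send` callback, the timer is modelled as a boolean flag rather than a real clock, sequence numbers do not wrap around, checksums are omitted, and ACKs below `base` are ignored (a guard that is not in the FSM).

```python
class GBNSender:
    """Go-Back-N sender sketch: one state, three events (send, ACK, timeout)."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send       # hypothetical callback standing in for the channel
        self.base = 1                  # oldest unacknowledged sequence number
        self.nextseqnum = 1            # next sequence number to assign
        self.sndpkt = {}               # seq -> packet, kept until acknowledged
        self.timer_running = False     # a flag in place of a real timer

    def rdt_send(self, data):
        """Called from above. Returns False (refuse_data) when the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True  # start_timer for the oldest unACKed packet
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is acknowledged."""
        if acknum < self.base:
            return                     # duplicate ACK, nothing new is acknowledged
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        # stop the timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])


sent = []
sender = GBNSender(window_size=3, udt_send=sent.append)
accepted = [sender.rdt_send(d) for d in "abcd"]   # the fourth call hits a full window
sender.on_ack(1)                                  # window slides by one
sender.on_timeout()                               # packets 2 and 3 are resent
print(accepted, sent)
```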


GBN receiver: extended FSM

Single state: Wait

Initial transition (Λ):
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

Event: default
    udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #
  may generate duplicate ACKs


                                                                                More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  need only remember expectedseqnum
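A receiver with this behaviour fits in a few lines of Python. The sketch below is an illustration only: `deliver_data` and `udt_send` are assumed caller-supplied callbacks, corruption is signalled by a boolean argument instead of a checksum, and sequence numbers do not wrap around.

```python
class GBNReceiver:
    """Go-Back-N receiver sketch: the only state kept is expectedseqnum and the last ACK."""

    def __init__(self, deliver_data, udt_send):
        self.expectedseqnum = 1
        self.deliver_data = deliver_data   # hypothetical callback to the upper layer
        self.udt_send = udt_send           # hypothetical callback to the channel
        self.last_ack = 0                  # ACK 0 is repeated until packet 1 arrives

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.deliver_data(data)        # in-order packet: deliver it upstairs
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # In every case, (re)send the ACK for the highest in-order packet;
        # out-of-order and corrupt packets are discarded, not buffered.
        self.udt_send(("ACK", self.last_ack))


delivered, acks = [], []
receiver = GBNReceiver(delivered.append, acks.append)
receiver.rdt_rcv(1, "a")
receiver.rdt_rcv(3, "c")    # out of order: discarded, duplicate ACK for 1
receiver.rdt_rcv(2, "b")
print(delivered, acks)
```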


GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of required packets.


                                                                                Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range


Selective repeat: sender, receiver windows


                                                                                Selective repeat

sender
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a reACK)
  otherwise:
    ignore
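The receiver rules can be sketched as follows. This is a minimal illustration of the receiver side only, assuming unbounded sequence numbers starting at 0 (no wraparound) and caller-supplied `deliver_data` and `send_ack` callbacks; these names are not from the slides.

```python
class SRReceiver:
    """Selective repeat receiver sketch: ACK individually, buffer out-of-order packets."""

    def __init__(self, window_size, deliver_data, send_ack):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}                   # seq -> data for packets not yet delivered
        self.deliver_data = deliver_data   # hypothetical callback to the upper layer
        self.send_ack = send_ack           # hypothetical callback to the channel

    def on_packet(self, n, data):
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.send_ack(n)
            self.buffer.setdefault(n, data)        # a duplicate keeps the first copy
            while self.rcvbase in self.buffer:     # deliver the in-order run, if any
                self.deliver_data(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1                  # advance window past delivered packets
        elif self.rcvbase - self.N <= n <= self.rcvbase - 1:
            self.send_ack(n)                       # already delivered: re-ACK
        # otherwise: ignore


delivered, acks = [], []
receiver = SRReceiver(4, delivered.append, acks.append)
for n in (0, 2, 3, 1, 0):                          # 2 and 3 arrive early, 0 is resent
    receiver.on_packet(n, f"p{n}")
print(delivered, acks)
```

The re-ACK branch matters because the sender keeps resending a packet until it sees an ACK for it, even when the receiver has already delivered that packet.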


                                                                                Selective repeat in action


Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
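One way to explore the question is to check the worst case directly: the receiver has accepted a full window and advanced, but every ACK was lost, so the sender resends the old window. The sketch below uses hypothetical function names and only tests that one scenario; it reproduces the standard result that the window size must be at most half the sequence-number space.

```python
def sr_window_is_ambiguous(seq_space, window):
    """True if a resent old packet can land inside the receiver's new window."""
    old = {n % seq_space for n in range(0, window)}            # packets the sender may resend
    new = {n % seq_space for n in range(window, 2 * window)}   # receiver window after advancing
    return bool(old & new)


def largest_safe_window(seq_space):
    """Largest window size that is unambiguous in this worst case (seq_space >= 2)."""
    return max(w for w in range(1, seq_space + 1)
               if not sr_window_is_ambiguous(seq_space, w))


print(sr_window_is_ambiguous(4, 3))   # the example on the slide
print(sr_window_is_ambiguous(4, 2))
print(largest_safe_window(4))
```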


                                                                                Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

                                                                                3 Transport Layer 58Comp 361 Spring 2005

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point: one sender, one receiver
reliable, in-order byte stream: no "message boundaries"
pipelined: TCP congestion and flow control set window size
send & receive buffers
full duplex data: bi-directional data flow in same connection; MSS: maximum segment size
connection-oriented: handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
flow controlled: sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application data + TCP header = TCP segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
  Server responds with second special TCP segment (again, no payload)
  Client responds with third special segment. This can contain payload
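The MSS relation above (application data + TCP header = TCP segment) can be sketched as follows. This is a minimal illustration: the helper names, the 1460-byte MSS and the 20-byte header (the minimum TCP header, with no options) are assumptions chosen for the example, not values given in the slides.

```python
HEADER_LEN = 20  # assumed: minimum TCP header size in bytes, no options


def split_into_segments(data: bytes, mss: int) -> list[bytes]:
    """Cut application data into chunks of at most `mss` bytes each."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]


def segment_size(payload: bytes) -> int:
    """Size of the TCP segment carrying this payload: header plus data."""
    return HEADER_LEN + len(payload)


chunks = split_into_segments(b"x" * 3000, 1460)  # 1460 is an example MSS
print([len(c) for c in chunks])
print([segment_size(c) for c in chunks])
```

Note that MSS bounds the payload only; the header is added on top of it.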

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
(i) buffers
(ii) variables, and
(iii) a socket connection to the process

TCP exists only in the two end machines
No buffers or variables are allocated to the connection in any of the network elements between the client and server

TCP segment structure

Header layout (each row is 32 bits wide):
source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | Receive window
checksum | Urg data pointer
Options (variable length)
application data (variable length)

Field notes:
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence and acknowledgement numbers: counting by bytes of data (not segments)
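The header layout can be made concrete by packing and unpacking the fixed 20-byte part with Python's `struct` module. This is a sketch, not a working TCP stack: the function names are hypothetical, the field widths follow the standard TCP header (16-bit ports, 32-bit sequence and ACK numbers, 4-bit header length, 6 flag bits, 16-bit window, checksum and urgent pointer), options are omitted, and the checksum is left at zero rather than computed.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / not used / flags" word.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

# Network byte order: ports, seq #, ACK #, head-len/flags, window, checksum, urgent ptr.
HEADER = struct.Struct("!HHIIHHHH")


def pack_header(src, dst, seq, ack, flags, window, checksum=0, urg_ptr=0):
    """Build a 20-byte header without options (checksum is NOT computed here)."""
    head_len_words = 5  # 5 x 32-bit words = 20 bytes
    offset_flags = (head_len_words << 12) | flags
    return HEADER.pack(src, dst, seq, ack, offset_flags, window, checksum, urg_ptr)


def unpack_header(raw):
    """Decode the fixed part of a header into a dictionary."""
    src, dst, seq, ack, offset_flags, window, checksum, urg_ptr = HEADER.unpack(raw[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "head_len_bytes": (offset_flags >> 12) * 4,
        "flags": offset_flags & 0x3F,
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


raw = pack_header(1234, 80, seq=42, ack=79, flags=ACK | PSH, window=4096)
print(len(raw), unpack_header(raw))
```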

TCP seq. #'s and ACKs

Seq. #'s: byte stream "number" of first byte in segment's data
ACKs: seq # of next byte expected from other side; cumulative ACK
Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; up to implementer

Simple telnet scenario (time runs downward):
1. User at Host A types 'C'. A -> B: Seq=42, ACK=79, data = 'C'
2. Host B ACKs receipt of 'C', echoes back 'C'. B -> A: Seq=79, ACK=43, data = 'C'
3. Host A ACKs receipt of echoed 'C'. A -> B: Seq=43, ACK=80
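The numbers in the telnet scenario follow mechanically from the two rules above. The sketch below replays the exchange; the `Endpoint` class and its method names are hypothetical, and out-of-order data is simply ignored, which is only one of the choices left open to the implementer.

```python
class Endpoint:
    """Tracks the next seq # to send and the next byte expected from the peer."""

    def __init__(self, next_seq, next_ack):
        self.next_seq = next_seq
        self.next_ack = next_ack

    def send(self, data=b""):
        # Seq # = number of the first data byte; ACK # = next byte expected.
        segment = (self.next_seq, self.next_ack, data)
        self.next_seq += len(data)
        return segment

    def receive(self, segment):
        seq, _ack, data = segment
        if seq == self.next_ack:  # in-order data only (assumed policy)
            self.next_ack += len(data)


host_a = Endpoint(next_seq=42, next_ack=79)
host_b = Endpoint(next_seq=79, next_ack=42)

s1 = host_a.send(b"C")   # user types 'C'
host_b.receive(s1)
s2 = host_b.send(b"C")   # B ACKs 'C' and echoes it back
host_a.receive(s2)
s3 = host_a.send()       # A ACKs the echoed 'C', no data
host_b.receive(s3)
print(s1, s2, s3)
```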

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT, but RTT varies
  too short: premature timeout, unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother": average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
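The "decreases exponentially fast" claim can be checked by unrolling the recurrence: a sample that is k updates old carries weight α(1 - α)^k. The sketch below compares the iterated update with the unrolled sum; the function names and the sample values are made up for illustration.

```python
ALPHA = 0.125  # typical value from the slide


def update_estimated_rtt(estimated, sample, alpha=ALPHA):
    """One step of the exponential weighted moving average."""
    return (1 - alpha) * estimated + alpha * sample


def sample_weight(age, alpha=ALPHA):
    """Weight of a sample that is `age` updates old (0 = most recent)."""
    return alpha * (1 - alpha) ** age


def ewma(initial, samples, alpha=ALPHA):
    """Apply the update once per sample, oldest first."""
    estimated = initial
    for sample in samples:
        estimated = update_estimated_rtt(estimated, sample, alpha)
    return estimated


samples = [120.0, 80.0, 200.0]  # made-up SampleRTT values in ms
iterated = ewma(100.0, samples)
unrolled = (1 - ALPHA) ** len(samples) * 100.0 + sum(
    sample_weight(len(samples) - 1 - i) * s for i, s in enumerate(samples)
)
print(iterated, unrolled)
print([sample_weight(age) for age in range(4)])
```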

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), ticks from 100 to 350; curves: SampleRTT and EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
  large variation in EstimatedRTT -> larger safety margin
first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 * DevRTT
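The three formulas combine into a small estimator. The sketch below is one possible arrangement under stated assumptions: the class name is hypothetical, DevRTT is updated before EstimatedRTT (so the deviation is measured against the previous estimate, the ordering used in RFC 6298), and the initial DevRTT is set to half the first sample, also following that RFC. The slides do not fix either choice.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumed initial value

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Deviation is measured against the estimate before this update (assumed order).
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * abs(
            sample_rtt - self.estimated_rtt
        )
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)  # made-up values in ms
print(estimator.timeout_interval())
print(estimator.update(180.0))
```

A single outlying sample moves EstimatedRTT only slightly but widens the safety margin noticeably, which is the intent of the 4 * DevRTT term.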

TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment: SendBase-1 is the last cumulatively ACKed byte
Example: SendBase-1 = 71 and y = 73, so the rcvr wants 73+; y > SendBase, so new data is acked
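The simplified sender can be written as three event handlers. The following Python sketch is an illustration under assumptions, not an implementation of TCP: the class and attribute names are hypothetical, the timer is reduced to a boolean flag, "pass segment to IP" is modelled by appending to a list, and sequence-number wraparound is ignored.

```python
class SimplifiedSender:
    """Event handlers of the simplified TCP sender (no dup ACKs, no flow control)."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # seq # -> data of not-yet-acknowledged segments
        self.timer_running = False  # stand-in for the single retransmission timer
        self.sent = []              # log of (seq #, data) "passed to IP"

    def on_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq += len(data)

    def on_timeout(self):
        if not self.unacked:
            return
        seq = min(self.unacked)  # smallest not-yet-acknowledged seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # A cumulative ACK covers every segment that starts below y.
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sender = SimplifiedSender(initial_seq=92)
sender.on_data(b"a" * 8)    # Seq=92, 8 bytes
sender.on_data(b"b" * 20)   # Seq=100, 20 bytes
sender.on_ack(120)          # one cumulative ACK covers both segments
print(sender.send_base, sender.unacked, sender.timer_running)
```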

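TCP retransmission scenarios

Lost ACK scenario (time runs downward):
Host A sends Seq=92, 8 bytes data
Host B replies ACK=100, which is lost (X)
Seq=92 timeout: Host A resends Seq=92, 8 bytes data
Host B replies ACK=100 again
SendBase = 100

Premature timeout scenario (time runs downward):
Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
Host B replies ACK=100, then ACK=120
Seq=92 timeout occurs before ACK=100 arrives: Host A resends Seq=92, 8 bytes data
SendBase = 100, then SendBase = 120 as the two ACKs arrive
Host B replies ACK=120 to the retransmission
SendBase = 120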

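TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
Host B replies ACK=100, which is lost (X), then ACK=120
ACK=120 arrives before the timeout for Seq=92, so nothing is retransmitted
SendBase = 120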

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver / TCP receiver action

1. Event: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
   Action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

2. Event: arrival of in-order segment with expected seq #. One other segment has ACK pending
   Action: immediately send single cumulative ACK, ACKing both in-order segments

3. Event: arrival of out-of-order segment with higher-than-expected seq #. Gap detected
   Action: immediately send duplicate ACK, indicating seq # of next expected byte

4. Event: arrival of segment that partially or completely fills gap
   Action: immediately send ACK, provided that segment starts at lower end of gap
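The four rows of the table can be read as a decision function. The sketch below is a hedged illustration: the function name, its parameters and the returned strings are hypothetical, and a segment with a lower-than-expected seq # (a case the table does not list) is treated like row 3 as an assumption.

```python
def ack_action(seg_seq, expected_seq, ack_pending=False, gap_open=False):
    """Pick the receiver action for an arriving segment.

    ack_pending: an earlier in-order segment is still waiting for its delayed ACK.
    gap_open:    out-of-order data is buffered beyond expected_seq.
    """
    if seg_seq != expected_seq:
        # Row 3 (higher-than-expected); lower-than-expected is an assumed extension.
        return "immediate duplicate ACK"
    if gap_open:
        # Row 4: the segment starts at the lower end of the gap.
        return "immediate ACK"
    if ack_pending:
        # Row 2: one ACK covers both in-order segments.
        return "immediate cumulative ACK"
    # Row 1: nothing else outstanding.
    return "delayed ACK (wait up to 500 ms)"


print(ack_action(100, 100))
print(ack_action(100, 100, ack_pending=True))
print(ack_action(120, 100))
print(ack_action(100, 100, gap_open=True))
```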

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of Congestion Control
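The doubling rule yields a geometric sequence of intervals. A minimal sketch, with a hypothetical helper name and a made-up base interval of 1 second:

```python
def backoff_intervals(base_interval, consecutive_timeouts):
    """Timeout interval in effect before and after each consecutive timeout."""
    return [base_interval * 2 ** k for k in range(consecutive_timeouts + 1)]


# Base interval computed from EstimatedRTT and DevRTT (assumed to be 1.0 s here),
# followed by three consecutive timeouts.
print(backoff_intervals(1.0, 3))
```

Once new data or an ACK restarts the timer, the sequence starts again from the freshly computed base interval.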

Fast Retransmit

Time-out period often relatively long
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
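The duplicate-ACK branch can be sketched in a few lines of Python. This is an illustration only: the class name is hypothetical, the timer handling from the first branch is left out, and "resend" is modelled by recording the sequence number in a list.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handler that counts duplicate ACKs and triggers fast retransmit."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()  # ACK value y -> duplicate ACKs seen for y
        self.resent = []           # seq #'s resent before any timeout

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y     # new data acknowledged (timer logic omitted)
        else:
            self.dup_acks[y] += 1  # duplicate ACK for already ACKed data
            if self.dup_acks[y] == 3:
                self.resent.append(y)  # fast retransmit of the segment starting at y


sender = FastRetransmitSender(send_base=92)
for y in (100, 100, 100, 100, 100):
    sender.on_ack(y)
print(sender.send_base, sender.resent)
```

The first ACK=100 acknowledges new data; the retransmission is triggered by the third duplicate after it, and a further duplicate does not trigger it again.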

TCP: GBN or Selective Repeat?

                                                                                Basic TCP looks a lot like GBN

                                                                                Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                This looks a lot like Selective Repeat

                                                                                TCP is a hybrid

TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose was to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
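A minimal Python sketch of this bookkeeping. The function names and the byte counts are illustrative assumptions, not part of any TCP API, and out-of-order segments are ignored as on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, in bytes (the advertised RcvWindow)."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sendable_bytes(last_byte_sent, last_byte_acked, window):
    """How many more bytes the sender may put in flight under the advertised window."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, window - unacked)


# Hypothetical numbers: 4096-byte buffer, 3000 bytes received, 1000 read by the app.
window = rcv_window(4096, 3000, 1000)
print(window, sendable_bytes(5000, 4000, window))
```

Because the sender never lets its unACKed data exceed the advertised value, the receiver always has room for whatever is in flight.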


Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender
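The deadlock and its fix can be sketched as a toy simulation in Python. Time is divided into hypothetical "ticks", the function name is made up for illustration, and the model assumes the receiver only ever speaks when it has something to ACK.

```python
def ticks_until_window_opens(rcv_buffer, reads_per_tick, probing):
    """Return the tick at which the sender learns RcvWindow > 0, or None if it never does.

    The receive buffer starts full and every received byte is already ACKed.
    reads_per_tick lists how many bytes the receiving application reads in each tick.
    """
    buffered = rcv_buffer
    for tick, nbytes in enumerate(reads_per_tick, start=1):
        buffered = max(0, buffered - nbytes)
        window = rcv_buffer - buffered
        # A one-byte probe forces the receiver to answer with an ACK that
        # carries the current RcvWindow. Without a probe the receiver has
        # nothing to ACK and stays silent, so the sender never learns.
        if probing and window > 0:
            return tick
    return None


reads = [0, 0, 500, 500]  # the application starts reading in tick 3
print(ticks_until_window_opens(1000, reads, probing=True))
print(ticks_until_window_opens(1000, reads, probing=False))
```

With probing the sender resumes as soon as the application frees space; without it the function returns `None`, which is the deadlock described above.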


                                                                                Note on UDP

                                                                                UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full then packets are lost
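A small Python sketch of that behavior. The class name is hypothetical, and capacity is counted in datagrams to keep the example short (an assumption; real socket buffers are sized in bytes).

```python
from collections import deque


class UdpSocketBuffer:
    """Toy receive buffer with no flow control: arrivals beyond capacity are dropped."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed unit: number of datagrams
        self.queue = deque()
        self.dropped = 0

    def deliver(self, datagram):
        """Append an arriving datagram; return False if it had to be discarded."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1      # the sender is never told about the loss
            return False
        self.queue.append(datagram)
        return True


buf = UdpSocketBuffer(capacity=2)
print([buf.deliver(d) for d in (b"a", b"b", b"c")], buf.dropped)
```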



TCP Connection Management

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();


TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
    Allocates buffers
    Can include application data
    SYN=0 signals that connection is established
    server_isn+1 signals that it is synchronized

[Figure: three-way handshake between client and server]
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
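The three segments can be written out as a short Python sketch. The dictionary layout and function name are illustrative assumptions; real sequence numbers are 32-bit and wrap around, which this sketch ignores.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as dictionaries of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    # The server acknowledges the client's ISN and announces its own.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # The client acknowledges the server's ISN; SYN=0 marks the connection as established.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```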


TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN

[Figure: connection close. Client sends FIN; server replies with ACK, then sends its own FIN; client replies with ACK and goes through timed wait before closed]


TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
    Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
    Closes down after timed wait

Step 4: server receives ACK. Connection closed

Note: with small modification, can handle simultaneous FINs

[Figure: the same FIN/ACK exchange, with client and server states: closing, timed wait, closed]
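A minimal Python sketch of the client side of this close sequence. The class and method names are hypothetical; the state names follow the figure labels, except "established", which is an assumed label for the open connection.

```python
class ClosingClient:
    """Client side of the four-step close."""

    def __init__(self):
        self.state = "established"

    def close(self):
        """Step 1: the application closes the socket, so a FIN is sent."""
        self.state = "closing"
        return "FIN"

    def on_segment(self, segment):
        """Return the segment to send in reply, or None if there is none."""
        if segment == "FIN" and self.state in ("closing", "timed wait"):
            # Step 3: ACK the server's FIN. A retransmitted FIN (our ACK was
            # lost) arrives during timed wait and is ACKed again.
            self.state = "timed wait"
            return "ACK"
        return None  # e.g. the server's ACK of our FIN needs no reply

    def on_timed_wait_expired(self):
        """The client closes down only after the timed wait is over."""
        if self.state == "timed wait":
            self.state = "closed"
```

The timed wait is what lets the client answer a repeated FIN; if it closed immediately, a lost final ACK would leave the server waiting.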


TCP Connection Management (cont)

[Figure: Example TCP server lifecycle]

[Figure: Example TCP client lifecycle]


                                                                                A few special cases

                                                                                Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment



Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for network to handle"
    different from flow control!
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
    (b) and (c): more work (retrans) for given "goodput"
    (c): unneeded retransmissions, link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
    when packet dropped, any "upstream" transmission capacity used for that packet was wasted!


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

ABR: available bit rate:
    "elastic service"
    if sender's path "underloaded":
        sender should use available bandwidth
    if sender's path congested:
        sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    Signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
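A simplified Python sketch of how the ER and CI values in a returned RM cell could come about. The function name, the dictionary layout and the rate numbers are illustrative assumptions, and every switch is modeled as simply capping ER at the rate it can support.

```python
def return_rm_cell(requested_rate, switch_supportable_rates, preceding_efci):
    """Return the ER and CI fields of the RM cell as it arrives back at the source."""
    er = requested_rate
    for supportable in switch_supportable_rates:
        er = min(er, supportable)        # a congested switch may lower ER
    # The destination sets CI=1 if the data cell just before the RM cell had EFCI=1.
    ci = 1 if preceding_efci else 0
    return {"ER": er, "CI": ci}


print(return_rm_cell(100, [80, 60, 90], preceding_efci=True))
```

The ER value that reaches the sender is the minimum over the path, which is why the send rate ends up at the lowest rate any switch can support.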



TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT  Bytes/sec
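The throughput relation as a small Python helper. The function name and the sample numbers (one 500-byte segment per 200 ms RTT) are illustrative assumptions.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


rate = window_throughput(w_segments=1, mss_bytes=500, rtt_seconds=0.2)
print(rate, "bytes/sec =", rate * 8 / 1000, "kbps")
```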


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

    LastByteSent - LastByteAcked <= CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after loss event

three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 and then grows exponentially
    conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
    also known as congestion avoidance

[Figure: Long-lived TCP connection. Congestion window (8, 16, 24 Kbytes) vs. time]
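A toy Python trace of AIMD. The function name is hypothetical, the window is counted in MSS, and the model assumes a loss event occurs exactly when the window reaches a fixed size (a simplification; real losses depend on the network).

```python
def aimd_trace(start_mss, loss_at_mss, rounds):
    """CongWin (in MSS) at the start of each RTT."""
    window = start_mss
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window >= loss_at_mss:
            window = max(1, window // 2)   # multiplicative decrease after loss
        else:
            window += 1                    # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rounds=10))
# -> [4, 5, 6, 7, 8, 4, 5, 6, 7, 8]
```

The repeated climb and halving is the sawtooth of a long-lived connection.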


TCP Slow Start

When connection begins, CongWin = 1 MSS
    Example: MSS = 500 bytes & RTT = 200 msec
    initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment, then two segments, then four segments to Host B, one RTT apart]
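A short Python sketch of why a per-ACK increment doubles the window each RTT. The function name is hypothetical, the window is counted in MSS, and the model assumes no loss and one ACK per segment.

```python
def slow_start(rtts):
    """CongWin (in MSS) at the start of each RTT during slow start."""
    cong_win = 1
    history = [cong_win]
    for _ in range(rtts):
        acks = cong_win            # one ACK comes back per segment sent this RTT
        for _ in range(acks):
            cong_win += 1          # CongWin grows by 1 MSS for every ACK
        history.append(cong_win)
    return history


print(slow_start(3))
# -> [1, 2, 4, 8]
```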


So Far
    Slow-Start ramps up exponentially
    Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
    Introduce new variable threshold
    threshold initially very large
    Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
    Two different types of loss events:
        • 3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
        • Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start


Reason for treating 3 dup ACKs differently than timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 after a loss event

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase, window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly

When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold (only in TCP Reno)

When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                                                                                The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
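The event table can be sketched as a small Python state machine. The class and method names are hypothetical, the starting threshold is an arbitrary choice, and resetting the duplicate count on a new ACK or timeout is an assumption the table does not spell out.

```python
class RenoSender:
    """Congestion-control state of a simplified TCP Reno sender."""

    def __init__(self, mss=1.0, threshold=64.0):
        self.mss = mss
        self.cong_win = mss          # connection starts in slow start at 1 MSS
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0            # assumption: a new ACK resets the duplicate count
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)   # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        """Timeout: remember half the window, restart from 1 MSS in slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

A Tahoe-style sender would differ only in `on_duplicate_ack`, where it would take the same action as `on_timeout`.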


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
    Ignore slow start
    Let W be the window size when loss occurs
    When window is W, throughput is W/RTT
    Just after loss, window drops to W/2, throughput to W/(2 RTT)
    Average throughput: 0.75 W/RTT
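A quick Python check of the 0.75 factor, assuming the window climbs linearly by one segment per RTT from W/2 back up to W between losses. The function name and W = 16 are illustrative choices.

```python
def average_window(w):
    """Mean window size over one sawtooth cycle from w/2 up to w (w even, in segments)."""
    cycle = list(range(w // 2, w + 1))   # window in each RTT of the cycle
    return sum(cycle) / len(cycle)


w = 16
print(average_window(w), 0.75 * w)
# -> 12.0 12.0
```

Since the window rises linearly, its average is the midpoint of W/2 and W, which is 0.75 W; dividing by RTT gives the average throughput.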


TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

Requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
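The two numbers in this example can be reproduced with a few lines of Python, using the approximation throughput = 1.22 * MSS / (RTT * sqrt(L)) solved for L. The function names are illustrative assumptions.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Loss rate L that still allows the target throughput."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


print(int(required_window(10e9, 0.1, 1500)))          # about 83,333 segments
print(f"{required_loss_rate(10e9, 0.1, 1500):.1e}")   # about 2e-10
```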


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
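The convergence toward the equal-share line can be sketched with a toy simulation. It assumes both sessions see loss at the same instant whenever their combined rate exceeds the link capacity, and that each adds one unit per round otherwise; the function name and the starting rates are illustrative only.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Simulate two synchronized AIMD flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss event: both flows halve (multiplicative decrease).
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both flows add one unit (additive increase).
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2

# Start far from the fair share: flow 2 has most of the bandwidth.
x1, x2 = aimd_two_flows(x1=5.0, x2=80.0, capacity=100.0, rounds=500)
gap = abs(x1 - x2)
print(f"flow 1 = {x1:.3f}, flow 2 = {x2:.3f}, gap = {gap:.6f}")
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts it in two, so the gap shrinks toward zero.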


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
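The arithmetic in the parallel-connection example can be checked with a short sketch. It assumes every TCP connection on the link receives an equal share of R; the function name `app_share` is hypothetical.

```python
from fractions import Fraction

def app_share(existing_connections, opened_by_app):
    """Fraction of link rate R the new app gets, assuming equal per-connection shares."""
    return Fraction(opened_by_app, existing_connections + opened_by_app)

one = app_share(9, 1)      # app opens a single connection
eleven = app_share(9, 11)  # app opens eleven parallel connections
print(one, eleven)         # prints: 1/10 11/20
```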


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
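The two fixed-window cases can be combined into one small function. This is a sketch under one assumption not stated on the slide: K is taken to be the number of windows needed to cover the object, ceil(O / (W·S)). The function name and the sample numbers are illustrative.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency for a fixed congestion window of W segments (O, S in bits, R in bits/s)."""
    # Time the server waits after each window, if the ACK has not yet returned.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        # Case 1: the ACK returns before the window is exhausted, so no idling.
        return 2 * RTT + O / R
    # Case 2: the server idles after each of the first K-1 windows.
    K = math.ceil(O / (W * S))  # assumed definition of K
    return 2 * RTT + O / R + (K - 1) * stall

# Illustrative values: S/R = 1 s, RTT = 2 s, object of 16 segments.
small_window = fixed_window_latency(O=16000, S=1000, R=1000, RTT=2, W=2)
large_window = fixed_window_latency(O=16000, S=1000, R=1000, RTT=2, W=4)
print(small_window, large_window)
```

With W = 2 the server stalls for 1 s after each of the first 7 windows; with W = 4 the pipe stays full and only 2RTT + O/R remains.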


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timing diagram. The client initiates the TCP connection and requests the object (one RTT each); the server sends a first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, then transmission completes and the object is delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times


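TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the kth window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the kth window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server timing diagram. The client initiates the TCP connection and requests the object; the server sends a first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, then transmission completes and the object is delivered. Axes: time at client, time at server.]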


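TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.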
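The slow-start latency formula can be evaluated with a short sketch. K is found by searching for the smallest k with (2^k - 1)·S ≥ O, which matches the ceiling-of-log expression. Q is computed here by counting the windows whose idle time S/R + RTT - 2^(k-1)·S/R is still positive; that counting rule is an assumption read off the idle-time expression, not a formula given on the slide. The link rate and RTT below are illustrative values chosen so that O/S = 15 and Q = 2, as in the slide's example.

```python
def windows_to_cover(O, S):
    """K: smallest k such that (2^k - 1) * S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k

def idles_for_infinite_object(S, R, RTT):
    """Q: number of windows k with S/R + RTT - 2^(k-1) * S/R > 0 (assumed rule)."""
    q = 0
    while S / R + RTT - 2 ** q * S / R > 0:
        q += 1
    return q

def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for an object of O bits with segments of S bits."""
    K = windows_to_cover(O, S)
    Q = idles_for_infinite_object(S, R, RTT)
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# Assumed values: S/R = 1 s and RTT = 2 s, object of 15 segments.
K, Q, P, latency = slow_start_latency(O=15000, S=1000, R=1000, RTT=2)
print(K, Q, P, latency)  # prints: 4 2 2 22.0
```

The result agrees with summing the pieces directly: 2·RTT = 4 s, O/R = 15 s, and idle times of 2 s and 1 s after the first two windows.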


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
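The three response-time expressions can be compared side by side. In this sketch the slow-start idle time is left as a caller-supplied parameter, defaulting to zero. That is a simplification, since in practice the idle time differs between the three schemes. The 1 Mbps link rate and the conversion 5 Kbytes = 40,000 bits are assumptions for illustration.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times (seconds) for the three HTTP models; idle is an assumed extra term."""
    transmit = (M + 1) * O / R  # time to push the base page plus M images
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }

# RTT = 100 msec, O = 5 Kbytes (taken as 40,000 bits), M = 10, X = 5, R = 1 Mbps.
times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for scheme, seconds in times.items():
    print(f"{scheme}: {seconds:.2f} s")
```

With idle times ignored, the three schemes differ only in the number of round trips: 2(M+1), 3, and 2(M/X + 1).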


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"


rdt3.0: stop-and-wait operation

[Figure: sender/receiver timeline. First packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first packet bit arrives, then last packet bit arrives and receiver sends ACK; ACK arrives and sender sends next packet at t = RTT + L/R.]

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$


Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver


Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three pipelined packets. First packet bit transmitted at t = 0; last bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L/R.]

$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$

Increases utilization by a factor of 3!
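The two utilization figures can be checked numerically. The sketch below assumes L = 8000 bits, R = 1 Gbps and RTT = 30 ms, values chosen to match the 0.008 ms and 30.008 ms terms on the slides; the function name `sender_utilization` is only illustrative.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of each cycle the sender spends transmitting.

    window=1 models stop-and-wait; window=N models N packets
    sent back to back before the first ACK returns.
    """
    transmit_time = packet_bits / rate_bps   # L / R, in seconds
    cycle = rtt_s + transmit_time            # time until the first ACK arrives
    return min(1.0, window * transmit_time / cycle)


# Assumed example values: L = 8000 bits, R = 1 Gbps, RTT = 30 ms
stop_and_wait = sender_utilization(8000, 1e9, 0.030)
pipelined = sender_utilization(8000, 1e9, 0.030, window=3)
print(f"stop-and-wait:     {stop_and_wait:.5f}")
print(f"3-packet pipeline: {pipelined:.5f}")
```

Raising `window` further increases utilization linearly until the sender is busy for the whole cycle, which is why the result is capped at 1.0.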


Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unack'ed pkts allowed
• ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
• may receive duplicate ACKs (see receiver)
• only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• called a sliding-window protocol

                                                                                  3 Transport Layer 46Comp 361 Spring 2005

GBN: Sender

rdt_send() called: checks to see if window is full
• No: send out packet
• Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged
• This is the only event that triggers a resend

                                                                                  3 Transport Layer 47Comp 361 Spring 2005

GBN: sender extended FSM

Initially: base = 1, nextseqnum = 1

State: Wait

Event: rdt_send(data)
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

Event: timeout
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

Event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  Λ (no action)
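The sender FSM can be sketched as a small Python class. This is a minimal illustration, not a full implementation: there is no real network or clock, the timer is reduced to a boolean flag, checksums are omitted, and the `udt_send` callback and method names such as `on_ack` are assumptions made for the example.

```python
class GBNSender:
    """Go-Back-N sender state (sketch: no real network, clock, or checksum)."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send       # hypothetical callback that sends one packet
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}               # seq # -> packet, kept until acknowledged
        self.timer_running = False     # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        if self.nextseqnum >= self.base + self.N:
            return False               # window full: refuse_data
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True  # first outstanding packet starts the timer
        self.nextseqnum += 1
        return True

    def on_timeout(self):
        self.timer_running = True      # restart timer, then resend the whole window
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])

    def on_ack(self, acknum):
        # Cumulative ACK: everything up to and including acknum is done.
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        # max() is a defensive addition so a stale ACK cannot move base backward.
        self.base = max(self.base, acknum + 1)
        self.timer_running = self.base != self.nextseqnum


sent = []
sender = GBNSender(window_size=3, udt_send=sent.append)
accepted = [sender.rdt_send(d) for d in "abcd"]   # 4th call hits a full window
sender.on_ack(1)                                  # window slides to seq # 2
sender.on_timeout()                               # resends seq # 2 and 3
```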


GBN: receiver extended FSM

Initially: expectedseqnum = 1, sndpkt = make_pkt(0, ACK, chksum)

State: Wait

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

Event: default
  udt_send(sndpkt)

• If expected packet received: send ACK and deliver packet upstairs
• If out-of-order packet received: discard (don't buffer) -> no receiver buffering!
• Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs


More on receiver

• The receiver always sends ACK for the last correctly received packet with the highest in-order seq #
• Receiver only sends ACKs (no NAKs)
• Can generate duplicate ACKs
• Need only remember expectedseqnum
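The receiver behaviour described above fits in a few lines. The sketch below is illustrative only: corruption is passed in as a flag rather than computed from a checksum, and the `deliver_data` callback and the return-the-ACK-number convention are assumptions for the example.

```python
class GBNReceiver:
    """Go-Back-N receiver: only state is expectedseqnum and the last ACK sent."""

    def __init__(self, deliver_data):
        self.expectedseqnum = 1
        self.deliver_data = deliver_data   # hypothetical upcall to the application
        self.last_ack = 0                  # matches the initial make_pkt(0, ACK, chksum)

    def rdt_rcv(self, seqnum, data, corrupt=False):
        """Process one packet and return the ACK number that is sent back."""
        if not corrupt and seqnum == self.expectedseqnum:
            self.deliver_data(data)
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # Default case: out-of-order or corrupt packets are discarded and
        # the highest in-order seq # is re-ACKed (a duplicate ACK).
        return self.last_ack


delivered = []
receiver = GBNReceiver(delivered.append)
# Packet 2 is delayed, so packet 3 arrives early and is discarded.
acks = [receiver.rdt_rcv(1, "a"), receiver.rdt_rcv(3, "c"),
        receiver.rdt_rcv(2, "b"), receiver.rdt_rcv(3, "c")]
```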


GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data!

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.


Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
  • compare to GBN, which only had a timer for the base packet
• sender window
  • N consecutive seq #'s
  • again limits seq #s of sent, unACKed pkts
  • Important: window size < seq # range

                                                                                  3 Transport Layer 53Comp 361 Spring 2005

[Figure: Selective repeat: sender, receiver windows]

                                                                                  3 Transport Layer 54Comp 361 Spring 2005

Selective repeat

Sender:
• data from above: if next available seq # in window, send pkt
• timeout(n): resend pkt n, restart timer
• ACK(n) in [sendbase, sendbase+N]:
  • mark pkt n as received
  • if n is the smallest unACKed pkt, advance window base to next unACKed seq #

Receiver:
• pkt n in [rcvbase, rcvbase+N-1]:
  • send ACK(n)
  • out-of-order: buffer
  • in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
• pkt n in [rcvbase-N, rcvbase-1]:
  • ACK(n) (note: this is a re-ACK)
• otherwise: ignore
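The three receiver cases can be sketched as follows. To keep the example short it assumes unbounded integer sequence numbers (no wraparound), and the class name, `deliver_data` callback and return convention are illustrative choices rather than part of the protocol description.

```python
class SRReceiver:
    """Selective-repeat receiver (sketch; seq #s are plain integers, no wraparound)."""

    def __init__(self, window_size, deliver_data):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}                   # out-of-order pkts waiting for a gap to fill
        self.deliver_data = deliver_data   # hypothetical upcall to the application

    def on_packet(self, n, data):
        """Return the ACK number sent for pkt n, or None if the pkt is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver every in-order pkt now available and slide the window.
            while self.rcvbase in self.buffer:
                self.deliver_data(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n                       # already delivered: re-ACK only
        return None                        # outside both ranges: ignore


delivered = []
receiver = SRReceiver(window_size=4, deliver_data=delivered.append)
# pkt 1 is delayed; pkts 2 and 3 are buffered until it arrives.
acks = [receiver.on_packet(n, f"p{n}") for n in (0, 2, 3, 1, 0, 9)]
```

The re-ACK branch matters because the sender cannot advance its window until it sees an ACK for its oldest outstanding packet, even if the receiver delivered that packet long ago.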

                                                                                  3 Transport Layer 55Comp 361 Spring 2005

                                                                                  Selective repeat in action

                                                                                  3 Transport Layer 56Comp 361 Spring 2005

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3
• receiver sees no difference in the two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
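One way to explore the question is to model the worst case directly: the receiver has accepted a full window and advanced, but every ACK was lost, so the sender retransmits the old window. The sketch below (the function name and the set-based model are assumptions for illustration) checks whether old and new sequence numbers can collide; under this model they do not collide when the window is at most half the sequence-number space.

```python
def receiver_can_confuse(seq_space, window):
    """True if a retransmitted old pkt can fall inside the receiver's new window.

    Worst case modelled: pkts 0..window-1 were all received, all ACKs were lost,
    so the sender resends them while the receiver expects the next `window` pkts.
    """
    old = {n % seq_space for n in range(0, window)}            # what the sender resends
    new = {n % seq_space for n in range(window, 2 * window)}   # what the receiver expects
    return bool(old & new)


dilemma = receiver_can_confuse(seq_space=4, window=3)   # the example on the slide
safe = receiver_can_confuse(seq_space=4, window=2)
```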

                                                                                  3 Transport Layer 57Comp 361 Spring 2005

                                                                                  Chapter 3 outline

                                                                                  31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

                                                                                  35 Connection-oriented transport TCP

                                                                                  segment structurereliable data transferflow controlconnection management

                                                                                  36 Principles of congestion control37 TCP congestion control

                                                                                  3 Transport Layer 58Comp 361 Spring 2005

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point: one sender, one receiver
• reliable, in-order byte stream: no "message boundaries"
• pipelined: TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled: sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

                                                                                  3 Transport Layer 59Comp 361 Spring 2005

More TCP Details

Maximum Segment Size (MSS)
• Depends upon implementation (can often be set)
• The max amount of application-layer data in a segment
• Application Data + TCP Header = TCP Segment

Three-way handshake
• Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
• Server responds with second special TCP segment (again no payload)
• Client responds with third special segment. This can contain payload.

                                                                                  3 Transport Layer 60Comp 361 Spring 2005

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
(i) buffers,
(ii) variables, and
(iii) a socket connection to the process

TCP only exists in the two end machines
• No buffers or variables are allocated to the connection in any of the network elements between the host and server

                                                                                  3 Transport Layer 61Comp 361 Spring 2005

TCP segment structure

Each row is 32 bits wide:

  source port #                      | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum                           | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
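As a rough illustration of the layout, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. The function name, the returned dictionary keys and the sample segment are assumptions made for this sketch; options are skipped rather than decoded.

```python
import struct

# Bit positions of the six flags in the flags byte (U A P R S F, high to low).
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header; options (if any) are skipped."""
    (src, dst, seq, ack, offset_reserved, flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_reserved >> 4) * 4   # head len field counts 32-bit words
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": {name: bool(flags & bit) for name, bit in FLAG_BITS.items()},
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "payload": segment[header_len:],
    }


# Hand-built sample segment: ports 1234 -> 23, Seq=42, ACK=79, PSH+ACK set,
# 4096-byte receive window, one byte of data. Checksum left at 0 for brevity.
raw = struct.pack("!HHIIBBHHH", 1234, 23, 42, 79, 5 << 4, 0x18, 4096, 0, 0) + b"C"
hdr = parse_tcp_header(raw)
```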

                                                                                  3 Transport Layer 62Comp 361 Spring 2005

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data

ACKs:
• seq # of next byte expected from other side
• cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; up to implementer

[Figure: simple telnet scenario, time running downward]
1. User at Host A types 'C': A -> B, Seq=42, ACK=79, data = 'C'
2. Host B ACKs receipt of 'C', echoes back 'C': B -> A, Seq=79, ACK=43, data = 'C'
3. Host A ACKs receipt of echoed 'C': A -> B, Seq=43, ACK=80
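The numbers in the telnet scenario follow from one rule: the ACK field carries the sequence number of the next byte expected. A small sketch, using dictionaries as stand-ins for segments (an assumption for illustration), reproduces the three segments from the starting values 42 and 79:

```python
def ack_for(seq, data):
    """ACK number = seq # of the next byte expected after this segment's data."""
    return seq + len(data)


# Starting numbers taken from the telnet scenario.
seg1 = {"seq": 42, "ack": 79, "data": b"C"}             # A -> B: user types 'C'
seg2 = {"seq": seg1["ack"],                             # B -> A: ACK plus echo
        "ack": ack_for(seg1["seq"], seg1["data"]),
        "data": b"C"}
seg3 = {"seq": seg2["ack"],                             # A -> B: ACK of the echo
        "ack": ack_for(seg2["seq"], seg2["data"]),
        "data": b""}
```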

                                                                                  3 Transport Layer 63Comp 361 Spring 2005

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
• longer than RTT
  • but RTT varies
• too short: premature timeout
  • unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions
• SampleRTT will vary; want estimated RTT "smoother"
  • average several recent measurements, not just current SampleRTT

                                                                                  3 Transport Layer 64Comp 361 Spring 2005

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)·EstimatedRTT + α·SampleRTT

• Exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125

                                                                                  3 Transport Layer 65Comp 361 Spring 2005

Example RTT estimation

[Figure: RTT, gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT, EstimatedRTT]

                                                                                  3 Transport Layer 66Comp 361 Spring 2005

TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
  • large variation in EstimatedRTT -> larger safety margin
• first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β)·DevRTT + β·|SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4·DevRTT
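The three formulas combine into a small estimator. The sketch below makes two assumptions the slides leave open: DevRTT starts at 0, and on each sample DevRTT is updated before EstimatedRTT (so the deviation is measured against the previous estimate). The class and attribute names are illustrative.

```python
class RttEstimator:
    """EWMA-based RTT estimator with the typical α = 0.125 and β = 0.25."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = 0.0   # assumed starting value; not specified on the slides

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumed order: deviation first, measured against the previous estimate.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)   # milliseconds
timeout = estimator.update(200.0)              # one unusually slow sample
```

A single outlier moves EstimatedRTT only slightly but widens the safety margin considerably, which is the intended effect of the 4·DevRTT term.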

                                                                                  3 Transport Layer 67Comp 361 Spring 2005

                                                                                  Chapter 3 outline

                                                                                  31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

                                                                                  35 Connection-oriented transport TCP

                                                                                  segment structurereliable data transferflow controlconnection management

                                                                                  36 Principles of congestion control37 TCP congestion control

                                                                                  3 Transport Layer 68Comp 361 Spring 2005

TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  • timeout events
  • duplicate acks
• Initially consider simplified TCP sender:
  • ignore duplicate acks
  • ignore flow control, congestion control

                                                                                  3 Transport Layer 69Comp 361 Spring 2005

TCP sender events

data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

Ack rcvd:
• If it acknowledges previously unacked segments
  • update what is known to be acked
  • start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
          start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

  event: ACK received, with ACK field value of y
      if (y > SendBase) {
          SendBase = y
          if (there are currently not-yet-acknowledged segments)
              start timer
      }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
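
The simplified sender above can be sketched as a small event-driven class. This is only an illustration under stated assumptions: the class name, the `unacked` dictionary and the `sent` log are hypothetical bookkeeping that the slides do not define, and the timer is reduced to a boolean flag instead of a real clock.

```python
class SimplifiedTcpSender:
    """Sketch of the simplified sender: no dup-ACK handling, no flow or congestion control."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.timer_running = False
        self.unacked = {}   # hypothetical: seq number -> payload not yet acknowledged
        self.sent = []      # hypothetical log of (seq, payload) passed to "IP"

    def on_app_data(self, data):
        """event: data received from application above"""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        """event: timer timeout; resend the oldest unacknowledged segment"""
        if self.unacked:
            seq = min(self.unacked)
            self.sent.append((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, y):
        """event: ACK received, with ACK field value of y (cumulative)"""
        if y > self.send_base:
            self.send_base = y
            # Forget every segment whose last byte is below y.
            for seq in [s for s in self.unacked if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            # Keep the timer only while something is still outstanding.
            self.timer_running = bool(self.unacked)
```

Feeding it `on_app_data` with 8 bytes and then 20 bytes starting at sequence number 92, followed by `on_ack(120)`, reproduces the cumulative ACK behaviour: `send_base` becomes 120 and nothing is left to retransmit.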


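TCP retransmission scenarios

[Figure: two timing diagrams between Host A and Host B]

Lost ACK scenario: Host A sends Seq=92, 8 bytes data. Host B replies ACK=100, but the ACK is lost (X). The Seq=92 timer expires, Host A retransmits Seq=92, 8 bytes data, and Host B replies ACK=100 again. SendBase = 100.

Premature timeout: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B replies ACK=100 and ACK=120. The Seq=92 timer expires too early, so Host A retransmits Seq=92, 8 bytes data, and Host B replies ACK=120. SendBase = 100, then SendBase = 120, and it stays at 120.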


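TCP retransmission scenarios (more)

[Figure: timing diagram between Host A and Host B]

Cumulative ACK scenario: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted. SendBase = 120.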


TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP Receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at Receiver: Arrival of out-of-order segment, higher-than-expected seq #. Gap detected.
TCP Receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at Receiver: Arrival of segment that partially or completely fills gap.
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
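
The four rows of the table can be read as one decision function. The sketch below is a hedged illustration, not receiver code from any real stack: the function name and its parameters are hypothetical, and the case of a segment below the expected sequence number is left out because the table does not cover it.

```python
def receiver_action(seg_seq, expected_seq, ack_pending=False, gap_buffered=False):
    """Pick the receiver action from the table for one arriving segment.

    seg_seq       sequence number of the arriving segment
    expected_seq  next in-order byte the receiver is waiting for
    ack_pending   True if an earlier in-order segment still awaits its delayed ACK
    gap_buffered  True if out-of-order data above expected_seq is already buffered,
                  so a segment at expected_seq fills the lower end of a gap
    """
    if seg_seq > expected_seq:
        # Higher than expected: a gap has been detected.
        return "immediate duplicate ACK"
    if seg_seq == expected_seq:
        if gap_buffered:
            return "immediate ACK"
        if ack_pending:
            return "immediate cumulative ACK"
        return "delayed ACK (wait up to 500 ms)"
    raise ValueError("segment below expected seq: case not covered by the table")
```

For example, `receiver_action(120, 100)` reports a duplicate ACK, because byte 100 is still missing.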


More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of Congestion Control
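
A minimal sketch of the doubling policy follows. It assumes the reset rule "described previously" is the usual TimeoutInterval = EstimatedRTT + 4 * DevRTT; the class and method names are hypothetical, and times are plain numbers in milliseconds.

```python
class RetransmissionTimer:
    """Timeout interval that doubles on each timeout and resets on restart."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.computed_interval()

    def computed_interval(self):
        # Assumed reset formula: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        # After the retransmission, back off exponentially.
        self.interval *= 2

    def on_restart(self):
        # New data from above or an ACK received: return to the computed value.
        self.interval = self.computed_interval()
```

With EstimatedRTT = 100 and DevRTT = 25 the interval starts at 200, grows to 400 and then 800 after two consecutive timeouts, and returns to 200 on the next restart.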


                                                                                  Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs:
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
  fast retransmit: resend segment before timer expires


Fast retransmit algorithm:

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3)
            resend segment with sequence number y    /* fast retransmit */
    }
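
The ACK handler above can be sketched in a few lines. This is an illustration only: the class name and the `retransmitted` list are hypothetical, and the timer handling from the first branch is omitted to keep the duplicate-ACK logic in focus.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with the duplicate-ACK counter from the pseudocode."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()   # ACK value y -> number of duplicate ACKs seen
        self.retransmitted = []     # hypothetical log of resent sequence numbers

    def on_ack(self, y):
        if y > self.send_base:
            # New data is acknowledged.
            self.send_base = y
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3:
                # Fast retransmit: resend before the timer expires.
                self.retransmitted.append(y)
```

Starting at SendBase = 92, a first ACK of 100 advances SendBase; the third further ACK of 100 triggers one retransmission of the segment with sequence number 100, and later duplicates do not trigger another.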


TCP: GBN or Selective Repeat?

                                                                                  Basic TCP looks a lot like GBN

                                                                                  Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                  This looks a lot like Selective Repeat

                                                                                  TCP is a hybrid




                                                                                  TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer

speed-matching service: matching the send rate to the receiving app's drain rate

app process may be slow at reading from buffer


TCP segment structure

[Figure: TCP segment layout, 32 bits wide]

source port #, dest port #
sequence number (counting by bytes of data, not segments)
acknowledgement number (counting by bytes of data, not segments)
head len, not used, flag bits U A P R S F, Receive window (# bytes rcvr willing to accept)
checksum (Internet checksum, as in UDP), Urg data pnter
Options (variable length)
application data (variable length)

Flag bits:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
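
The window arithmetic can be checked with a short sketch. `rcv_window` follows the formula above directly; `sender_may_send` and its `last_byte_sent` / `last_byte_acked` parameters are hypothetical names for the sender-side bookkeeping that the slide only describes as "unACKed data".

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps the unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window
```

With a 4096-byte buffer, 3000 bytes received and 1000 read, the advertised window is 2096 bytes; a sender that already has 1000 unACKed bytes may send 1000 more but not 1200.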


                                                                                  Technical Issue

Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to sender
DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender


                                                                                  Note on UDP

                                                                                  UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost




                                                                                  TCP Connection Management

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)

client: connection initiator
  Socket clientSocket = new Socket("hostname", "port number");

server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();


                                                                                  TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals synchronization

[Figure: three-way handshake between client and server]
  client to server: Connection request (SYN=1, seq=client_isn)
  server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
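
The sequence and acknowledgement numbers exchanged in the three steps can be written out as data. The sketch below only builds the three header tuples from the figure; the `Segment` type and the function name are hypothetical, and no real sockets or packets are involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None   # None when the ACK field carries nothing


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments in the order they are sent."""
    syn = Segment(syn=1, seq=client_isn)
    # The server acknowledges the SYN by replying with client_isn + 1.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # The client acknowledges the SYNACK with server_isn + 1 and SYN=0.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]
```

For `three_way_handshake(1000, 5000)` the last segment carries SYN=0, seq=1001 and ack=5001.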


                                                                                  TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies ACK, then sends FIN (close); client replies ACK and enters timed wait; closed]


                                                                                  TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed-wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client sends FIN (closing); server replies ACK, then sends FIN (closing); client replies ACK and enters timed wait; server closed on receiving ACK; client closed after timed wait]
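
The client side of the four closing steps can be sketched as a tiny state machine. The state names reuse the labels from the figure ("closing", "timed wait", "closed") and are a simplification; the class, its methods and the `sent` log are hypothetical, and the timed-wait expiry is triggered by an explicit call instead of a clock.

```python
class ClosingClient:
    """Client-side view of connection teardown, reduced to the slide's states."""

    def __init__(self):
        self.state = "established"
        self.sent = []   # hypothetical log of control segments sent

    def close(self):
        # Step 1: send FIN to the server.
        self.sent.append("FIN")
        self.state = "closing"

    def on_segment(self, kind):
        if kind == "FIN" and self.state in ("closing", "timed wait"):
            # Step 3: ACK the server's FIN; repeat the ACK for any FIN that
            # arrives during timed wait (the earlier ACK may have been lost).
            self.sent.append("ACK")
            self.state = "timed wait"
        # An ACK of our own FIN needs no reply; keep waiting for the server's FIN.

    def on_timed_wait_expired(self):
        if self.state == "timed wait":
            self.state = "closed"
```

A FIN that arrives a second time during timed wait is answered with a second ACK, which is the reason the client lingers before closing.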


                                                                                  TCP Connection Management (cont)

[Figure: Example TCP server lifecycle]

[Figure: Example TCP client lifecycle]


                                                                                  A few special cases

                                                                                  Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                  It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment




                                                                                  Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when packet dropped, any "upstream" transmission capacity used for that packet was wasted


                                                                                  Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    (small exception: see next page)

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI = 1, destination sets CI bit = 1 before returning RM cell to source
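The ER and CI handling described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `forward_rm_cell`, the list of per-switch supportable rates, and the boolean `efci_seen` are hypothetical and do not model real ATM cell formats.

```python
def forward_rm_cell(er, switch_rates, efci_seen):
    """Return (er, ci) as seen by the source when the RM cell comes back.

    er           -- explicit rate initially written by the sender (hypothetical units)
    switch_rates -- rate each switch on the path can support (assumed input)
    efci_seen    -- True if the data cell preceding the RM cell had EFCI = 1
    """
    for supportable in switch_rates:
        # A congested switch may lower the ER value, never raise it.
        er = min(er, supportable)
    # The destination sets CI = 1 if the preceding data cell carried EFCI = 1.
    ci = 1 if efci_seen else 0
    return er, ci
```

The returned ER is the minimum supportable rate on the path, which is what the sender would then use as its send rate.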


                                                                                  Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: Congwin marked over a sequence of segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
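As a quick check of the relation above, the following Python sketch evaluates it for given values. The function and parameter names are illustrative assumptions, not part of the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds
```

For example, 10 segments of 1000 bytes sent every 0.5 seconds gives 20000 bytes/sec.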


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
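The sending limit LastByteSent - LastByteAcked ≤ CongWin can be expressed as a simple predicate. This is a minimal sketch; the function name `can_send` and the `nbytes` parameter are assumptions made for illustration, and the receive window is ignored as the slide assumes.

```python
def can_send(last_byte_sent, last_byte_acked, congwin, nbytes):
    """True if sending nbytes more keeps unacked data within CongWin (bytes)."""
    in_flight_after = (last_byte_sent + nbytes) - last_byte_acked
    return in_flight_after <= congwin
```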


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing (also known as congestion avoidance)

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
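The AIMD rule can be traced with a short Python sketch. It assumes a hypothetical loss model in which a loss event occurs whenever the window reaches a fixed size `loss_at_mss`; real losses depend on the network, so this only illustrates the shape of the window over time.

```python
def aimd_trace(start_mss, loss_at_mss, rounds):
    """Return CongWin (in MSS) at the start of each RTT under AIMD."""
    congwin = start_mss
    trace = []
    for _ in range(rounds):
        trace.append(congwin)
        if congwin >= loss_at_mss:
            # Multiplicative decrease: cut the window in half (not below 1 MSS).
            congwin = max(1, congwin // 2)
        else:
            # Additive increase: one MSS per RTT.
            congwin += 1
    return trace
```

With `aimd_trace(4, 8, 12)` the window climbs 4, 5, 6, 7, 8, drops back to 4, and repeats.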


                                                                                  TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
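The 20 kbps figure follows from sending one MSS per RTT. A small Python check, with an illustrative function name that is not from the slides:

```python
def initial_rate_kbps(mss_bytes, rtt_msec):
    """Rate of one MSS per RTT; bits per millisecond equals kilobits per second."""
    return mss_bytes * 8 / rtt_msec
```

For MSS = 500 bytes and RTT = 200 msec this evaluates to 4000 bits / 200 msec = 20 kbps.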


                                                                                  TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B over successive RTTs: one segment, then two segments, then four segments]
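The reason per-ACK increments double the window each RTT can be seen in a short Python sketch. It assumes no loss and one ACK per segment sent; the function name is hypothetical.

```python
def slow_start_windows(rtts, mss=1):
    """Return CongWin (in units of mss) at the start of each RTT during slow start."""
    congwin = mss
    history = []
    for _ in range(rtts):
        history.append(congwin)
        acks = congwin // mss  # one ACK for every segment sent this RTT
        for _ in range(acks):
            congwin += mss  # increment CongWin by one MSS per ACK
    return history
```

Each segment in the window produces one ACK, and each ACK adds one MSS, so a window of n segments becomes 2n after one RTT.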


So far:
  Slow-Start ramps up exponentially
  Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
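The difference between Tahoe and Reno on a loss event can be summarized in a small Python sketch. Window sizes are in whole MSS units, and the function name, the string labels, and the integer halving are illustrative assumptions rather than anything prescribed by the slides.

```python
def on_loss(variant, event, congwin):
    """Return (threshold, new_congwin, next_state) in MSS units.

    variant -- "tahoe" or "reno"
    event   -- "3dupack" or "timeout"
    """
    threshold = max(congwin // 2, 1)
    if variant == "reno" and event == "3dupack":
        # Fast recovery: halve the window and continue in congestion avoidance.
        return threshold, threshold, "congestion avoidance"
    # Timeout (either variant) or any loss in Tahoe: restart from 1 MSS.
    return threshold, 1, "slow start"
```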


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                  The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
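The event/action table above maps naturally onto a small state machine. The Python sketch below is one possible reading of it, not a real TCP implementation: the class name `RenoSender` and its method names are hypothetical, and resetting the duplicate-ACK counter on a new ACK or a timeout is an assumption the table does not spell out.

```python
class RenoSender:
    """Toy model of the table: states 'SS' (slow start) and 'CA' (congestion avoidance)."""

    def __init__(self, mss=1.0, threshold=float("inf")):
        self.mss = mss
        self.congwin = mss          # connection starts with CongWin = 1 MSS
        self.threshold = threshold  # initially very large
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: counter restarts for the next segment
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += self.mss * (self.mss / self.congwin)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            # CongWin will not drop below 1 MSS.
            self.congwin = max(self.threshold, self.mss)
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and restart slow start from 1 MSS."""
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

A sender created with `RenoSender(mss=1.0, threshold=4.0)` leaves slow start once CongWin exceeds 4 MSS, after which each new ACK adds only MSS × (MSS/CongWin).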


                                                                                  TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to (W/2)/RTT
  Average throughput: 0.75 W/RTT
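The 0.75 factor is the mean of a window that rises linearly from W/2 to W. A short numerical check in Python, assuming an even W and one-MSS steps per RTT (the function name is illustrative):

```python
def average_sawtooth_window(w):
    """Average window over one AIMD cycle that grows from w/2 to w in 1-MSS steps."""
    low = w // 2
    samples = list(range(low, w + 1))  # window size in each RTT of the cycle
    return sum(samples) / len(samples)
```

For W = 100 the samples run from 50 to 100, whose mean is 75, that is 0.75 W.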


                                                                                  TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

L = 2·10^-10  Wow
New versions of TCP for high-speed needed
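Both numbers in this example can be reproduced with a few lines of Python, assuming the relation throughput = 1.22 · MSS / (RTT · √L) with MSS expressed in bits. The function names are hypothetical.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Loss rate L obtained by solving throughput = 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2
```

With 10 Gbps, 100 ms and 1500-byte segments, the window comes out at about 83,333 segments and the loss rate at roughly 2·10^-10.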


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
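The convergence toward the equal-share line can be illustrated with a small Python simulation. It assumes synchronized losses, meaning both connections halve together whenever their combined rate exceeds the capacity; that simplification and the function name are not from the slides.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Simulate two AIMD flows sharing one bottleneck; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows decrease multiplicatively, halving their gap too.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both flows increase additively, leaving their gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2
```

Additive increase keeps the difference between the two rates constant while each multiplicative decrease halves it, so the difference shrinks toward zero over repeated cycles.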


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets R/2
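The parallel-connection example is simple arithmetic if every TCP connection is assumed to receive an equal share of the link. A Python sketch using exact fractions, with an illustrative function name:

```python
from fractions import Fraction


def app_share(existing_connections, opened_by_app):
    """Fraction of link rate R an app gets when each connection has an equal share."""
    total = existing_connections + opened_by_app
    return Fraction(opened_by_app, total)
```

With 9 existing connections, one new connection gets 1/10 of R, while 11 new connections get 11/20 of R, which the slide rounds to R/2.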


TCP Latency Modeling

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
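The two cases can be combined into one Python function. This is a sketch under the slide's assumptions (single link, no loss); it additionally assumes that K is the number of windows covering the object, computed here as the ceiling of O/(WS), and the function name is hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments of S bits."""
    window_time = W * S / R
    if window_time >= RTT + S / R:
        # Case 1: ACKs return before the window is exhausted, so the server never stalls.
        return 2 * RTT + O / R
    # Case 2: the server stalls after each of the first K-1 windows.
    K = math.ceil(O / (W * S))  # assumed: number of windows that cover the object
    stall = S / R + RTT - window_time
    return 2 * RTT + O / R + (K - 1) * stall
```

For O = 8000 bits, S = 1000 bits, R = 1000 bps and RTT = 1 s, a window of 4 segments gives 10 s, while a window of 1 segment stalls seven times and gives 17 s.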


                                                                                  Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                                  Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                  TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


                                                                                  TCP Latency Modeling Slow Start (2)

[Figure: timeline at client and server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times
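The quantities K, Q and P, and the resulting latency 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R, can be computed directly from their definitions. In the Python sketch below, K is found as the smallest number of doubling windows (1, 2, 4, ... segments) that cover the object, and Q is found by counting windows after which the idle time S/R + RTT - 2^(k-1) S/R is still positive. The function name and the choice RTT = 2 S/R in the usage example are assumptions for illustration.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for an object of O bits sent with slow start.

    S is the segment size in bits and R the link rate in bits/sec.
    """
    # K: smallest k such that windows of 1, 2, ..., 2^(k-1) segments cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would idle for an infinite object.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency
```

Calling `slow_start_latency(15, 1, 1, 2)` models an object of 15 segments with RTT equal to twice the segment transmission time, and reproduces the example's K = 4, Q = 2 and P = 2.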


                                                                                  TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server timeline of slow-start windows (S/R, 2S/R, 4S/R, 8S/R), as on the previous slide]


                                                                                  TCP Latency Modeling (4)Recall K = number of windows that cover object

                                                                                  How do we calculate K

                                                                                  ⎥⎥⎤

                                                                                  ⎢⎢⎡ +=

                                                                                  +ge=

                                                                                  geminus=

                                                                                  ge+++=

                                                                                  ge+++=minus

                                                                                  minus

                                                                                  )1(log

                                                                                  )1(logmin

                                                                                  12min

                                                                                  222min222min

                                                                                  2

                                                                                  2

                                                                                  110

                                                                                  110

                                                                                  SO

                                                                                  SOkk

                                                                                  SOk

                                                                                  SOkOSSSkK

                                                                                  k

                                                                                  k

                                                                                  k

                                                                                  L

                                                                                  L

Calculation of Q, the number of idle periods for an infinite-size object, is similar.
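As a quick numerical check of the expression for K, the sketch below counts windows by direct summation and compares the result with the closed form. It assumes that O is the object size in bits and S the segment size in bits, and that the k-th window carries 2^(k-1) segments; the function names are illustrative only.

```python
import math

def windows_by_summation(O_bits, S_bits):
    """Smallest K such that (2^0 + 2^1 + ... + 2^(K-1)) * S >= O."""
    k = 0
    covered = 0
    while covered < O_bits:
        covered += (2 ** k) * S_bits  # the (k+1)-th window carries 2^k segments
        k += 1
    return k

def windows_closed_form(O_bits, S_bits):
    """K = ceil(log2(O/S + 1)), the closed form of the same quantity."""
    return math.ceil(math.log2(O_bits / S_bits + 1))

if __name__ == "__main__":
    for O in (1000, 15000, 16000):
        print(O, windows_by_summation(O, 1000), windows_closed_form(O, 1000))
```

With S = 1000 bits, an object of 15000 bits fits in four windows (1 + 2 + 4 + 8 segments), while 16000 bits needs a fifth.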


HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
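The three response-time formulas can be compared numerically with a short sketch. The idle-time sum is not derived here; it is passed in as one number and defaults to zero, which is an assumption made only to keep the example small. The function and parameter names are illustrative.

```python
def http_response_times(O_bits, R_bps, rtt_s, M, X, idle_s=0.0):
    """Return (non_persistent, persistent, parallel) response times in seconds.

    idle_s stands in for the "sum of idle times" term; it is an assumed
    input, not something this sketch computes.
    """
    transmit = (M + 1) * O_bits / R_bps              # (M+1) O/R
    non_persistent = transmit + (M + 1) * 2 * rtt_s + idle_s
    persistent = transmit + 3 * rtt_s + idle_s
    parallel = transmit + (M / X + 1) * 2 * rtt_s + idle_s
    return non_persistent, persistent, parallel

if __name__ == "__main__":
    # O = 5 Kbytes taken as 40,000 bits (assuming 1 Kbyte = 1000 bytes).
    for rate in (28e3, 100e3, 1e6, 10e6):
        print(rate, http_response_times(40_000, rate, 0.1, M=10, X=5))
```

At 1 Mbps with RTT = 100 msec, the RTT terms already dominate the 0.44 s of transmission time for the non-persistent case.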


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection and response time is dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay-bandwidth networks.


Chapter 3: Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet:
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



Pipelined protocols

Pipelining: sender allows multiple "in-flight", yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver


Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timing diagram for three pipelined packets]
• first packet bit transmitted, t = 0
• last bit transmitted, t = L/R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L/R

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increases utilization by a factor of 3.
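The utilization figure can be reproduced with a short sketch. It assumes that the numbers in the expression are in milliseconds, i.e. L/R = 0.008 ms and RTT = 30 ms, with three packets sent back to back; the function name is illustrative.

```python
def sender_utilization(n_pkts, l_over_r_ms, rtt_ms):
    """Fraction of time the sender is busy when n_pkts are sent per RTT + L/R.

    Capped at 1.0, because a sender cannot be busy more than all of the time.
    """
    return min(1.0, n_pkts * l_over_r_ms / (rtt_ms + l_over_r_ms))

if __name__ == "__main__":
    stop_and_wait = sender_utilization(1, 0.008, 30.0)
    pipelined = sender_utilization(3, 0.008, 30.0)
    print(round(stop_and_wait, 5), round(pipelined, 5), pipelined / stop_and_wait)
```

Sending three packets per round trip triples the utilization, but the sender is still idle more than 99.9% of the time.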


Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unACKed pkts allowed
• ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
  • may receive duplicate ACKs (see receiver)
• only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• called a sliding-window protocol


GBN Sender

rdt_send() called: checks to see if the window is full.
• No: send out the packet
• Yes: return the data to the application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates the window accordingly and restarts the timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.
• This is the only event that triggers a resend


GBN sender: extended FSM

Initially: base = 1, nextseqnum = 1
State: Wait

Event: rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

Event: timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

Event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
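The sender FSM can be sketched as a small Python class. This is an illustration under several assumptions, not the slides' implementation: sequence numbers are unbounded integers rather than k-bit values, checksums are omitted, the timer is modelled as a boolean flag, and `udt_send` is a caller-supplied callback. The sketch also ignores ACKs outside the current window, a guard the FSM does not show.

```python
class GBNSender:
    """Go-Back-N sender sketch mirroring the FSM events."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send      # hypothetical channel callback
        self.base = 1                 # oldest unacknowledged seq #
        self.nextseqnum = 1           # next seq # to assign
        self.sndpkt = {}              # seq # -> packet kept for retransmission
        self.timer_running = False    # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum >= self.base + self.N:
            return False              # refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is received."""
        if not (self.base <= acknum < self.nextseqnum):
            return                    # assumed guard: ignore stale or bogus ACKs
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        # stop the timer if nothing is outstanding, otherwise restart it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])
```

For example, with `sent = []` and `s = GBNSender(3, sent.append)`, a fourth `rdt_send` call is refused until an ACK slides the window forward.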


GBN receiver: extended FSM

Initially: expectedseqnum = 1, sndpkt = make_pkt(0, ACK, chksum)
State: Wait

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

Event: default
    udt_send(sndpkt)

If the expected packet is received:
• send ACK and deliver the packet upstairs

If an out-of-order packet is received:
• discard it (don't buffer) -> no receiver buffering
• re-ACK the pkt with the highest in-order seq #; may generate duplicate ACKs


More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #.
• Receiver only sends ACKs (no NAKs)
• Can generate duplicate ACKs
• Need only remember expectedseqnum
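The receiver's behaviour is compact enough to sketch in a few lines. The version below is an illustration with assumed names: `udt_send` and `deliver_data` are caller-supplied callbacks, corruption is passed in as a flag rather than detected by a checksum, and sequence numbers are unbounded integers.

```python
class GBNReceiver:
    """Go-Back-N receiver sketch: no buffering, cumulative ACKs only."""

    def __init__(self, udt_send, deliver_data):
        self.udt_send = udt_send          # hypothetical channel callback
        self.deliver_data = deliver_data  # hypothetical upper-layer callback
        self.expectedseqnum = 1
        self.last_acked = 0               # matches the initial make_pkt(0, ACK, chksum)

    def rdt_rcv(self, seqnum, data, corrupt=False):
        if not corrupt and seqnum == self.expectedseqnum:
            # Expected packet: deliver it and advance.
            self.deliver_data(data)
            self.last_acked = self.expectedseqnum
            self.expectedseqnum += 1
        # In every case, (re-)ACK the highest in-order seq # received so far.
        self.udt_send(("ACK", self.last_acked))
```

Feeding it packets 1, 3, 2, 3 produces the ACKs 1, 1, 2, 3: the out-of-order packet 3 is discarded on first arrival and triggers a duplicate ACK for 1.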


GBN in action

GBN is easy to code but can perform poorly.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission only of the required packets.
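To get a feel for "huge amounts of data", the sketch below estimates how many packets a sender must keep in flight to fill the pipe. It assumes L/R = 8 microseconds and RTT = 30 ms, the values behind the earlier utilization example, and uses integer microseconds to avoid floating-point rounding. The function name is illustrative.

```python
def window_to_fill_pipe(l_over_r_us, rtt_us):
    """Smallest N with N * (L/R) >= RTT + L/R, i.e. 100% sender utilization."""
    total = rtt_us + l_over_r_us
    return -(-total // l_over_r_us)   # ceiling division on integers

if __name__ == "__main__":
    n = window_to_fill_pipe(8, 30_000)
    print(n)  # packets in flight when the pipe is full
```

Under these assumptions the window holds 3751 packets. If the oldest one is lost, a Go-Back-N timeout resends everything outstanding in that window, while Selective Repeat would resend only the one missing packet.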


Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which an ACK was not received
  • sender timer for each unACKed pkt
  • compare to GBN, which only had a timer for the base packet
• sender window
  • N consecutive seq #'s
  • again limits seq #'s of sent, unACKed pkts
  • Important: window size < seq # range


Selective repeat: sender, receiver windows


Selective repeat

sender

data from above:
• if next available seq # in window, send pkt

timeout(n):
• resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N]:
• mark pkt n as received
• if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
• ACK(n) (note: this is a re-ACK)

otherwise:
• ignore
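The receiver rules can be sketched as follows. This is a minimal illustration, assuming unbounded integer sequence numbers starting at 0 (so the wrap-around problem does not arise) and caller-supplied `send_ack` and `deliver_data` callbacks; none of these names come from the slides.

```python
class SRReceiver:
    """Selective Repeat receiver sketch: per-packet ACKs plus buffering."""

    def __init__(self, window_size, send_ack, deliver_data):
        self.N = window_size
        self.send_ack = send_ack          # hypothetical callback
        self.deliver_data = deliver_data  # hypothetical callback
        self.rcvbase = 0                  # lowest seq # not yet delivered
        self.buffer = {}                  # out-of-order pkts awaiting delivery

    def on_pkt(self, n, data):
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            # Inside the window: ACK it individually and buffer it.
            self.send_ack(n)
            self.buffer.setdefault(n, data)
            # Deliver any in-order run and advance the window.
            while self.rcvbase in self.buffer:
                self.deliver_data(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
        elif self.rcvbase - self.N <= n <= self.rcvbase - 1:
            # Already delivered: re-ACK so the sender can move on.
            self.send_ack(n)
        # Otherwise: ignore the packet.
```

With N = 4 and arrivals in the order 0, 2, 3, 1, packets 2 and 3 wait in the buffer until 1 arrives, at which point 1, 2 and 3 are delivered together.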


                                                                                    Selective repeat in action


Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3

• receiver sees no difference in the two scenarios
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
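One way to explore the question is to compare two windows directly: the sender's window if every ACK was lost (so it still retransmits the old packets), and the receiver's window after it has accepted a full window of packets. If the two share a sequence number, the receiver cannot tell a retransmission from new data. The sketch below checks for that overlap; the function name is illustrative, and the worst-case scenario it models is an assumption chosen to match the example.

```python
def sr_window_is_ambiguous(seq_space, window_size):
    """True if an old retransmission can be mistaken for a new packet."""
    # Sender never saw any ACK: its window is still seq #'s 0 .. N-1.
    sender_old = {n % seq_space for n in range(0, window_size)}
    # Receiver accepted 0 .. N-1: its window is now N .. 2N-1 (mod seq_space).
    receiver_new = {n % seq_space for n in range(window_size, 2 * window_size)}
    return bool(sender_old & receiver_new)

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, sr_window_is_ambiguous(4, n))
```

With four sequence numbers, a window of 3 is ambiguous and a window of 2 is not, which matches the usual rule that the window size must be at most half the sequence-number space.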


Chapter 3 outline

• 3.1 Transport-layer services
• 3.2 Multiplexing and demultiplexing
• 3.3 Connectionless transport: UDP
• 3.4 Principles of reliable data transfer
• 3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
• 3.6 Principles of congestion control
• 3.7 TCP congestion control

                                                                                    3 Transport Layer 58Comp 361 Spring 2005

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
    one sender, one receiver
reliable, in-order byte stream:
    no "message boundaries"
pipelined:
    TCP congestion and flow control set window size
send & receive buffers
full duplex data:
    bi-directional data flow in same connection
    MSS: maximum segment size
connection-oriented:
    handshaking (exchange of control msgs) init's sender, receiver state before data exchange
flow controlled:
    sender will not overwhelm receiver

[Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads data through its socket door]


More TCP Details

Maximum Segment Size (MSS)
    Depends upon implementation (can often be set)
    The maximum amount of application-layer data in a segment
    Application data + TCP header = TCP segment

Three-way handshake
    Client sends special TCP segment to server requesting connection. No payload (application data) in this segment
    Server responds with second special TCP segment (again no payload)
    Client responds with third special segment. This can contain payload


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
    (i) buffers
    (ii) variables and
    (iii) a socket connection to the process

TCP exists only in the two end machines
    No buffers or variables are allocated to the connection in any of the network elements between the client and server


TCP segment structure

Segment layout, each row 32 bits wide, top to bottom:
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
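To make the layout concrete, here is a minimal sketch that packs and unpacks the 20-byte fixed part of the header with Python's `struct` module. It assumes no options (head len = 5 words) and leaves the checksum at 0 rather than computing it; the helper names `pack_header` and `unpack_header` are made up for this illustration.

```python
import struct

# Network byte order: two 16-bit ports, 32-bit seq, 32-bit ack,
# 8 bits (head len + unused), 8 bits (unused + flags),
# 16-bit receive window, 16-bit checksum, 16-bit urgent data pointer.
HEADER_FMT = "!HHIIBBHHH"

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def pack_header(src_port, dst_port, seq, ack, flags, rcv_window,
                checksum=0, urg_ptr=0):
    """Build a 20-byte TCP header (assumption: no options, checksum not computed)."""
    flag_byte = 0
    for name in flags:
        flag_byte |= FLAG_BITS[name]
    head_len_byte = 5 << 4  # header length of 5 32-bit words in the upper 4 bits
    return struct.pack(HEADER_FMT, src_port, dst_port, seq, ack,
                       head_len_byte, flag_byte, rcv_window, checksum, urg_ptr)


def unpack_header(raw):
    """Decode the fixed 20-byte part of a TCP header into a dict."""
    (src_port, dst_port, seq, ack, head_len_byte, flag_byte,
     rcv_window, checksum, urg_ptr) = struct.unpack(HEADER_FMT, raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "head_len_words": head_len_byte >> 4,
        "flags": sorted(n for n, bit in FLAG_BITS.items() if flag_byte & bit),
        "rcv_window": rcv_window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


header = pack_header(1234, 23, seq=42, ack=79,
                     flags=["ACK", "PSH"], rcv_window=4096)
fields = unpack_header(header)
print(len(header), fields["seq"], fields["ack"], fields["flags"])
```

The port numbers and window value in the usage lines are arbitrary example values.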


TCP seq. #'s and ACKs

Seq. #'s:
    byte stream "number" of first byte in segment's data
ACKs:
    seq # of next byte expected from other side
    cumulative ACK
Q: how does the receiver handle out-of-order segments?
    A: TCP spec doesn't say; it is up to the implementer

[Figure: simple telnet scenario between Host A and Host B, time running downward]
    User types 'C': A -> B: Seq=42, ACK=79, data = 'C'
    Host B ACKs receipt of 'C', echoes back 'C': B -> A: Seq=79, ACK=43, data = 'C'
    Host A ACKs receipt of echoed 'C': A -> B: Seq=43, ACK=80
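The telnet exchange can be replayed with a few lines of code. This is only a sketch of the byte-counting rule: the `Endpoint` class and its field names are made up for illustration, segments are modeled as plain tuples, and only in-order arrival is handled.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    next_seq: int       # seq # of the next byte this host will send
    next_expected: int  # seq # of the next byte expected from the other side

    def send(self, data: bytes):
        """Return (seq, ack, data) and advance our own seq # by len(data)."""
        segment = (self.next_seq, self.next_expected, data)
        self.next_seq += len(data)
        return segment

    def receive(self, segment):
        """Accept an in-order segment and advance the cumulative ACK point."""
        seq, _ack, data = segment
        if seq == self.next_expected:
            self.next_expected += len(data)


host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

seg1 = host_a.send(b"C")   # user types 'C'
host_b.receive(seg1)
seg2 = host_b.send(b"C")   # B ACKs 'C' and echoes it back
host_a.receive(seg2)
seg3 = host_a.send(b"")    # A ACKs the echoed 'C' (no data)

for seq, ack, data in (seg1, seg2, seg3):
    print(f"Seq={seq} ACK={ack} data={data!r}")
```

The three printed segments carry the same Seq and ACK values as the figure: (42, 79), (79, 43), (43, 80).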


TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
    longer than RTT
        but RTT varies
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss

Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
    SampleRTT will vary, want estimated RTT "smoother"
        average several recent measurements, not just current SampleRTT


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

    Exponential weighted moving average
    influence of past sample decreases exponentially fast
    typical value: α = 0.125


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT]


TCP Round Trip Time and Timeout

Setting the timeout
    EstimatedRTT plus "safety margin"
        large variation in EstimatedRTT -> larger safety margin
    first estimate of how much SampleRTT deviates from EstimatedRTT:

        DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

        (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
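The three formulas can be combined into a small estimator. This is a sketch, not a reference implementation. The slides do not say how to initialize the state or in which order to apply the updates, so two assumptions are borrowed from RFC 6298: the first sample seeds EstimatedRTT with DevRTT set to half of it, and DevRTT is updated before EstimatedRTT. The class name `RttEstimator` is made up.

```python
class RttEstimator:
    """EWMA-based RTT estimator following the slide formulas."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption (RFC 6298 style): seed from the first measurement.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: DevRTT uses the EstimatedRTT from before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
timeout = est.update(200.0)              # one unusually slow sample
print(est.estimated_rtt, est.dev_rtt, timeout)
```

Worked by hand for the sample of 200 ms: DevRTT = 0.75 * 50 + 0.25 * 100 = 62.5, EstimatedRTT = 0.875 * 100 + 0.125 * 200 = 112.5, and TimeoutInterval = 112.5 + 4 * 62.5 = 362.5.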


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate acks

Initially consider simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control


TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

Ack rcvd:
    If it acknowledges previously unacked segments
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch(event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }

    } /* end of loop forever */

Comment:
    SendBase-1: last cumulatively ack'ed byte
Example:
    SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
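The pseudocode above can be turned into a runnable sketch. Several simplifications are assumptions of this illustration and not part of the slides: the timer is a boolean flag rather than a real clock, "pass segment to IP" appends to a list, sequence numbers never wrap around, and the class name `SimplifiedTcpSender` is made up.

```python
class SimplifiedTcpSender:
    """Event handlers of the simplified TCP sender (no dup ACKs, no flow control)."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # seq # -> data of not-yet-acknowledged segments
        self.timer_running = False  # stand-in for the single retransmission timer
        self.sent = []              # log of (seq, data) "passed to IP"

    def on_app_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True          # start timer
        self.sent.append((seq, data))          # pass segment to IP
        self.next_seq += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)            # smallest unacked sequence number
            self.sent.append((seq, self.unacked[seq]))
            self.timer_running = True          # restart timer

    def on_ack(self, y):
        if y > self.send_base:                 # ACK covers new data
            self.send_base = y
            # Drop every segment whose last byte is below y (cumulative ACK).
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Keep the timer only while segments are still outstanding.
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq=92)
sender.on_app_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)   # Seq=100, 20 bytes
sender.on_timeout()             # resends the oldest unacked segment (Seq=92)
sender.on_ack(120)              # cumulative ACK for both segments
print(sender.send_base, sender.next_seq, sender.timer_running)
```

After the cumulative ACK of 120, SendBase equals NextSeqNum, nothing is outstanding, and the timer stays off.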


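TCP retransmission scenarios

[Figure: two timelines between Host A and Host B, time running downward]

lost ACK scenario:
    A sends Seq=92, 8 bytes data
    B replies ACK=100, which is lost (X)
    Seq=92 timer times out; A resends Seq=92, 8 bytes data
    B replies ACK=100; SendBase = 100

premature timeout:
    A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    B replies ACK=100, then ACK=120
    Seq=92 timer times out too early; A resends Seq=92, 8 bytes data
    B replies ACK=120 to the resent segment
    SendBase values shown at A: 100, 120, 120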


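TCP retransmission scenarios (more)

[Figure: timeline between Host A and Host B, time running downward]

Cumulative ACK scenario:
    A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    B replies ACK=100, which is lost (X), then ACK=120
    ACK=120 arrives before the timeout; SendBase = 120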


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
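The four rules can be sketched as a small decision function. This is an illustration under assumptions, not an implementation of the RFCs: the 500 ms timer is not modeled (only the decision to delay is reported), out-of-order segments are buffered by starting seq #, and the handling of old duplicates (seq # below the expected one) is an added assumption because the table does not cover it. The class name `AckPolicyReceiver` is made up.

```python
class AckPolicyReceiver:
    """Decide which ACK action the receiver takes for each arriving segment."""

    def __init__(self, expected):
        self.expected = expected   # seq # of next in-order byte expected
        self.ack_pending = False   # True while a delayed ACK is waiting
        self.buffered = {}         # out-of-order segments: seq # -> length

    def on_segment(self, seq, length):
        """Return (action, ack_number) for a segment of `length` bytes at `seq`."""
        if seq > self.expected:
            # Gap detected: buffer the segment and send a duplicate ACK.
            self.buffered[seq] = length
            self.ack_pending = False
            return ("duplicate ACK", self.expected)
        if seq < self.expected:
            # Assumption: an old duplicate is simply re-ACKed.
            return ("duplicate ACK", self.expected)

        # Segment starts exactly at the expected byte.
        filled_gap = bool(self.buffered)
        self.expected += length
        while self.expected in self.buffered:   # absorb buffered segments
            self.expected += self.buffered.pop(self.expected)

        if filled_gap:
            self.ack_pending = False
            return ("immediate ACK", self.expected)
        if self.ack_pending:
            self.ack_pending = False
            return ("immediate cumulative ACK", self.expected)
        self.ack_pending = True
        return ("delayed ACK", self.expected)


rcvr = AckPolicyReceiver(expected=92)
actions = [
    rcvr.on_segment(92, 8),     # in order, nothing pending
    rcvr.on_segment(100, 20),   # in order, one ACK pending
    rcvr.on_segment(140, 20),   # out of order, gap at 120
    rcvr.on_segment(120, 20),   # fills the gap from its lower end
]
for action in actions:
    print(action)
```

The four example segments trigger the four table rows in order; the segment sizes and starting seq # are arbitrary example values.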


More on Sender Policies

Doubling the Timeout Interval
    Used by most TCP implementations
    If a timeout occurs, then after retransmission the Timeout Interval is doubled
    Intervals grow exponentially with each consecutive timeout
    When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    Limited form of congestion control
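A short sketch of the doubling rule, assuming the base value is whatever TimeoutInterval was computed from EstimatedRTT and DevRTT. The function name `backoff_intervals` is made up, and real implementations typically cap the interval at some maximum, which is not modeled here.

```python
def backoff_intervals(base_timeout, consecutive_timeouts):
    """Timeout interval in effect before and after each consecutive timeout."""
    return [base_timeout * 2 ** k for k in range(consecutive_timeouts + 1)]


# Base interval of 1.0 s followed by three consecutive timeouts.
print(backoff_intervals(1.0, 3))
```

Once new data is sent or an ACK arrives, the interval would drop back to the base value.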


Fast Retransmit

Time-out period often relatively long:
    long delay before resending lost packet
Detect lost segments via duplicate ACKs
    Sender often sends many segments back-to-back
    If segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    fast retransmit: resend segment before timer expires


Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {    /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y    /* fast retransmit */
            }
        }
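The ACK handler above can be sketched in runnable form. Assumptions of this illustration: the timer is left out, retransmissions are recorded in a list instead of being sent, the duplicate counter is reset whenever new data is ACKed, and the class name `FastRetransmitSender` is made up.

```python
class FastRetransmitSender:
    """ACK handling with a duplicate-ACK counter that triggers fast retransmit."""

    def __init__(self, send_base, segments):
        self.send_base = send_base
        self.segments = dict(segments)  # seq # -> data, all not yet acknowledged
        self.dup_acks = 0               # dup ACKs seen for the current send_base
        self.retransmitted = []         # seq #'s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data is acknowledged: advance and forget covered segments.
            self.send_base = y
            self.dup_acks = 0
            self.segments = {s: d for s, d in self.segments.items() if s >= y}
        else:
            # Duplicate ACK for already ACKed data.
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.segments:
                self.retransmitted.append(y)   # fast retransmit of segment y


# Five back-to-back 20-byte segments; the one starting at 112 is lost.
sender = FastRetransmitSender(
    send_base=92,
    segments={92: b"a" * 20, 112: b"b" * 20, 132: b"c" * 20,
              152: b"d" * 20, 172: b"e" * 20},
)
sender.on_ack(112)   # first segment ACKed
sender.on_ack(112)   # dup ACK 1 (caused by segment 132)
sender.on_ack(112)   # dup ACK 2 (caused by segment 152)
before_third = list(sender.retransmitted)
sender.on_ack(112)   # dup ACK 3 (caused by segment 172)
print(before_third, sender.retransmitted)
```

Nothing is resent after two duplicates; the third duplicate ACK triggers the resend of the segment starting at byte 112, without waiting for the timer.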


TCP: GBN or Selective Repeat?

                                                                                    Basic TCP looks a lot like GBN

                                                                                    Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                    This looks a lot like Selective Repeat

                                                                                    TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
    app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Field notes:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
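As a rough illustration of the layout above, the following Python sketch decodes the fixed 20-byte part of a TCP header. The function name `parse_tcp_header` and the sample values are made up for this example; the field order and flag bit positions follow the standard TCP header.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 .. bit 5


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (options are skipped, not parsed)."""
    (src_port, dst_port, seq, ack,
     offset_flags, rcv_window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4   # head len is counted in 32-bit words
    flag_bits = offset_flags & 0x3F         # low six bits: U A P R S F
    flags = {name for i, name in enumerate(FLAG_NAMES) if flag_bits & (1 << i)}
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": flags,
        "rcv_window": rcv_window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "data": segment[header_len:],       # application data follows the header
    }


# Hypothetical SYNACK-style segment: head len = 5 words, SYN and ACK bits set.
example_segment = struct.pack(
    "!HHIIHHHH", 80, 5000, 300, 101, (5 << 12) | 0x12, 4096, 0, 0) + b"hi"
parsed = parse_tcp_header(example_segment)
```

Here `parsed["flags"]` contains SYN and ACK, and `parsed["data"]` holds the two payload bytes that follow the 20-byte header.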

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)
spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
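The window arithmetic above can be sketched in a few lines of Python. The function names and the sample byte counts are assumptions chosen for illustration; only the formula comes from the slide.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent: int, last_byte_acked: int,
                    window: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps unACKed data within the window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= window


# Made-up numbers: a 4096-byte buffer holding 2000 unread bytes.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
```

With these numbers the advertised window is 2096 bytes, so a sender with 500 bytes already in flight could send 1000 more but not 2000 more.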

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0 since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
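A toy model of this probing behaviour, under simplifying assumptions: time advances in steps, one probe is sent per step, and the hypothetical `app_read_schedule` says how many bytes the receiving application reads before each probe arrives. None of these names come from the TCP specification.

```python
def probes_until_window_opens(rcv_buffer, buffered, app_read_schedule):
    """Count one-byte probes sent before an ACK advertises RcvWindow > 0.

    Returns (number_of_probes, advertised_window). The window reported is
    the spare room seen when the probe arrives.
    """
    probes = 0
    for bytes_read in app_read_schedule:
        buffered = max(0, buffered - bytes_read)  # application drains the buffer
        probes += 1                               # sender emits a one-byte probe
        window = rcv_buffer - buffered            # receiver ACKs with current window
        if window > 0:
            return probes, window                 # sender may resume normal sending
    return probes, 0                              # window never opened in this run
```

Without the probes, the sender in this model would never learn that the third read freed up space, which is exactly the deadlock described above.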

Note on UDP

UDP has no flow control.
UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:
Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that client is synchronized

Message sequence:
  client to server: Connection request (SYN=1, seq=client_isn)
  server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
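The numbering in the three-way handshake can be checked with a small sketch. The dictionary layout and the function name `three_way_handshake` are assumptions for illustration; sequence numbers are taken modulo 2^32 because the header field is 32 bits wide.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three control segments exchanged during connection setup."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn, "ack": None}
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": (syn["seq"] + 1) % SEQ_SPACE}
    ack = {"from": "client", "SYN": 0,
           "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (synack["seq"] + 1) % SEQ_SPACE}
    return [syn, synack, ack]


# Made-up initial sequence numbers.
segments = three_way_handshake(client_isn=100, server_isn=300)
```

For these values the SYNACK carries ack=101, and the final ACK carries seq=101 and ack=301 with SYN=0.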

TCP Connection Management (cont.)

Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK, enters timed wait, and is then closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed wait
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline. Client sends FIN (closing); server replies with ACK, then sends FIN (closing); client replies with ACK and enters timed wait; server is closed on receiving the ACK; client is closed after the timed wait.]

TCP Connection Management (cont.)

[Figure: Example TCP client lifecycle]
[Figure: Example TCP server lifecycle]


                                                                                    A few special cases

                                                                                    Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                    It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2
large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

Notation: λin is the original sending rate, λ'in is the offered load (original data plus retransmissions), λout is the goodput.

(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b) and (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when packet dropped, any upstream transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: a window of Congwin segments]

w segments, each with MSS bytes, sent in one RTT:
  throughput = (w × MSS) / RTT bytes/sec
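As a quick numerical check of the throughput formula, here is a minimal sketch. The function name and the sample values are assumptions for illustration.

```python
def window_throughput(w_segments: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Bytes per second when w segments of MSS bytes are sent once per RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Made-up example: 10 segments of 1000 bytes with a 0.5 s RTT.
rate = window_throughput(w_segments=10, mss_bytes=1000, rtt_seconds=0.5)
```

Doubling the window or halving the RTT doubles the rate, which is why CongWin is the knob TCP turns to control its sending rate.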

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using
  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
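The sawtooth can be reproduced with a short simulation. This is a sketch under simplifying assumptions: the window is counted in whole MSS units, halving uses integer division, and the set of RTT rounds in which a loss occurs (`loss_rounds`) is supplied by hand rather than produced by a network model.

```python
def aimd_trace(start_mss: int, loss_rounds: set, rounds: int) -> list:
    """CongWin (in MSS) at the start of each RTT under AIMD."""
    congwin = start_mss
    trace = []
    for rtt in range(rounds):
        trace.append(congwin)
        if rtt in loss_rounds:
            congwin = max(1, congwin // 2)   # multiplicative decrease
        else:
            congwin += 1                     # additive increase: +1 MSS per RTT
    return trace


# Made-up scenario: start at 8 MSS, one loss during the fourth RTT.
trace = aimd_trace(start_mss=8, loss_rounds={3}, rounds=6)
```

The trace climbs by one MSS per RTT, drops to half after the loss round, and then resumes its linear climb.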

TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate
When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received
Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. Host A sends one segment in the first RTT, two segments in the second, and four segments in the third.]
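The following sketch shows why adding one MSS per ACK doubles the window every RTT, and reproduces the 20 kbps initial rate for MSS = 500 bytes and RTT = 200 msec. It assumes no loss and one ACK per segment; the function names are made up for this example.

```python
def slow_start_windows(rtts: int) -> list:
    """CongWin (in MSS) at the start of each RTT, assuming no loss."""
    congwin = 1
    windows = []
    for _ in range(rtts):
        windows.append(congwin)
        acks_this_rtt = congwin        # one ACK per segment sent in this RTT
        for _ in range(acks_this_rtt):
            congwin += 1               # CongWin grows by 1 MSS for every ACK
    return windows


def rate_bps(congwin_mss: int, mss_bytes: int, rtt_ms: int) -> float:
    """Sending rate in bits per second for a given window."""
    return congwin_mss * mss_bytes * 8 * 1000 / rtt_ms


windows = slow_start_windows(4)
initial_rate = rate_bps(congwin_mss=1, mss_bytes=500, rtt_ms=200)
```

Each RTT the sender receives as many ACKs as it sent segments, so the per-ACK increment adds up to a doubling, matching the one, two, four segments pattern.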

So Far
Slow-Start ramps up exponentially
Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
Two different types of loss events:
  3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase, window grows exponentially.
When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.
When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold (only in TCP Reno).
When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                    The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; If (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; Set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; Set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
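The event table above maps naturally onto a small state machine. The sketch below is one possible reading of it, not a full TCP implementation: the class name `RenoSender`, the default threshold, and counting the window in MSS units are all assumptions for illustration, and retransmission itself is not modelled.

```python
class RenoSender:
    """Toy model of the sender-side congestion control table (Reno-style)."""

    def __init__(self, mss: float = 1.0, threshold: float = 64.0):
        self.mss = mss
        self.congwin = mss          # connection starts with CongWin = 1 MSS
        self.threshold = threshold  # assumed starting value, "initially very large"
        self.state = "SS"           # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += self.mss * (self.mss / self.congwin)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and restart from slow start."""
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

A typical use is to create a sender, feed it a sequence of `on_new_ack`, `on_duplicate_ack` and `on_timeout` calls, and watch how `congwin`, `threshold` and `state` evolve.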

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2 RTT)
  Average throughput: 0.75 W/RTT
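The 0.75 factor follows because the window ramps linearly from W/2 back up to W between losses, so its time average is the midpoint of the two. A small numerical check, with a made-up function name and sample count:

```python
def average_window(w: float, steps: int = 1000) -> float:
    """Average of a window that ramps linearly from w/2 up to w."""
    samples = [w / 2 + (w / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


avg = average_window(100.0)
```

Dividing the average window by RTT gives the average throughput, 0.75 W/RTT.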

                                                                                    3 Transport Layer 114Comp 361 Spring 2005

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

Solving for the loss rate gives L = 2·10^-10. Wow!
New versions of TCP for high-speed needed.
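The two numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes the throughput relation 1.22·MSS/(RTT·√L) and simply inverts it for L; the function names are illustrative.

```python
def required_window(rate_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """In-flight segments needed to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(rate_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


window = required_window(10e9, 0.1, 1500)      # roughly 83,333 segments
loss = required_loss_rate(10e9, 0.1, 1500)     # roughly 2e-10
print(window, loss)
```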


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 pass through a bottleneck router of capacity R.]


Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to R, with the equal-bandwidth-share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
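The convergence argument can be sketched as a small simulation. It assumes two synchronized flows that both add 1 per round and both halve whenever their combined rate exceeds the capacity; the function name and starting rates are chosen only for illustration.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int):
    """Run additive-increase / multiplicative-decrease for two flows."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows decrease their window by a factor of 2.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both flows increase by 1 (slope 1).
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: 80 versus 10 on a link of capacity 100.
print(aimd_two_flows(80.0, 10.0, 100.0, 1000))
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the rates drift toward the equal-share line.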


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
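The parallel-connection example is simple arithmetic if one assumes that every TCP connection on the link ends up with an equal share of R. The helper below is a hypothetical illustration of that assumption.

```python
from fractions import Fraction


def app_share(existing_connections: int, new_connections: int) -> Fraction:
    """Fraction of the link rate R obtained by the new app, assuming every
    connection on the bottleneck receives an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # share of R with one connection
print(app_share(9, 11))   # share of R with eleven connections
```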


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
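The two cases can be folded into one function. This is a sketch under stated assumptions: K is taken to be the number of windows that cover the object, computed here as ceil(O / (W·S)), which the slide does not spell out, and the parameter values in the usage lines are made up.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for an object of O bits with a fixed window of W segments.

    O and S are in bits, R in bits/second, RTT in seconds.
    """
    base = 2 * RTT + O / R               # connection setup + request, then transmission
    window_time = W * S / R              # time to send one full window
    if window_time >= RTT + S / R:
        return base                      # case 1: ACKs return in time, no stalls
    K = math.ceil(O / (W * S))           # assumption: windows needed to cover the object
    stall = S / R + RTT - window_time    # idle time after each of the first K-1 windows
    return base + (K - 1) * stall


# Hypothetical numbers: 10 segments of 8000 bits over a 1 Mbps link, RTT 100 ms.
print(fixed_window_latency(80_000, 8_000, 1e6, 0.1, W=4))    # case 2, server stalls
print(fixed_window_latency(80_000, 8_000, 1e6, 0.1, W=20))   # case 1, no stalls
```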


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
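The closed form can be evaluated directly once K and Q are known. In the sketch below both are found by simple counting loops rather than logarithms; the function name, the choice to count only strictly positive stalls for Q, and the numbers in the usage line (picked so that O/S = 15 and RTT = 2·S/R) are assumptions for illustration.

```python
def slow_start_latency(O: float, S: float, R: float, RTT: float):
    """Return (latency, K, Q, P) for the slow-start latency model.

    O and S are in bits, R in bits/second, RTT in seconds.
    """
    # K: smallest k such that windows of 1, 2, ..., 2^(k-1) segments cover O.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would stall if the object
    # were infinite; the stall after window k is S/R + RTT - 2^(k-1) * S/R.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# Hypothetical numbers: 15 segments of 1000 bits, R = 10 kbps, RTT = 0.2 s.
print(slow_start_latency(15_000, 1_000, 10_000, 0.2))
```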


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. The client initiates the TCP connection and requests the object (one RTT each); the server sends the first window (S/R), second window (2S/R), third window (4S/R), and fourth window (8S/R); transmission completes and the object is delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times


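TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: the same timing diagram as on the previous slide, with time at client and time at server, showing connection initiation, the object request, the first (S/R), second (2S/R), third (4S/R), and fourth (8S/R) windows, and the completed transmission.]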
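The idle-time sum can also be checked by walking through the transfer window by window instead of using the closed form. The following is a sketch of that idea under the model's assumptions (no loss, window doubling after every window); the function name and the sample numbers are chosen for illustration.

```python
def simulate_slow_start(O: float, S: float, R: float, RTT: float) -> float:
    """Accumulate the latency of the slow-start model window by window.

    O and S are in bits, R in bits/second, RTT in seconds.
    """
    clock = 2 * RTT          # connection establishment plus the request
    remaining = O            # bits still to be sent
    window = 1               # current window, in segments
    while remaining > 0:
        bits = min(window * S, remaining)
        tx_time = bits / R
        remaining -= bits
        if remaining > 0:
            # The next window can start once this one is fully sent AND the
            # ACK for its first segment is back (S/R + RTT after it started).
            clock += max(tx_time, S / R + RTT)
        else:
            clock += tx_time  # last window: no further waiting is counted
        window *= 2
    return clock


# Hypothetical numbers: 15 segments of 1000 bits, R = 10 kbps, RTT = 0.2 s.
print(simulate_slow_start(15_000, 1_000, 10_000, 0.2))
```

With these numbers the server stalls after the first and second windows only, which is the P = 2 case.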


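TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, the number of idles for an infinite-size object, is similar.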
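A sketch of the corresponding calculation for Q, following the same pattern as for K. It assumes the idle time after the kth window is S/R + RTT − 2^(k−1)·S/R and counts a stall of length zero as a stall, which leaves the latency unchanged:

$$\begin{aligned}
Q &= \max\left\{k : RTT + \frac{S}{R} - 2^{k-1}\frac{S}{R} \ge 0\right\} \\
&= \max\left\{k : 2^{k-1} \le 1 + \frac{RTT}{S/R}\right\} \\
&= \max\left\{k : k \le \log_2\left(1 + \frac{RTT}{S/R}\right) + 1\right\} \\
&= \left\lfloor \log_2\left(1 + \frac{RTT}{S/R}\right) \right\rfloor + 1
\end{aligned}$$

For instance, with RTT = 2·S/R this gives Q = ⌊log₂ 3⌋ + 1 = 2.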


HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X is an integer.
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
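The three response-time formulas can be compared side by side. The sketch below leaves out the "sum of idle times" term, so it gives only the transmission and round-trip parts; the function name is illustrative, and the sample object size assumes 1 Kbyte = 1000 bytes.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int) -> dict:
    """Response times (seconds) for the three HTTP variants, with the
    slow-start idle times omitted. O is in bits, R in bits/second."""
    transmit = (M + 1) * O / R   # base page plus M images, each of O bits
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }


# O = 5 Kbytes (taken as 40,000 bits), RTT = 100 ms, M = 10, X = 5.
for rate in (28e3, 100e3, 1e6, 10e6):
    print(rate, http_response_times(40_000, rate, 0.1, M=10, X=5))
```

At low link rates the shared transmission term (M+1)O/R dominates all three variants, while at high rates the RTT terms separate them.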


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

                                                                                    • Chapter 3 Transport Layer last revised 160305
                                                                                    • Chapter 3 outline
                                                                                    • Transport services and protocols
                                                                                    • Transport vs network layer
                                                                                    • Transport-layer protocols
                                                                                    • Chapter 3 outline
                                                                                    • Multiplexingdemultiplexing
                                                                                    • Multiplexingdemultiplexing
                                                                                    • How demultiplexing works
                                                                                    • Connectionless demultiplexing
                                                                                    • Connectionless demux (cont)
                                                                                    • Connection-oriented demux
                                                                                    • Connection-oriented demux (cont)
                                                                                    • Connection-oriented demux Threaded Web Server
                                                                                    • Chapter 3 outline
                                                                                    • UDP User Datagram Protocol [RFC 768]
                                                                                    • UDP more
                                                                                    • UDP checksum
                                                                                    • Chapter 3 outline
                                                                                    • Principles of Reliable data transfer
                                                                                    • Reliable data transfer getting started
                                                                                    • Reliable data transfer getting started
                                                                                    • Incremental Improvements
                                                                                    • Rdt10 reliable transfer over a reliable channel
                                                                                    • Rdt20 channel with bit errors
                                                                                    • rdt20 FSM specification
                                                                                    • rdt20 operation with no errors
                                                                                    • rdt20 error scenario
                                                                                    • rdt20 has a fatal flaw
                                                                                    • rdt21 sender handles garbled ACKNAKs
                                                                                    • rdt21 receiver handles garbled ACKNAKs
                                                                                    • rdt21 discussion
                                                                                    • rdt22 a NAK-free protocol
                                                                                    • rdt22 sender receiver fragments
                                                                                    • rdt30 channels with errors and loss
                                                                                    • rdt30 sender
                                                                                    • rdt30 in action
                                                                                    • rdt30 in action
                                                                                    • Performance of rdt30
                                                                                    • rdt30 stop-and-wait operation
                                                                                    • Pipelined protocols
                                                                                    • Pipelined protocols
                                                                                    • Pipelining increased utilization
                                                                                    • Go-Back-N
                                                                                    • GBN Sender
                                                                                    • GBN sender extended FSM
                                                                                    • GBN receiver extended FSM
                                                                                    • More on receiver
                                                                                    • GBN inaction
                                                                                    • Selective Repeat
                                                                                    • Selective repeat sender receiver windows
                                                                                    • Selective repeat
                                                                                    • Selective repeat in action
                                                                                    • Selective repeat dilemma
                                                                                    • Chapter 3 outline
                                                                                    • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                    • More TCP Details
                                                                                    • Even More TCP Details
                                                                                    • TCP segment structure
                                                                                    • TCP seq rsquos and ACKs
                                                                                    • TCP Round Trip Time and Timeout
                                                                                    • TCP Round Trip Time and Timeout
                                                                                    • Example RTT estimation
                                                                                    • TCP Round Trip Time and Timeout
                                                                                    • Chapter 3 outline
                                                                                    • TCP reliable data transfer
                                                                                    • TCP sender events
                                                                                    • TCP sender(simplified)
                                                                                    • TCP retransmission scenarios
                                                                                    • TCP retransmission scenarios (more)
                                                                                    • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                    • More on Sender Policies
                                                                                    • Fast Retransmit
                                                                                    • Fast retransmit algorithm
                                                                                    • TCP GBN or Selective Repeat
                                                                                    • Chapter 3 outline
                                                                                    • TCP Flow Control
                                                                                    • TCP Flow Control
                                                                                    • TCP segment structure
                                                                                    • TCP Flow control how it works
                                                                                    • Technical Issue
                                                                                    • Chapter 3 outline
                                                                                    • TCP Connection Management
                                                                                    • TCP Connection Management (cont)
                                                                                    • TCP Connection Management (cont)
                                                                                    • TCP Connection Management (cont)
                                                                                    • TCP Connection Management (cont)
                                                                                    • A few special cases
                                                                                    • Chapter 3 outline
                                                                                    • Principles of Congestion Control
                                                                                    • Causes/costs of congestion: scenario 1
                                                                                    • Causes/costs of congestion: scenario 2
                                                                                    • Causes/costs of congestion: scenario 3
                                                                                    • Causes/costs of congestion: scenario 3
                                                                                    • Approaches towards congestion control
                                                                                    • Case study ATM ABR congestion control
                                                                                    • Case study ATM ABR congestion control
                                                                                    • Chapter 3 outline
                                                                                    • TCP Congestion Control
                                                                                    • TCP AIMD
                                                                                    • TCP Slow Start
                                                                                    • TCP Slow Start (more)
                                                                                    • Summary TCP Congestion Control
                                                                                    • The Big Picture
                                                                                    • TCP sender congestion control
                                                                                    • TCP throughput
                                                                                    • TCP Futures
                                                                                    • TCP Fairness
                                                                                    • Why is TCP fair
                                                                                    • Fairness (more)
                                                                                    • TCP Latency Modeling
                                                                                    • Fixed Congestion Window (W)
                                                                                    • Fixed congestion window (1)
                                                                                    • Fixed congestion window (2)
                                                                                    • TCP Latency Modeling Slow Start (1)
                                                                                    • TCP Latency Modeling Slow Start (2)
                                                                                    • TCP Latency Modeling (3)
                                                                                    • TCP Latency Modeling (4)
                                                                                    • HTTP Modeling
                                                                                    • Chapter 3 Summary

                                                                                      Pipelined protocols

Advantage: much better bandwidth utilization than stop-and-wait

Disadvantage: more complicated to deal with reliability issues, e.g., corrupted, lost, or out-of-order data

Two generic approaches to solving this:
• Go-Back-N protocols
• selective repeat protocols

Note: TCP is not exactly either

Pipelining: increased utilization

[Figure: sender/receiver timeline with three pipelined packets. Sender: first packet bit transmitted, t = 0; last bit transmitted, t = L/R; ACK arrives, send next packet, t = RTT + L/R. Receiver: first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK.]

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$

Increases utilization by a factor of 3
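
The utilization arithmetic can be checked with a short Python sketch. The packet size (8000 bits), link rate (1 Gbps), and RTT (30 ms) are assumptions, chosen so that L/R = 0.008 ms and RTT = 30 ms, which reproduces the slide's result; the function name is hypothetical.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy when `window` packets are sent per RTT."""
    transmit_time = packet_bits / rate_bps            # L / R, in seconds
    utilization = window * transmit_time / (rtt_s + transmit_time)
    return min(utilization, 1.0)                      # cannot exceed a fully busy link


# Assumed example values: 8000-bit packets, 1 Gbps link, 30 ms RTT.
L_BITS, R_BPS, RTT_S = 8000, 1e9, 0.030

stop_and_wait = sender_utilization(L_BITS, R_BPS, RTT_S, window=1)
pipelined = sender_utilization(L_BITS, R_BPS, RTT_S, window=3)
print(f"stop-and-wait: {stop_and_wait:.5f}, 3-packet pipeline: {pipelined:.5f}")
```

With these numbers the sender is still idle almost all the time, which is why real windows are much larger than 3.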

Go-Back-N: Sender

• k-bit seq # in pkt header
• "window" of up to N consecutive unACKed pkts allowed
• ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK"); may receive duplicate ACKs (see receiver)
• Only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• Called a sliding-window protocol

                                                                                      GBN Sender

rdt_send() called: checks whether the window is full
• No: send out packet
• Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged

This is the only event that triggers a resend

GBN: sender extended FSM

Initial: base = 1, nextseqnum = 1
State: Wait

rdt_send(data):
    if (nextseqnum < base+N)
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    else
        refuse_data(data)

timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ (no action)
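
The sender FSM above can be sketched in Python as follows. This is an illustration under stated assumptions, not a full implementation: the class name, the `udt_send` callback, and the boolean `timer_running` flag (standing in for a real timer) are hypothetical, sequence numbers never wrap, and stale ACKs are ignored rather than moving `base` backwards.

```python
class GBNSender:
    """Go-Back-N sender sketch (sequence numbers start at 1 and do not wrap)."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send       # hypothetical callback: puts a packet on the channel
        self.base = 1                  # oldest unacknowledged sequence number
        self.nextseqnum = 1            # next sequence number to assign
        self.sndpkt = {}               # seq -> packet, kept for retransmission
        self.timer_running = False     # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        """Called by the application; returns False when the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False               # refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True  # first packet in flight starts the timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum was received."""
        if acknum < self.base:
            return                     # assumption: ignore stale or duplicate ACKs
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that was sent but not yet acknowledged."""
        if self.base == self.nextseqnum:
            return                     # nothing outstanding
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])


sent = []
sender = GBNSender(window_size=3, udt_send=sent.append)
accepted = [sender.rdt_send(ch) for ch in "abcd"]   # "d" is refused: window of 3 is full
sender.on_ack(1)                                    # packet 1 acknowledged, window slides
sender.on_timeout()                                 # packets 2 and 3 are resent
```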

GBN: receiver extended FSM

Initial (Λ): expectedseqnum = 1, sndpkt = make_pkt(0, ACK, chksum)
State: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

default:
    udt_send(sndpkt)

If expected packet received:
• Send ACK and deliver packet upstairs

If out-of-order packet received:
• discard (don't buffer) -> no receiver buffering
• Re-ACK pkt with highest in-order seq #
• may generate duplicate ACKs

                                                                                      More on receiver

• The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
• Receiver only sends ACKs (no NAKs)
• Can generate duplicate ACKs
• Need only remember expectedseqnum
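
The receiver behaviour described above fits in a few lines of Python. This is a minimal sketch: the class and method names are hypothetical, the checksum test is reduced to a `corrupt` flag, and an ACK number of 0 means "nothing received in order yet", mirroring the initial make_pkt(0, ACK, chksum).

```python
class GBNReceiver:
    """Go-Back-N receiver sketch: the only state kept is expectedseqnum."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []            # data handed to the upper layer, in order

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one arriving packet and return the ACK number to send back."""
        if not corrupt and seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # Otherwise the packet is discarded (no buffering) and the last ACK is repeated.
        return self.expectedseqnum - 1


receiver = GBNReceiver()
# Packet 2 is lost, so packet 3 arrives early and triggers a duplicate ACK for 1.
acks = [receiver.rdt_rcv(seq, data) for seq, data in [(1, "a"), (3, "c"), (2, "b"), (3, "c")]]
```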

GBN in action

GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission of only the required packets

                                                                                      Selective Repeat

                                                                                      receiver individually acknowledges all correctly received pkts

                                                                                      buffers pkts as needed for eventual in-order delivery to upper layer

                                                                                      sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
• Compare to GBN, which only had a timer for the base packet

sender window
• N consecutive seq #s
• again limits seq #s of sent, unACKed pkts
• Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender
• data from above: if next available seq # is in window, send pkt
• timeout(n): resend pkt n, restart timer
• ACK(n) in [sendbase, sendbase+N-1]: mark pkt n as received; if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver
• pkt n in [rcvbase, rcvbase+N-1]: send ACK(n); out-of-order: buffer; in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
• pkt n in [rcvbase-N, rcvbase-1]: ACK(n) (note: this is a re-ACK)
• otherwise: ignore
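
The receiver rules can be sketched in Python as below. The class and method names are hypothetical, and for simplicity the sketch assumes sequence numbers that start at 0 and never wrap around, which sidesteps the window-size question raised later.

```python
class SRReceiver:
    """Selective repeat receiver sketch (sequence numbers do not wrap)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcvbase = 0               # lowest seq # not yet received
        self.buffer = {}               # out-of-order packets waiting for delivery
        self.delivered = []            # data passed to the upper layer, in order

    def on_packet(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver every consecutive packet starting at rcvbase, then slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n                   # already delivered: re-ACK so the sender can advance
        return None                    # outside both ranges: ignore


rx = SRReceiver(window_size=4)
acks = [rx.on_packet(n, f"pkt{n}") for n in (0, 2, 3, 1)]   # packet 1 arrives late
re_ack = rx.on_packet(0, "pkt0")                            # retransmission of a delivered packet
ignored = rx.on_packet(9, "pkt9")                           # beyond the window [4, 7]
```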

                                                                                      Selective repeat in action

Selective repeat: dilemma

Example:
• seq #s: 0, 1, 2, 3
• window size = 3

• receiver sees no difference in the two scenarios
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
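
The dilemma can be reproduced with a small Python check. The commonly cited answer, which the slide leaves as a question, is that the window size must be at most half the sequence number space; the sketch below tests exactly that condition by asking whether a retransmitted old packet could fall inside the receiver's new window. Function names are hypothetical.

```python
def window(base, size, seq_space):
    """Set of sequence numbers covered by a window starting at `base` (mod seq_space)."""
    return {(base + i) % seq_space for i in range(size)}


def is_ambiguous(window_size, seq_space):
    """True if an old retransmission can be mistaken for new data.

    Worst case: the receiver got a full window starting at 0 and advanced,
    but every ACK was lost, so the sender retransmits the old window.
    """
    old_window = window(0, window_size, seq_space)                       # what the sender resends
    new_window = window(window_size % seq_space, window_size, seq_space) # what the receiver accepts
    return bool(old_window & new_window)


print(is_ambiguous(3, 4))   # seq #s 0..3 with window size 3, as in the example
print(is_ambiguous(2, 4))   # window size 2 is half the sequence space
```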

Chapter 3 outline

• 3.1 Transport-layer services
• 3.2 Multiplexing and demultiplexing
• 3.3 Connectionless transport: UDP
• 3.4 Principles of reliable data transfer
• 3.5 Connection-oriented transport: TCP
    • segment structure
    • reliable data transfer
    • flow control
    • connection management
• 3.6 Principles of congestion control
• 3.7 TCP congestion control

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

• point-to-point: one sender, one receiver
• reliable, in-order byte stream: no "message boundaries"
• pipelined: TCP congestion and flow control set window size
• send & receive buffers
• full duplex data: bi-directional data flow in same connection; MSS: maximum segment size
• connection-oriented: handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
• flow controlled: sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
• Depends upon implementation (can often be set)
• The max amount of application-layer data in a segment
• Application data + TCP header = TCP segment

Three-way handshake
• Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
• Server responds with second special TCP segment (again no payload)
• Client responds with third special segment. This can contain payload.

                                                                                      Even More TCP Details

A TCP connection between client and server creates, in both client and server:
• (i) buffers
• (ii) variables, and
• (iii) a socket connection to the process

TCP only exists in the two end machines
• No buffers or variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, one row per line]
• source port # | dest port #
• sequence number
• acknowledgement number
• head len | not used | flags U A P R S F | receive window
• checksum | urg data pointer
• options (variable length)
• application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
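
To make the layout concrete, here is a hedged Python sketch that unpacks the fixed 20-byte part of a TCP header with the standard `struct` module. The function name, the dictionary keys, and the sample port numbers are illustrative assumptions; options are skipped rather than parsed and the checksum is not verified.

```python
import struct

# Fixed part of the header: ports, seq, ack, offset byte, flags byte, window, checksum, urgent.
TCP_HEADER = struct.Struct("!HHIIBBHHH")   # 20 bytes, network byte order
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields and application data."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = TCP_HEADER.unpack_from(segment)
    header_len = (offset_byte >> 4) * 4    # head len is counted in 32-bit words
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "window": window, "checksum": checksum, "urgent_ptr": urgent,
        "data": segment[header_len:],      # everything after header and options
    }


# Hand-built sample segment (checksum left as 0; it is not computed in this sketch).
sample = TCP_HEADER.pack(1025, 23, 42, 79, 5 << 4,
                         FLAG_BITS["ACK"] | FLAG_BITS["PSH"], 4096, 0, 0) + b"C"
fields = parse_tcp_header(sample)
print(fields["seq"], fields["ack"], sorted(fields["flags"]), fields["data"])
```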

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data

ACKs:
• seq # of next byte expected from other side
• cumulative ACK

Q: how does the receiver handle out-of-order segments?
• A: TCP spec doesn't say - up to implementer

[Figure: simple telnet scenario between Host A and Host B, time running downward]
• User types 'C'. Host A -> Host B: Seq=42, ACK=79, data = 'C'
• Host B ACKs receipt of 'C', echoes back 'C'. Host B -> Host A: Seq=79, ACK=43, data = 'C'
• Host A ACKs receipt of echoed 'C'. Host A -> Host B: Seq=43, ACK=80
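
The numbers in the telnet scenario follow mechanically from the two rules above (seq # is the number of the first data byte; ACK is the next byte expected). A small Python sketch, with a hypothetical `Endpoint` class and in-order delivery assumed, reproduces them:

```python
class Endpoint:
    """Tracks the two byte counters one side of a TCP connection needs."""

    def __init__(self, initial_seq, peer_initial_seq):
        self.next_seq = initial_seq        # seq # of the next byte this side will send
        self.expected = peer_initial_seq   # seq # of the next byte expected from the peer

    def send(self, data=b""):
        segment = (self.next_seq, self.expected, data)   # (Seq, ACK, data)
        self.next_seq += len(data)
        return segment

    def receive(self, segment):
        seq, _ack, data = segment
        if seq == self.expected:           # in-order data only; other cases omitted here
            self.expected += len(data)
        return data


host_a = Endpoint(initial_seq=42, peer_initial_seq=79)
host_b = Endpoint(initial_seq=79, peer_initial_seq=42)

seg1 = host_a.send(b"C")                   # user types 'C'
seg2 = host_b.send(host_b.receive(seg1))   # B ACKs 'C' and echoes it back
host_a.receive(seg2)
seg3 = host_a.send()                       # A ACKs the echoed 'C' (no data)
print(seg1, seg2, seg3)
```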

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
• longer than RTT, but RTT varies
• too short: premature timeout, unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions
• SampleRTT will vary, want estimated RTT "smoother": average several recent measurements, not just current SampleRTT

                                                                                      TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) · EstimatedRTT + α · SampleRTT

• Exponential weighted moving average
• influence of a past sample decreases exponentially fast
• typical value: α = 0.125

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT, EstimatedRTT]

                                                                                      TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
• large variation in EstimatedRTT -> larger safety margin
• first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) · DevRTT + β · |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 · DevRTT
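
The three formulas combine into a small estimator. The Python sketch below is illustrative only: the class name is hypothetical, and two details the slides do not specify are assumptions borrowed from RFC 6298, namely starting DevRTT at half the first sample and updating DevRTT before EstimatedRTT.

```python
class RttEstimator:
    """EWMA-based RTT and timeout estimator (times in any consistent unit)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2        # assumed starting value (as in RFC 6298)

    def update(self, sample_rtt):
        """Fold in one SampleRTT (not from a retransmitted segment); return the new timeout."""
        # DevRTT uses the previous EstimatedRTT, so it is updated first (assumption).
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)   # milliseconds
timeout = estimator.update(120.0)
print(estimator.estimated_rtt, estimator.dev_rtt, timeout)
```

Because DevRTT starts large, the first timeouts are conservative and tighten as samples agree with the estimate.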

Chapter 3 outline

• 3.1 Transport-layer services
• 3.2 Multiplexing and demultiplexing
• 3.3 Connectionless transport: UDP
• 3.4 Principles of reliable data transfer
• 3.5 Connection-oriented transport: TCP
    • segment structure
    • reliable data transfer
    • flow control
    • connection management
• 3.6 Principles of congestion control
• 3.7 TCP congestion control

TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative ACKs
• TCP uses a single retransmission timer

Retransmissions are triggered by:
• timeout events
• duplicate ACKs

Initially consider a simplified TCP sender:
• ignore duplicate ACKs
• ignore flow control, congestion control


TCP sender events

Data received from app:
  Create segment with seq #; seq # is the byte-stream number of the first data byte in the segment
  Start timer if not already running (think of the timer as belonging to the oldest unACKed segment)
  Expiration interval: TimeOutInterval

Timeout:
  Retransmit the segment that caused the timeout
  Restart timer

ACK received:
  If it acknowledges previously unACKed segments:
    update what is known to be ACKed
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }

} /* end of loop forever */

Comment: SendBase-1 is the last cumulatively ACKed byte.
Example: SendBase-1 = 71 and y = 73, so the receiver wants byte 73 onward; y > SendBase, so new data is ACKed.
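
The simplified sender above can be turned into a small runnable sketch. This is an illustration only: the class name `SimplifiedTcpSender`, the `sent` log standing in for "pass segment to IP", and the boolean `timer_running` flag (instead of a real clock) are all assumptions made for the example. As in the slide, duplicate ACKs, flow control and congestion control are ignored.

```python
class SimplifiedTcpSender:
    """Event-driven sketch of the simplified TCP sender (no real network)."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq number -> data, sent but not yet ACKed
        self.timer_running = False  # stands in for the single retransmission timer
        self.sent = []              # log of (seq, length) handed to "IP"

    def _pass_to_ip(self, seq, data):
        self.sent.append((seq, len(data)))

    def on_data_from_app(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self._pass_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        # Retransmit only the not-yet-ACKed segment with the smallest seq number.
        seq = min(self.unacked)
        self._pass_to_ip(seq, self.unacked[seq])
        self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: every segment that ends at or before y is done.
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in done:
                del self.unacked[s]
            # Keep the timer only if segments are still outstanding.
            self.timer_running = bool(self.unacked)


# Lost-ACK scenario: 8 bytes at Seq=92, timeout, retransmit, then ACK=100.
sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_data_from_app(b"x" * 8)
sender.on_timeout()
sender.on_ack(100)
```

After this run `sender.sent` holds two copies of the Seq=92 segment, `sender.send_base` is 100 and the timer is stopped.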


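TCP retransmission scenarios

[Figure: two timing diagrams between Host A and Host B]

Lost ACK scenario: A sends Seq=92, 8 bytes data. B replies ACK=100, but the ACK is lost (X). The Seq=92 timer expires, so A retransmits Seq=92, 8 bytes data. B replies ACK=100 again. SendBase = 100.

Premature timeout: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B replies ACK=100 and ACK=120, but the Seq=92 timer expires before they arrive, so A retransmits Seq=92, 8 bytes data. B replies ACK=120. SendBase moves from 100 to 120.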


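TCP retransmission scenarios (more)

[Figure: timing diagram between Host A and Host B]

Cumulative ACK scenario: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted. SendBase = 120.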


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: Arrival of segment that partially or completely fills gap.
TCP receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
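
The four table rows above can be read as a decision procedure. The sketch below is a simplified illustration, not a receiver implementation: the names `AckAction` and `choose_ack_action` and the boolean inputs `ack_pending` and `gap_exists` are assumptions introduced for the example, and a segment carrying already-received data (a case the table does not cover) is rejected explicitly.

```python
from enum import Enum


class AckAction(Enum):
    DELAYED_ACK = "wait up to 500 ms for next segment, then send ACK"
    CUMULATIVE_ACK = "immediately send one cumulative ACK for both segments"
    DUPLICATE_ACK = "immediately send duplicate ACK for next expected byte"
    IMMEDIATE_ACK = "immediately send ACK"


def choose_ack_action(seg_seq, expected_seq, ack_pending, gap_exists):
    """Map an arriving segment to the receiver action in the table.

    seg_seq      -- sequence number of the arriving segment
    expected_seq -- next in-order byte the receiver is waiting for
    ack_pending  -- True if an earlier in-order segment is still un-ACKed
    gap_exists   -- True if out-of-order data is buffered above expected_seq
    """
    if seg_seq > expected_seq:
        # Out-of-order arrival: a gap is detected (or still open).
        return AckAction.DUPLICATE_ACK
    if seg_seq < expected_seq:
        raise ValueError("already-received data: not covered by the table")
    if gap_exists:
        # Segment starts at the lower end of the gap.
        return AckAction.IMMEDIATE_ACK
    if ack_pending:
        return AckAction.CUMULATIVE_ACK
    return AckAction.DELAYED_ACK


action = choose_ack_action(seg_seq=120, expected_seq=100,
                           ack_pending=False, gap_exists=False)
```

Here a segment with Seq=120 arrives while byte 100 is still expected, so `action` is `AckAction.DUPLICATE_ACK`.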


                                                                                      More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
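
A minimal sketch of the doubling policy described above, assuming a hypothetical `BackoffTimer` class that is handed the current EstimatedRTT and DevRTT values. It tracks only the interval, not an actual clock, and does not model any upper bound on the interval.

```python
class BackoffTimer:
    """Tracks the timeout interval with doubling on consecutive timeouts."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self._from_estimates()

    def _from_estimates(self):
        # TimeoutInterval = EstimatedRTT + 4*DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        # After retransmitting, double the interval.
        self.interval *= 2
        return self.interval

    def on_restart(self):
        # New data from above, or an ACK received: fall back to the estimate.
        self.interval = self._from_estimates()
        return self.interval


timer = BackoffTimer(estimated_rtt=1.0, dev_rtt=0.25)   # seconds
intervals = [timer.interval, timer.on_timeout(), timer.on_timeout()]
after_ack = timer.on_restart()
```

Starting from 2.0 s, two consecutive timeouts give 4.0 s and 8.0 s, and the next ACK brings the interval back to 2.0 s.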


                                                                                      Fast Retransmit

Time-out period is often relatively long:
  long delay before resending a lost packet

Detect lost segments via duplicate ACKs:
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs

If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires


                                                                                      Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
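
The duplicate-ACK branch of the algorithm above can be sketched as follows. The class name `FastRetransmitSender` and the `retransmitted` log are assumptions for illustration; the retransmission timer is left out so that only the counting logic is shown, and the per-value counters are cleared whenever new data is ACKed (one reasonable choice, not something the slide specifies).

```python
from collections import defaultdict


class FastRetransmitSender:
    """Counts duplicate ACKs and records fast retransmissions."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = defaultdict(int)  # ACK value y -> duplicates seen
        self.retransmitted = []           # seq numbers resent early

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance SendBase, forget old counters.
            self.send_base = y
            self.dup_acks.clear()
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3:
                # Fast retransmit: resend segment y before the timer expires.
                self.retransmitted.append(y)


fr_sender = FastRetransmitSender(send_base=100)
for _ in range(4):
    fr_sender.on_ack(100)   # receiver keeps asking for byte 100
```

The third duplicate triggers exactly one early retransmission of segment 100; the fourth duplicate does not trigger another.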


TCP: GBN or Selective Repeat?

                                                                                      Basic TCP looks a lot like GBN

                                                                                      Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                      This looks a lot like Selective Repeat

                                                                                      TCP is a hybrid


                                                                                      TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate the receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

                                                                                      3 Transport Layer 80Comp 361 Spring 2005

TCP Flow Control

Flow control: sender won't overflow the receiver's buffer by transmitting too much, too fast

Receive side of a TCP connection has a receive buffer
App process may be slow at reading from the buffer
Speed-matching service: matching the send rate to the receiving app's drain rate

                                                                                      3 Transport Layer 81Comp 361 Spring 2005

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Annotations:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  Receive window: # bytes receiver is willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)

                                                                                      3 Transport Layer 82Comp 361 Spring 2005

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

Spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including the value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
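
The window arithmetic above is easy to check numerically. In this sketch the function names are hypothetical, and the sender-side variables `last_byte_sent` and `last_byte_acked` are assumed names for "unACKed data"; they do not appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, n_bytes, window):
    """True if sending n_bytes more keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_bytes <= window


# A 4096-byte buffer holding 2000 bytes the app has not read yet.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok_small = sender_may_send(5000, 4000, 1000, window)   # 1000 in flight + 1000
ok_large = sender_may_send(5000, 4000, 1200, window)   # 1000 in flight + 1200
```

The advertised window is 2096 bytes, so with 1000 bytes already in flight the sender may add 1000 more bytes but not 1200.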

                                                                                      3 Transport Layer 83Comp 361 Spring 2005

                                                                                      Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
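
The probing idea above can be illustrated with a toy, tick-based simulation. Everything here is an assumption made for the sketch: the function name `simulate_zero_window`, the one-probe-per-tick schedule, and the rule that a probe byte is simply discarded when the buffer is still full. The point is only that each probe elicits an ACK carrying the current RcvWindow, so the sender eventually learns that the window has opened.

```python
def simulate_zero_window(rcv_buffer, app_reads_per_tick):
    """Return a log of (tick, event, advertised window) tuples.

    rcv_buffer         -- receive buffer size in bytes (starts completely full)
    app_reads_per_tick -- bytes the receiving app reads at each tick
    """
    buffered = rcv_buffer   # buffer starts full, so RcvWindow = 0
    advertised = 0          # last RcvWindow value the sender has heard
    log = []
    for tick, drained in enumerate(app_reads_per_tick):
        buffered = max(0, buffered - drained)   # app drains the buffer
        if advertised == 0:
            # Sender probes with a single data byte.
            if buffered < rcv_buffer:
                buffered += 1                   # probe byte is accepted
            # The ACK for the probe carries the current RcvWindow.
            advertised = rcv_buffer - buffered
            log.append((tick, "probe", advertised))
        else:
            log.append((tick, "window open", advertised))
            break
    return log


probe_log = simulate_zero_window(rcv_buffer=10, app_reads_per_tick=[0, 0, 4, 0])
```

In this run the first two probes come back with RcvWindow = 0; once the app reads 4 bytes, the third probe is accepted and its ACK advertises a window of 3, which ends the stall.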


                                                                                      Note on UDP

                                                                                      UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
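
A toy model of the behaviour described above, for contrast with TCP's advertised window. The class name `UdpSocketBuffer` is hypothetical, and capacity is counted in datagrams purely to keep the sketch short (real socket buffers are sized in bytes).

```python
from collections import deque


class UdpSocketBuffer:
    """Bounded receive queue: arrivals are dropped when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumption: measured in datagrams
        self.queue = deque()
        self.dropped = 0

    def deliver(self, datagram):
        """Called when a datagram arrives; returns False if it was lost."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1      # no feedback reaches the sender
            return False
        self.queue.append(datagram)
        return True


udp_buf = UdpSocketBuffer(capacity=2)
results = [udp_buf.deliver(d) for d in (b"a", b"b", b"c")]
```

The third datagram arrives at a full buffer and is silently lost; nothing tells the sender to slow down.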


                                                                                      TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket("hostname", "port number");
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

                                                                                      3 Transport Layer 87Comp 361 Spring 2005

                                                                                      TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and ack=server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake between client and server]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
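
The sequence and acknowledgement numbers in the three handshake segments can be generated mechanically. This sketch only builds the header fields; the `Segment` dataclass and `three_way_handshake` function are hypothetical names, no sockets are involved, and the modulo reflects the assumption that sequence numbers are 32-bit values that wrap around.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2 ** 32   # sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None   # None means the ACK field is not valid


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments in the order they are sent."""
    # Step 1: client -> server, connection request.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server -> client, connection granted.
    synack = Segment(syn=1, seq=server_isn, ack=(syn.seq + 1) % SEQ_SPACE)
    # Step 3: client -> server, SYN=0 signals the connection is established.
    ack = Segment(syn=0,
                  seq=(client_isn + 1) % SEQ_SPACE,
                  ack=(synack.seq + 1) % SEQ_SPACE)
    return [syn, synack, ack]


handshake = three_way_handshake(client_isn=1000, server_isn=5000)
```

With `client_isn = 1000` and `server_isn = 5000`, the SYNACK carries ack=1001 and the final ACK carries seq=1001, ack=5001.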

                                                                                      3 Transport Layer 88Comp 361 Spring 2005

                                                                                      TCP Connection Management (cont)

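Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timing diagram of connection close. Client sends FIN (close); server replies ACK, then sends its own FIN (close); client replies ACK and enters timed wait; closed.]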

                                                                                      3 Transport Layer 89Comp 361 Spring 2005

                                                                                      TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: the same FIN/ACK exchange with states labelled. Client: closing, timed wait, closed. Server: closing, closed.]

                                                                                      3 Transport Layer 90Comp 361 Spring 2005

                                                                                      TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                                      3 Transport Layer 91Comp 361 Spring 2005

                                                                                      A few special cases

                                                                                      Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                      It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

                                                                                      3 Transport Layer 92Comp 361 Spring 2005

                                                                                      Chapter 3 outline

                                                                                      31 Transport-layer services32 Multiplexing and demultiplexing33 Connectionless transport UDP34 Principles of reliable data transfer

                                                                                      35 Connection-oriented transport TCP

                                                                                      segment structurereliable data transferflow controlconnection management

                                                                                      36 Principles of congestion control37 TCP congestion control

                                                                                      3 Transport Layer 93Comp 361 Spring 2005

                                                                                      Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!

                                                                                      3 Transport Layer 94Comp 361 Spring 2005

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

                                                                                      3 Transport Layer 95Comp 361 Spring 2005

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

Notation: λin is the original data rate, λ'in is the offered load (original data plus retransmissions), and λout is the goodput.

(a), (b) and (c): always λin = λout (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
  (b) and (c): more work (retransmissions) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of a packet

[Figure: three plots, panels (a), (b) and (c)]

Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

                                                                                      Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
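
The ER and EFCI rules on this slide can be sketched in a few lines of Python. This is only an illustrative model, not ATM cell handling: the `RMCell` class, the function names, and the per-switch rates are hypothetical, and each switch is assumed to lower ER to the rate it can support.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate carried in the RM cell
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def forward_through(cell, switch_rates):
    """Hypothetical model: each switch lowers ER to the rate it can support."""
    for rate in switch_rates:
        cell.er = min(cell.er, rate)
    return cell


def return_from_destination(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell preceding the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


cell = forward_through(RMCell(er=100.0), switch_rates=[80.0, 45.0, 60.0])
cell = return_from_destination(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)
```

In this model the ER that reaches the sender is the smallest rate seen along the path, which is the "minimum supportable rate on path" from the slide.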

TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

  throughput = (w × MSS) / RTT bytes/sec
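
As a quick check of the relation throughput = w × MSS / RTT, here is a minimal Python sketch. The function name and the sample values (one 500-byte segment per 200 ms round trip) are chosen for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes each are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


rate_bytes = window_throughput(w_segments=1, mss_bytes=500, rtt_seconds=0.2)
print(rate_bytes, "bytes/sec")
print(rate_bytes * 8 / 1000, "kbps")
```

Doubling w doubles the rate, which is why adjusting the window is enough to control how fast the sender transmits.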

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
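
The sawtooth produced by AIMD can be reproduced with a short simulation. This is a sketch under a simplifying assumption that is not in the slides: a loss event is assumed to occur whenever the window reaches a fixed size (`loss_at_mss`), and the window is counted in whole MSS units.

```python
def aimd_trace(start_mss, loss_at_mss, rounds):
    """CongWin (in MSS) at each RTT: +1 MSS per RTT, halved after a loss.

    Assumption: a loss event occurs whenever the window reaches loss_at_mss.
    """
    window = start_mss
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window >= loss_at_mss:
            window = max(1, window // 2)   # multiplicative decrease
        else:
            window += 1                    # additive increase
    return trace


trace = aimd_trace(start_mss=8, loss_at_mss=16, rounds=20)
print(trace)
```

The window climbs by one MSS per round and falls back to half its peak, so it oscillates between 8 and 16 MSS in this run.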

                                                                                      TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                                      TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments one RTT later, then four segments]
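
The doubling follows directly from the per-ACK increment: a window of w segments produces w ACKs, and each ACK adds one MSS. A minimal sketch, assuming every segment in the window is acknowledged within one RTT (no loss) and using the MSS = 500 bytes, RTT = 200 msec example values:

```python
def slow_start_windows(num_rtts, initial_mss=1):
    """CongWin (in MSS) at the start of each RTT during slow start."""
    window = initial_mss
    history = []
    for _ in range(num_rtts):
        history.append(window)
        acks = window            # assumption: every segment sent is ACKed
        for _ in range(acks):
            window += 1          # +1 MSS for every ACK received
    return history


MSS_BYTES = 500
RTT_SECONDS = 0.2
for rtt_index, w in enumerate(slow_start_windows(5)):
    rate_kbps = w * MSS_BYTES * 8 / RTT_SECONDS / 1000
    print(f"RTT {rtt_index}: CongWin = {w} MSS, rate = {rate_kbps:.0f} kbps")
```

The first round sends at 20 kbps, and each later round sends at twice the rate of the one before it.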

So far
  Slow-Start ramps up exponentially
  Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
  Introduce new variable: threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                      The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
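
The event table maps naturally onto a small state machine. The following Python sketch mirrors the table's actions only; it is not a TCP implementation. The class and method names are hypothetical, the window is measured in units of MSS, and the starting Threshold of 64 MSS is an arbitrary stand-in for "initially very large".

```python
from dataclasses import dataclass

MSS = 1  # windows are counted in MSS units, so fractional values are allowed


@dataclass
class RenoSender:
    cong_win: float = 1 * MSS
    threshold: float = 64 * MSS   # assumption: stands in for "very large"
    state: str = "SS"             # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Count the duplicate; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, 1 * MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = 1 * MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(cong_win=10, threshold=100, state="CA")
for _ in range(3):
    sender.on_duplicate_ack()
print(sender.state, sender.cong_win, sender.threshold)
sender.on_timeout()
print(sender.state, sender.cong_win, sender.threshold)
```

A Tahoe-style sender could be modeled in the same way by having the third duplicate ACK call `on_timeout()` instead of halving the window.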

                                                                                      TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2 RTT)
  Average throughput: 0.75 W/RTT
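
The 0.75 factor comes from averaging a window that rises linearly from W/2 back up to W. A small numeric check, assuming an idealized sawtooth and using example values for W and RTT that are not from the slides:

```python
def average_window(w_max, steps=1000):
    """Mean of a window that rises linearly from w_max/2 to w_max."""
    samples = [w_max / 2 + (w_max / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


W = 20       # window (segments) at the moment of loss, example value
RTT = 0.1    # seconds, example value
print(average_window(W) / RTT, "segments/sec")
print(0.75 * W / RTT, "segments/sec from the 0.75 W/RTT rule")
```

The mean of a linear ramp is its midpoint, (W/2 + W)/2 = 0.75 W, which is what the simulation reproduces.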

                                                                                      TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives L = 2 × 10^-10 (Wow)
New versions of TCP for high-speed needed
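
The two numbers in this example can be recomputed from W = throughput × RTT / MSS and from the loss-rate relation throughput = 1.22 × MSS / (RTT × √L), solved for L. A short sketch, with variable names chosen for illustration:

```python
import math

MSS_BITS = 1500 * 8      # 1500-byte segments
RTT = 0.100              # seconds
TARGET_BPS = 10e9        # 10 Gbps

# Window needed to keep the pipe full.
window_segments = TARGET_BPS * RTT / MSS_BITS

# Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L.
loss_rate = (1.22 * MSS_BITS / (RTT * TARGET_BPS)) ** 2


def throughput_bps(loss, mss_bits=MSS_BITS, rtt=RTT):
    """Throughput predicted by the loss-rate relation."""
    return 1.22 * mss_bits / (rtt * math.sqrt(loss))


print(round(window_segments), "segments in flight")
print(f"required loss rate: {loss_rate:.2e}")
print(f"throughput at that loss rate: {throughput_bps(loss_rate):.3e} bps")
```

The required loss rate comes out near 2 × 10^-10, roughly one loss in every five billion segments.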

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
  additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
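
The convergence argument can be checked with a toy simulation of two AIMD flows. It rests on simplifying assumptions that are not stated in the slides: both flows have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units.

```python
def share_bottleneck(x1, x2, capacity, rounds):
    """Two AIMD flows; both halve when their total exceeds the capacity."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # multiplicative decrease for both
        else:
            x1, x2 = x1 + 1, x2 + 1    # additive increase, slope 1
    return x1, x2


x1, x2 = share_bottleneck(x1=70.0, x2=10.0, capacity=100.0, rounds=500)
print(f"flow 1: {x1:.3f}, flow 2: {x2:.3f}, gap: {abs(x1 - x2):.5f}")
```

Additive increase leaves the gap between the two flows unchanged, while every multiplicative decrease halves it, so the rates drift toward the equal-share line.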

Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents an app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2
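
The arithmetic behind the parallel-connection example is a simple per-connection split. A minimal sketch, assuming every TCP connection on the link receives an equal share (the function name is illustrative):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by the app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 connections the new app holds 11 of the 20 connections on the link, which is slightly more than the R/2 quoted above.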

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
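
Both fixed-window cases fit in one small function. This is a sketch under one assumption the slides leave implicit at this point: K is taken to be the number of windows needed to cover the object, computed as ceil(O / (W·S)). Parameter names follow the slide notation and the sample values are made up.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an O-bit object, S-bit segments, link rate R (bits/sec),
    round-trip time RTT (sec) and a fixed window of W segments."""
    if W * S / R >= RTT + S / R:
        # Case 1: ACKs return before the window is exhausted, no stalls.
        return 2 * RTT + O / R
    # Case 2: the server stalls after every window except the last.
    K = math.ceil(O / (W * S))           # assumption: windows covering the object
    stall = S / R + RTT - W * S / R
    return 2 * RTT + O / R + (K - 1) * stall


# Example values: 1000-byte segments, 10-segment object, 1 Mbps link, 100 ms RTT.
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, RTT=0.1, W=4))
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, RTT=0.1, W=20))
```

With W = 4 the server stalls twice (K = 3), while W = 20 keeps the link busy so that only the 2 RTT and O/R terms remain.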

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]
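
The quantities K, Q and P can be computed directly from their definitions rather than from closed forms. In this sketch the windows are 1, 2, 4, ... segments, and the server is taken to idle after the kth window when S/R + RTT - 2^(k-1)·S/R is positive. The function name and the link parameters in the example are hypothetical; they were picked so that the result matches the O/S = 15 example.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for slow start with no threshold and no loss.

    O and S are integer bit counts, R is in bits/sec, RTT in seconds.
    """
    segments = -(-O // S)                 # ceil(O / S) using integer arithmetic
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K, covered = 0, 0
    while covered < segments:
        covered += 2 ** K
        K += 1
    # Q: number of windows after which the server would idle
    # if the object were of infinite size.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# 15 segments of 8000 bits on a 1 Mbps link with a 16 ms RTT (assumed values).
K, Q, P, latency = slow_start_latency(O=120_000, S=8_000, R=1e6, RTT=0.016)
print(K, Q, P, latency)
```

For these inputs the sketch gives K = 4, Q = 2 and P = 2, the same values as the example above.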

                                                                                      TCP Latency Modeling (3)

                                                                                      ementacknowledg receivesserver until

                                                                                      segment send tostartsserver whenfrom time=+ RTTRS

                                                                                      RS

                                                                                      RSRTTPRTT

                                                                                      RO

                                                                                      RSRTT

                                                                                      RSRTT

                                                                                      RO

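\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
             &= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right] \\
             &= \frac{O}{R} + 2\,RTT + P\left[ RTT + \frac{S}{R} \right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

where

\[
\left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right]^{+} = \text{idle time after the } k\text{th window}
\]

\[
2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}
\]

[Figure: client/server timeline (time at client, time at server) showing: initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. RTT is marked on the client side.]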
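As a quick numerical check of the slow-start latency expression (O/R + 2RTT plus the idle times, and its closed form), the following is a minimal Python sketch. The function names and the example values are illustrative assumptions, not part of the slides, and P (the number of times the server idles) is simply passed in as a given.

```python
def delay_from_idle_times(O, R, S, RTT, P):
    """O/R + 2 RTT + sum over k = 1..P of [S/R + RTT - 2^(k-1) S/R]."""
    idle = sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))
    return O / R + 2 * RTT + idle


def delay_closed_form(O, R, S, RTT, P):
    """O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical example values (assumptions, not taken from the slides).
O = 40_000    # object size in bits
S = 4_288     # segment size in bits (536 bytes)
R = 100_000   # link rate in bits per second
RTT = 0.1     # round-trip time in seconds
P = 2         # number of idle periods, taken as given here

summed = delay_from_idle_times(O, R, S, RTT, P)
closed = delay_closed_form(O, R, S, RTT, P)
print(f"summed idle times: {summed:.5f} s, closed form: {closed:.5f} s")
```

Both functions should agree for any P, since the closed form only evaluates the geometric sum of the window transmission times.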


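TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, the number of idles for an infinite-size object, is similar.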
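The calculation of K can be cross-checked by counting windows directly. The sketch below is illustrative only: the function names and the sample sizes are assumptions, and it simply compares the counting definition (windows of S, 2S, 4S, ... bits until the object is covered) against the closed form.

```python
import math


def windows_by_counting(O, S):
    """Smallest k such that S + 2S + ... + 2^(k-1) S >= O (O, S in bits, O > 0)."""
    k, sent, window = 0, 0, S
    while sent < O:
        sent += window   # send the next window
        window *= 2      # slow start doubles the window each round
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


# Hypothetical sizes: a 40,000-bit object sent in 4,288-bit segments.
print(windows_by_counting(40_000, 4_288), windows_closed_form(40_000, 4_288))
```

With these sample sizes the first three windows carry 7S = 30,016 bits, which is less than O, so a fourth window is needed.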


HTTP Modeling

Assume Web page consists of:
    1 base HTML page (of size O bits)
    M images (each of size O bits)

Non-persistent HTTP:
    M+1 TCP connections in series
    Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
    2 RTT to request and receive base HTML file
    1 RTT to request and receive M images
    Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
    Suppose M/X is an integer
    1 TCP connection for base file
    M/X sets of parallel connections for images
    Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
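The three response-time expressions can be compared side by side with a small Python sketch. The function names are hypothetical, the sum of idle times is left as an optional argument that defaults to zero (so the numbers below exclude slow-start idling), and 1 Kbyte is assumed to be 1000 bytes.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M+1) 2 RTT + sum of idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + 3 RTT + sum of idle times."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M/X + 1) 2 RTT + sum of idle times; M/X must be an integer."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


# Illustrative parameters: O = 5 Kbytes (assumed 40,000 bits), M = 10, X = 5,
# RTT = 100 msec, and an assumed link rate of 1 Mbps.
O, M, X, RTT, R = 40_000, 10, 5, 0.1, 1_000_000
print("non-persistent:         ", non_persistent(M, O, R, RTT))
print("persistent:             ", persistent(M, O, R, RTT))
print("parallel non-persistent:", parallel_non_persistent(M, X, O, R, RTT))
```

Because the transmission term (M+1)O/R is common to all three, the models differ only in how many round trips they pay for, which is why the gap widens as RTT grows.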


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays.

Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3 Summary

principles behind transport layer services:
    multiplexing, demultiplexing
    reliable data transfer
    flow control
    congestion control

instantiation and implementation in the Internet:
    UDP
    TCP

Next:
    leaving the network "edge" (application, transport layers)
    into the network "core"


Pipelining: increased utilization

[Figure: sender/receiver timeline with three pipelined packets. Sender side: first packet bit transmitted, t = 0; last bit transmitted, t = L/R; ACK arrives, send next packet, t = RTT + L/R. Receiver side: first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK.]

\[
U_{\text{sender}} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008
\]

Increase utilization by a factor of 3
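The utilization figure above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, and L/R = 0.008 ms with RTT = 30 ms are values chosen because they give the 0.024 / 30.008 ratio; the cap at 1.0 is an added safeguard for large windows.

```python
def sender_utilization(pkt_time_ms, rtt_ms, n_pipelined=1):
    """Fraction of time the sender is busy when n packets are sent per RTT + L/R."""
    busy = n_pipelined * pkt_time_ms
    # The sender cannot be more than 100% busy, so cap the ratio at 1.0.
    return min(1.0, busy / (rtt_ms + pkt_time_ms))


L_OVER_R_MS = 0.008   # assumed transmission time of one packet, in ms
RTT_MS = 30.0         # assumed round-trip time, in ms

stop_and_wait = sender_utilization(L_OVER_R_MS, RTT_MS, n_pipelined=1)
pipelined = sender_utilization(L_OVER_R_MS, RTT_MS, n_pipelined=3)
print(f"stop-and-wait: {stop_and_wait:.6f}, 3-packet pipeline: {pipelined:.6f}")
```

With three packets in flight the numerator triples while the denominator is unchanged, hence the factor-of-3 improvement.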


Go-Back-N

Sender:
    k-bit seq # in pkt header
    "window" of up to N consecutive unack'ed pkts allowed

ACK(n): ACKs all pkts up to, including seq # n ("cumulative ACK")
    may receive duplicate ACKs (see receiver)

Only one timer, for the oldest unacknowledged pkt
timeout(n): retransmit pkt n and all higher seq # pkts in window
Called a sliding-window protocol


GBN Sender

rdt_send() called: checks to see if the window is full.
    No: send out packet
    Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.
    This is the only event that triggers a resend.


GBN sender extended FSM

State: Wait

Initial transition (Λ):
    base = 1
    nextseqnum = 1

Event: rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

Event: timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

Event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
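The sender FSM can be sketched as a small Python class. This is a minimal illustration, not a definitive implementation: the class name, the `udt_send` callback, the boolean standing in for the timer, and the `max()` guard against stale ACKs are all assumptions added for the example, and checksums are omitted.

```python
class GBNSender:
    """Go-Back-N sender following the rdt_send / timeout / ACK events of the FSM."""

    def __init__(self, N, udt_send):
        self.N = N                  # window size
        self.udt_send = udt_send    # hypothetical callback that transmits a packet
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> packet, kept until acknowledged
        self.timer_running = False  # stands in for start_timer / stop_timer

    def rdt_send(self, data):
        if self.nextseqnum < self.base + self.N:
            pkt = (self.nextseqnum, data)       # make_pkt without a checksum
            self.sndpkt[self.nextseqnum] = pkt
            self.udt_send(pkt)
            if self.base == self.nextseqnum:
                self.timer_running = True       # start_timer
            self.nextseqnum += 1
            return True
        return False                            # refuse_data: window is full

    def timeout(self):
        self.timer_running = True               # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])     # resend every unACKed packet

    def rdt_rcv_ack(self, acknum, corrupt=False):
        if corrupt:
            return                              # corrupt ACK: no action
        # max() is a defensive addition (not in the FSM) so old ACKs cannot
        # move base backwards.
        new_base = max(self.base, acknum + 1)
        for seq in range(self.base, new_base):
            self.sndpkt.pop(seq, None)
        self.base = new_base
        self.timer_running = self.base != self.nextseqnum


sent = []
sender = GBNSender(N=4, udt_send=sent.append)
accepted = [sender.rdt_send(d) for d in "abcde"]  # fifth call is refused
sender.rdt_rcv_ack(2)                             # cumulative ACK for 1 and 2
sender.timeout()                                  # resends packets 3 and 4
print(accepted, sent)
```

Keeping a single timer flag mirrors the slide's point that GBN tracks only the oldest unacknowledged packet.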


GBN receiver extended FSM

State: Wait

Initial transition (Λ):
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

Event: default
    udt_send(sndpkt)

If expected packet received:
    Send ACK and deliver packet upstairs

If out-of-order packet received:
    discard (don't buffer) -> no receiver buffering
    Re-ACK pkt with highest in-order seq #
    may generate duplicate ACKs


More on receiver

The receiver always sends an ACK for the correctly received packet with the highest in-order seq #.
    Receiver only sends ACKs (no NAKs)
    Can generate duplicate ACKs
    Need only remember expectedseqnum
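The receiver behaviour just described fits in a few lines of Python. The sketch below is illustrative: the class name and the two callbacks are hypothetical, checksums are reduced to a `corrupt` flag, and the only state kept is `expectedseqnum` plus the last ACK number sent.

```python
class GBNReceiver:
    """Go-Back-N receiver: deliver in-order packets, re-ACK everything else."""

    def __init__(self, udt_send, deliver_data):
        self.udt_send = udt_send          # hypothetical callback that sends an ACK number
        self.deliver_data = deliver_data  # hypothetical callback to the upper layer
        self.expectedseqnum = 1
        self.last_ack = 0                 # ACK 0 until the first in-order packet arrives

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.deliver_data(data)
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # In every other case the packet is discarded (no buffering) and the
        # most recent in-order ACK is sent again, possibly as a duplicate.
        self.udt_send(self.last_ack)


acks, delivered = [], []
receiver = GBNReceiver(udt_send=acks.append, deliver_data=delivered.append)
for seq, data in [(1, "a"), (2, "b"), (4, "d"), (3, "c"), (4, "d")]:
    receiver.rdt_rcv(seq, data)
print(acks, delivered)
```

In this run packet 4 arrives early, is dropped, and triggers a duplicate ACK for 2; it is only delivered when it arrives again after packet 3.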


GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.
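To make the cost of a single loss concrete, here is a rough Python sketch. The function names are hypothetical and the numbers are assumptions (RTT = 30 ms and a packet transmission time of 8 microseconds); it only counts packets and ignores timers and ACK traffic.

```python
def packets_in_pipeline(rtt_us, pkt_time_us):
    """Approximate number of packets in flight when the pipe is kept full."""
    return rtt_us // pkt_time_us


def gbn_resent(outstanding, lost_index):
    """GBN resends the lost packet and every later outstanding packet."""
    return outstanding - lost_index


def sr_resent():
    """Selective Repeat resends only the packet whose ACK is missing."""
    return 1


in_flight = packets_in_pipeline(rtt_us=30_000, pkt_time_us=8)
print("packets in flight:", in_flight)
print("GBN resends:", gbn_resent(in_flight, lost_index=0))
print("SR resends: ", sr_resent())
```

The gap between the two counts grows with the bandwidth-delay product, which is the motivation for letting the receiver buffer out-of-order packets.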


                                                                                        Selective Repeat

                                                                                        receiver individually acknowledges all correctly received pkts

                                                                                        buffers pkts as needed for eventual in-order delivery to upper layer

                                                                                        sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #'s of sent, unACKed pkts
  Important: window size < seq # range


Selective repeat: sender, receiver windows


Selective repeat

sender

data from above:
  if next available seq # in window, send pkt

timeout(n):
  resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n is smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)

otherwise:
  ignore
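The receiver events above can be sketched in a few lines of Python. This is a minimal illustration, not a full protocol: it assumes sequence numbers never wrap around, and the class and method names are hypothetical.

```python
class SRReceiver:
    """Selective Repeat receiver sketch (sequence numbers assumed not to wrap)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.rcv_base = 0
        self.buffer = {}      # seq # -> data for packets waiting for delivery
        self.delivered = []   # data handed to the upper layer, in order

    def on_packet(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcv_base <= n < self.rcv_base + self.window_size:
            self.buffer.setdefault(n, data)
            # Deliver the in-order run starting at rcv_base and slide the window.
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1
            return n
        if self.rcv_base - self.window_size <= n < self.rcv_base:
            return n          # already delivered: re-ACK so the sender can advance
        return None           # outside both ranges: ignore


receiver = SRReceiver(window_size=4)
print(receiver.on_packet(1, "b"), receiver.delivered)   # 1 []
print(receiver.on_packet(0, "a"), receiver.delivered)   # 0 ['a', 'b']
```

Packet 1 arrives first and is buffered; once packet 0 arrives, both are delivered in order and the window base moves to 2.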


                                                                                        Selective repeat in action


Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
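One way to explore this question is to compare the sequence numbers the sender may still retransmit with those the receiver treats as new. The sketch below assumes the worst case, in which the receiver got a full window but every ACK was lost; the function name is hypothetical. It points to the usual answer: the window size should be at most half the sequence number space.

```python
def windows_overlap(seq_space, window):
    """True if a retransmitted old packet could be mistaken for a new one."""
    # Sender never saw an ACK, so it may still resend seq #'s 0 .. window-1.
    old = {n % seq_space for n in range(0, window)}
    # Receiver got them all, so its window now covers the next `window` seq #'s.
    new = {n % seq_space for n in range(window, 2 * window)}
    return bool(old & new)


print(windows_overlap(seq_space=4, window=3))   # True: the dilemma on this slide
print(windows_overlap(seq_space=4, window=2))   # False: window <= seq_space / 2
```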


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) initializes sender and receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

[figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way Handshake
  Client sends special TCP segment to server requesting connection; no payload (application data) in this segment
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment; this can contain payload


                                                                                        Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables
  (iii) a socket connection to the process

TCP only exists in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the client and the server


TCP segment structure

Fields (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
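To make the header layout concrete, the sketch below decodes the fixed 20-byte part of a TCP header with Python's standard struct module. The function name, the dictionary keys, and the example field values are assumptions made for illustration; options are not parsed.

```python
import struct

# Bit masks for the six flag bits (U A P R S F, from high bit to low bit).
FLAGS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(raw):
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": (offset_byte >> 4) * 4,   # head len counts 32-bit words
        "flags": [name for name, bit in FLAGS.items() if flag_byte & bit],
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
    }


# Hypothetical header: ports 1234 -> 80, Seq=42, ACK=79, only the ACK flag set.
raw = struct.pack("!HHIIBBHHH", 1234, 80, 42, 79, 5 << 4, FLAGS["ACK"], 4096, 0, 0)
header = parse_tcp_header(raw)
print(header["seq"], header["ack"], header["header_len"], header["flags"])
# prints: 42 79 20 ['ACK']
```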


TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; up to implementer

Simple telnet scenario (Host A, Host B; time runs downward)
  User at Host A types 'C'
  A -> B: Seq=42, ACK=79, data = 'C'
  Host B ACKs receipt of 'C', echoes back 'C'
  B -> A: Seq=79, ACK=43, data = 'C'
  Host A ACKs receipt of echoed 'C'
  A -> B: Seq=43, ACK=80
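The numbers in the telnet scenario follow from two counters per host. The sketch below is a simplified model with hypothetical class and attribute names; it handles in-order data only and ignores loss, timers, and the handshake.

```python
class Endpoint:
    """Tracks the two numbers one side of a TCP connection puts in each segment."""

    def __init__(self, next_seq, next_expected):
        self.next_seq = next_seq            # seq # of the next byte we will send
        self.next_expected = next_expected  # seq # of the next byte we expect

    def send(self, data=b""):
        segment = (self.next_seq, self.next_expected, data)  # (Seq, ACK, data)
        self.next_seq += len(data)          # seq #'s count bytes, not segments
        return segment

    def receive(self, segment):
        seq, _ack, data = segment
        if seq == self.next_expected:       # in-order data only in this sketch
            self.next_expected += len(data)


host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

s1 = host_a.send(b"C")      # user types 'C'
host_b.receive(s1)
s2 = host_b.send(b"C")      # B ACKs 'C' and echoes it back
host_a.receive(s2)
s3 = host_a.send()          # A ACKs the echo, no data
host_b.receive(s3)
print(s1, s2, s3)
# prints: (42, 79, b'C') (79, 43, b'C') (43, 80, b'')
```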


                                                                                        TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT


                                                                                        TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
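The update rule can be tried out directly. In the sketch below the starting estimate and the sample values (in milliseconds) are made up for illustration, and the function name is hypothetical.

```python
ALPHA = 0.125  # typical value from the slide


def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=ALPHA):
    """One step of the exponential weighted moving average."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


estimated = 100.0                       # assumed starting estimate, in ms
for sample in [180.0, 100.0, 100.0]:    # one spike, then back to normal
    estimated = update_estimated_rtt(estimated, sample)
    print(estimated)
# prints 110.0, then 108.75, then 107.65625
```

A single 180 ms spike moves the estimate by only 10 ms, and its influence then shrinks by a factor of (1 - α) with each later sample.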


Example RTT estimation

[figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds), y-axis: RTT (milliseconds), roughly 100 to 350; curves: SampleRTT and EstimatedRTT]


                                                                                        TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

  Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
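Putting the two estimators together gives a small timeout calculator. This is a sketch under stated assumptions: the class name and the starting values are hypothetical, and DevRTT is updated using the EstimatedRTT from before the new sample is folded in, an ordering the slide does not specify.

```python
class RttEstimator:
    """Keeps EstimatedRTT and DevRTT and derives TimeoutInterval from them."""

    def __init__(self, estimated_rtt, dev_rtt, alpha=0.125, beta=0.25):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.alpha = alpha
        self.beta = beta

    def update(self, sample_rtt):
        # Assumption: the deviation is measured against the previous estimate.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(estimated_rtt=100.0, dev_rtt=10.0)   # assumed values, in ms
print(estimator.update(120.0))
# prints: 152.5
```

Here DevRTT becomes 12.5 and EstimatedRTT becomes 102.5, so the timeout is 102.5 + 4 * 12.5 = 152.5 ms.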


                                                                                        TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events:

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  if it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
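The simplified sender can also be written as a small Python class for experimenting with the three events. This is a sketch, not a real TCP: the class and attribute names are hypothetical, the timer is reduced to a boolean flag, "pass segment to IP" is modeled as appending to a list, and sequence number wraparound is ignored.

```python
class SimplifiedTCPSender:
    """Event handlers of the simplified TCP sender (no dup ACKs, no flow control)."""

    def __init__(self, initial_seq=0):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # seq # -> data not yet acknowledged
        self.timer_running = False
        self.sent = []              # log of (seq #, data) passed to IP

    def on_app_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True           # start timer
        self.sent.append((seq, data))           # pass segment to IP
        self.next_seq += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)             # smallest not-yet-acked seq #
            self.sent.append((seq, self.unacked[seq]))
            self.timer_running = True           # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment whose bytes all lie below y.
            self.unacked = {s: d for s, d in self.unacked.items() if s + len(d) > y}
            self.timer_running = bool(self.unacked)


sender = SimplifiedTCPSender(initial_seq=92)
sender.on_app_data(b"x" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)    # Seq=100, 20 bytes
sender.on_ack(120)               # ACK=100 was lost, cumulative ACK=120 arrives
print(sender.send_base, sender.unacked, sender.timer_running)
# prints: 120 {} False
```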


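TCP: retransmission scenarios

Lost ACK scenario (Host A sends to Host B; time runs downward)
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100 (lost, X)
  timeout at A
  A -> B: Seq=92, 8 bytes data (retransmission)
  B -> A: ACK=100
  SendBase = 100

Premature timeout scenario (Host A sends to Host B; time runs downward)
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100
  B -> A: ACK=120
  Seq=92 timeout at A, before ACK=100 arrives
  A -> B: Seq=92, 8 bytes data (retransmission)
  SendBase = 100, then SendBase = 120 as the two ACKs arrive
  B -> A: ACK=120
  SendBase = 120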


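TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A sends to Host B; time runs downward)
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100 (lost, X)
  B -> A: ACK=120 (arrives before the timeout)
  SendBase = 120; no retransmission needed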


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap


                                                                                        More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
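The doubling rule can be sketched as a tiny state holder. The class and method names are hypothetical, and the base interval passed in stands for the value computed from EstimatedRTT and DevRTT.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, base_interval):
        self.interval = base_interval   # EstimatedRTT + 4 * DevRTT

    def on_timeout(self):
        """Called after a retransmission caused by a timeout."""
        self.interval *= 2
        return self.interval

    def on_restart(self, base_interval):
        """Called when new data or an ACK restarts the timer."""
        self.interval = base_interval
        return self.interval


timer = BackoffTimer(base_interval=1.0)          # assumed 1 second base value
print([timer.on_timeout() for _ in range(3)])    # [2.0, 4.0, 8.0]
print(timer.on_restart(base_interval=1.5))       # 1.5
```

Backing off this way spaces out retransmissions when the path is likely congested, which is why the slide calls it a limited form of congestion control.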


                                                                                        Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires


                                                                                        Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
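The ACK handler with fast retransmit can be sketched in Python as follows. The class and attribute names are hypothetical, the timer is omitted, retransmission is modeled as appending to a list, and ACKs are assumed to fall on segment boundaries.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with duplicate-ACK counting (timer logic left out)."""

    def __init__(self, send_base, unacked):
        self.send_base = send_base
        self.unacked = dict(unacked)    # seq # -> data still awaiting an ACK
        self.dup_acks = Counter()       # ACK value y -> number of duplicates seen
        self.resent = []                # log of fast-retransmitted (seq #, data)

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance SendBase, forget acked segments.
            self.send_base = y
            self.unacked = {s: d for s, d in self.unacked.items() if s >= y}
        else:
            # Duplicate ACK for data that was already acknowledged.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3 and y in self.unacked:
                self.resent.append((y, self.unacked[y]))   # fast retransmit


sender = FastRetransmitSender(92, {92: b"a" * 8, 100: b"b" * 20, 120: b"c" * 15})
sender.on_ack(100)              # first ACK=100 acknowledges new data
for _ in range(3):
    sender.on_ack(100)          # three duplicates: segment 100 was likely lost
print(sender.send_base, [seq for seq, _ in sender.resent])
# prints: 100 [100]
```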


TCP: GBN or Selective Repeat?

                                                                                        Basic TCP looks a lot like GBN

                                                                                        Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                        This looks a lot like Selective Repeat

                                                                                        TCP is a hybrid


                                                                                        TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

[Figure: TCP segment layout, 32 bits wide]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flag bits (URG, ACK, PSH, RST, SYN, FIN), Receive window
  checksum, Urg data pointer
  Options (variable length)
  application data (variable length)

Annotations:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
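As a worked sketch of the RcvWindow formula and the sender-side limit, assuming byte counters held as plain integers. The function names `rcv_window` and `sender_may_send` are hypothetical, chosen only for this example.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, in bytes."""
    buffered = last_byte_rcvd - last_byte_read   # bytes waiting for the app
    if not 0 <= buffered <= rcv_buffer:
        raise ValueError("buffered data must fit in the receive buffer")
    return rcv_buffer - buffered


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= window
```

For example, a 4096-byte buffer holding 2000 unread bytes advertises a window of 2096 bytes; a sender with 1000 bytes in flight may then send 1000 more bytes but not 1200.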


                                                                                        Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0 since it has no new ACKs to send to sender
DEADLOCK!

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender.
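The probing idea can be illustrated with a toy simulation. It assumes a simplified model that is not part of the slides: the receive buffer starts full, the list `app_reads` says how many bytes the receiving application drains before each probe arrives, and a probe byte is not stored while the window is 0. The function name is hypothetical.

```python
def probes_until_window_opens(app_reads, rcv_buffer=4):
    """Return how many one-byte probes are sent before RcvWindow > 0.

    app_reads[i] is the number of bytes the receiving application
    drains from the buffer before probe i+1 arrives. Returns None if
    the window is still 0 after all simulated probes.
    """
    buffered = rcv_buffer                       # buffer starts full
    for probes, drained in enumerate(app_reads, start=1):
        buffered = max(0, buffered - drained)
        window = rcv_buffer - buffered          # carried in the probe's ACK
        if window > 0:
            return probes
    return None
```

With `app_reads=[0, 0, 2]` the first two probes are answered with RcvWindow = 0, and the third ACK reports a nonzero window, so the sender can resume. Without probes, no ACK would ever carry that update.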


                                                                                        Note on UDP

                                                                                        UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.


                                                                                        TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket("hostname", "port number");
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


                                                                                        TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that client is synchronized

[Figure: three-way handshake]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
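The three handshake segments can be written out as a small Python sketch. The `Segment` dataclass and the function `three_way_handshake` are illustrative names, not part of any socket API; only the SYN flag and the seq and ack fields are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None   # None means the ACK field is not valid


def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged, in order."""
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]
```

With `client_isn=42` and `server_isn=79`, the SYNACK carries ack=43 and the final ACK carries seq=43, ack=80.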


                                                                                        TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close timeline. client: close, sends FIN; server: replies with ACK, close, sends FIN; client: replies with ACK, timed wait, closed]


                                                                                        TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: connection close timeline. client: closing, sends FIN; server: replies with ACK, closing, sends FIN; client: replies with ACK, timed wait, closed; server: closed]
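The client side of the four closing steps can be sketched as a small state table. The state names (`FIN_WAIT_1`, `FIN_WAIT_2`, `TIME_WAIT`) are the standard TCP state names rather than labels taken from the steps above, and the event strings and the function `run_client_close` are assumptions made for this example.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "app close"): ("FIN_WAIT_1", "send FIN"),
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "send ACK"),
    # A repeated FIN means our ACK was lost: ACK again, keep waiting.
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "send ACK"),
    ("TIME_WAIT", "timer expires"): ("CLOSED", None),
}


def run_client_close(events, state="ESTABLISHED"):
    """Apply events in order; return the final state and segments sent."""
    sent = []
    for event in events:
        state, action = CLIENT_CLOSE[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent
```

Feeding the events `app close`, `recv ACK`, `recv FIN`, `timer expires` walks the client from an established connection to closed, sending one FIN and one ACK. A second `recv FIN` during timed wait produces a second ACK.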


                                                                                        TCP Connection Management (cont)

[Figures: example TCP server lifecycle; example TCP client lifecycle]


                                                                                        A few special cases

                                                                                        Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                        It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                                        Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!


Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  Send rate 0-C/2

  large delays when congested
  maximum achievable throughput


Causes/costs of congestion: scenario 2

  one router, finite buffers
  sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the original data rate and $\lambda'_{in}$ is the rate of original plus retransmitted data.

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels (a), (b), (c)]


Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted!


                                                                                        Approaches towards congestion control

                                                                                        Two broad approaches towards congestion control

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    (small exception: see next page)


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
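A small sketch of how the ER field and the EFCI rule behave along a path. The function names and the representation of each switch as a single supportable rate are assumptions made for illustration; real ATM switches are considerably more involved.

```python
def forward_rm_cell(er, switch_rates):
    """Return the ER value after the RM cell passes each switch in order.

    switch_rates holds the rate each switch on the path can support.
    A switch may lower ER but never raises it, so the result is the
    minimum supportable rate on the path (capped by the initial ER).
    """
    for rate in switch_rates:
        er = min(er, rate)
    return er


def destination_ci_bit(preceding_data_cell_efci):
    """CI bit the destination sets in the RM cell it returns to the source."""
    return 1 if preceding_data_cell_efci == 1 else 0
```

An RM cell sent with ER = 100 across switches supporting 80, 120 and 60 returns with ER = 60. If the data cell just before it arrived with EFCI = 1, the destination also sets CI = 1.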


TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, Congwin, over segments
  Congwin dynamically modified to reflect perceived congestion

[Figure: window of Congwin segments]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
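The throughput formula can be checked numerically with a short sketch. The helper names `window_throughput` and `to_kbps` are hypothetical, and 1 kbit is taken as 1000 bits.

```python
def window_throughput(w, mss_bytes, rtt_seconds):
    """Approximate throughput in bytes/sec for a window of w segments."""
    if rtt_seconds <= 0:
        raise ValueError("RTT must be positive")
    return w * mss_bytes / rtt_seconds


def to_kbps(bytes_per_sec):
    """Convert bytes/sec to kilobits/sec (1 kbit = 1000 bits)."""
    return bytes_per_sec * 8 / 1000
```

For instance, a window of w = 1 segment with MSS = 500 bytes and RTT = 200 msec gives about 2500 bytes/sec, which is 20 kbps. Ten segments of 1000 bytes per 0.5 sec RTT gives 20000 bytes/sec.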


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
  also known as congestion avoidance

[Figure: congestion window versus time for a long-lived TCP connection; window axis marked at 8, 16 and 24 Kbytes]
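The sawtooth produced by AIMD can be reproduced with a few lines of Python. This is a sketch under simplifying assumptions: the window is counted in whole MSS units, one step is one RTT, and the set `loss_rtts` of RTTs in which a loss event occurs is supplied by the caller. The function name is hypothetical.

```python
def aimd_trace(rtts, loss_rtts, start=8):
    """Congestion window (in MSS) at the start of each RTT."""
    congwin = start
    trace = []
    for t in range(rtts):
        trace.append(congwin)
        if t in loss_rtts:
            congwin = max(1, congwin // 2)   # multiplicative decrease
        else:
            congwin += 1                     # additive increase: +1 MSS per RTT
    return trace
```

Starting at 8 MSS with a single loss event in RTT 2, the window climbs 8, 9, 10, is cut to 5, and then resumes its linear climb.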


                                                                                        TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


                                                                                        TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A and Host B timeline. Host A sends one segment in the first RTT, two segments in the second, four segments in the third]
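Why incrementing CongWin once per ACK doubles it every RTT can be seen in a short sketch. It assumes no loss, one ACK per segment, and a window counted in whole MSS units; the function name is hypothetical.

```python
def slow_start_windows(rtts):
    """CongWin (in MSS) at the start of each RTT, assuming no loss."""
    congwin = 1                      # connection begins with CongWin = 1 MSS
    windows = []
    for _ in range(rtts):
        windows.append(congwin)
        acks = congwin               # one ACK per segment sent this RTT
        congwin += acks              # +1 MSS per ACK, so the window doubles
    return windows
```

Over four RTTs this yields windows of 1, 2, 4 and 8 segments, matching the one, two, four segment pattern in the timeline.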


So far:
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
    • Timeout: set threshold=CongWin/2, CongWin=1, and switch to Slow-Start


Reason for treating 3 dup ACKs differently from a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
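The four rules in this summary can be sketched as a small per-RTT simulation. This is only an illustration under stated assumptions: the window is tracked in units of MSS and updated once per RTT, doubling in slow start is capped at Threshold, and the function name, the event labels and the initial threshold value are hypothetical choices, not part of the slides.

```python
def simulate_congwin(events, flavor="reno", initial_threshold=8.0):
    """Trace CongWin (in MSS), one sample per RTT.

    events: one entry per RTT, each None (no loss), "dupack"
            (triple duplicate ACK) or "timeout".
    flavor: "reno" or "tahoe".
    """
    congwin = 1.0
    threshold = initial_threshold
    trace = []
    for event in events:
        trace.append(congwin)
        if event == "timeout" or (event == "dupack" and flavor == "tahoe"):
            # Timeout (or any loss in Tahoe): back to slow start from 1 MSS.
            threshold = max(congwin / 2, 1.0)
            congwin = 1.0
        elif event == "dupack":
            # Reno: halve the window and stay in congestion avoidance.
            threshold = max(congwin / 2, 1.0)
            congwin = threshold
        elif congwin < threshold:
            # Slow start: double per RTT (assumption: capped at Threshold).
            congwin = min(congwin * 2, threshold)
        else:
            # Congestion avoidance: add 1 MSS per RTT.
            congwin += 1.0
    return trace


events = [None] * 5 + ["dupack"] + [None] * 2 + ["timeout"] + [None] * 3
print("Reno :", simulate_congwin(events, "reno"))
print("Tahoe:", simulate_congwin(events, "tahoe"))
```

Running both flavors on the same loss pattern shows the difference described above: after the triple duplicate ACK, Reno continues from half its window while Tahoe restarts from 1 MSS.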


                                                                                        The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
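The event table above maps naturally onto a per-ACK state machine. The sketch below is a minimal illustration, not a real TCP implementation: the class and method names, the MSS value of 1460 bytes, the 64 KB initial threshold and the resetting of the duplicate-ACK counter are all assumptions made for the example.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; illustrative value, not taken from the slides


@dataclass
class Sender:
    congwin: float = MSS
    threshold: float = 64 * 1024  # "initially very large" (assumed value)
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += MSS * (MSS / self.congwin)

    def on_duplicate_ack(self):
        """Duplicate ACK: only count it; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0


def one_rtt(sender):
    """Deliver one new ACK per full segment in the current window."""
    for _ in range(int(sender.congwin // MSS)):
        sender.on_new_ack()


s = Sender()
for _ in range(3):
    one_rtt(s)
print(s.congwin / MSS, s.state)  # window in MSS after three loss-free RTTs
```

Adding 1 MSS per ACK in slow start is what produces the doubling per RTT, because a window of n segments returns n ACKs; in congestion avoidance the same n ACKs add roughly 1 MSS in total.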


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
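The 0.75 factor follows from the window climbing linearly from W/2 back up to W between losses. A minimal numeric check, assuming the window grows by one segment per RTT and W is even (the function name is hypothetical):

```python
def average_sawtooth_throughput(w_segments, rtt_seconds):
    """Mean of the windows W/2, W/2 + 1, ..., W, in segments per second."""
    low = w_segments // 2
    windows = range(low, w_segments + 1)
    return sum(windows) / len(windows) / rtt_seconds


W, RTT = 100, 0.1
print(average_sawtooth_throughput(W, RTT), 0.75 * W / RTT)
```

The two printed values agree because the mean of a linear ramp is the midpoint of its end points, (W/2 + W)/2 = 0.75 W.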


TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This gives $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed!
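Both numbers in this example can be reproduced directly from the stated values. The sketch below assumes the throughput formula 1.22 MSS / (RTT sqrt(L)) with MSS expressed in bits; the function names are hypothetical.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the given throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


W = required_window(10e9, 0.100, 1500)
L = required_loss_rate(10e9, 0.100, 1500)
print(f"W = {W:.0f} segments, L = {L:.2e}")
```

The loss rate comes out at roughly 2 in 10 billion segments, which is why the slide argues that standard TCP cannot realistically fill such a link.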


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
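The convergence argument can be checked numerically: additive increase leaves the difference between the two rates unchanged, while each multiplicative decrease halves it. The sketch below is a simplified model that assumes both sessions see every loss at the same time and share the same RTT; the function name and starting rates are illustrative.

```python
def aimd_two_flows(x1, x2, capacity, rounds=200, increase=1.0):
    """Evolve two AIMD flows sharing one bottleneck of the given capacity."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows cut their rate by a factor of 2.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both flows add the same increment.
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


x1, x2 = aimd_two_flows(5.0, 80.0, capacity=100.0)
print(f"flow 1: {x1:.2f}, flow 2: {x2:.2f}, gap: {abs(x1 - x2):.2f}")
```

Starting from a gap of 75, the gap shrinks toward zero after a handful of loss events, so both flows end up near the equal-share line.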


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
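The parallel-connection example is simple arithmetic once each connection is assumed to get an equal share of the link. A small sketch of that assumption (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    new_connections, assuming every connection gets an equal share."""
    return Fraction(new_connections, existing_connections + new_connections)


print(app_share(9, 1))   # one connection among ten
print(app_share(9, 11))  # eleven connections among twenty
```

With 11 of 20 connections the new app takes slightly more than half of R, which the slide rounds to R/2.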


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object
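The two cases can be folded into one function: the bracketed stall term is only added when it is positive. This is a sketch under one assumption not spelled out on the slide, namely that K, the number of windows covering the object, equals O/(WS) rounded up; the function name and the sample values are illustrative.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) for an object of O bits with a fixed window of W segments.

    S is the segment size in bits, R the link rate in bits per second.
    """
    K = math.ceil(O / (W * S))            # assumed: windows covering the object
    stall = S / R + RTT - W * S / R       # idle time after each window
    latency = 2 * RTT + O / R
    if stall > 0:                         # case 2: server waits for ACKs
        latency += (K - 1) * stall
    return latency


# 100 segments of 1000 bytes over a 1 Mbps link with 100 ms RTT.
for W in (4, 20):
    print(W, fixed_window_latency(O=800_000, S=8_000, R=1e6, RTT=0.1, W=W))
```

With W = 4 the server stalls after every window, while with W = 20 the first ACK returns before the window is exhausted and the latency reduces to 2RTT + O/R.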


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: client/server timeline. Client initiates TCP connection and requests object (one RTT each); server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, then transmission completes and the object is delivered. Axes: time at client, time at server.]
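The slow-start latency formula and the example values can be checked with a short function. This is a sketch, not the authors' code: Q is found by counting the windows after which the idle time S/R + RTT - 2^(k-1) S/R is still positive, K uses the closed form ceil(log2(O/S + 1)), and the RTT of twice S/R in the usage lines is an assumed value chosen so that Q = 2 as in the example.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for an object of O bits under slow start."""
    K = math.ceil(math.log2(O / S + 1))   # windows that cover the object
    Q = 0
    # Count windows k = Q + 1 whose idle time S/R + RTT - 2^(k-1) S/R is positive.
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)                     # times the server actually idles
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# O/S = 15 segments; S/R = 1 s and RTT = 2 s are assumed for illustration.
latency, K, Q, P = slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2.0)
print(f"K = {K}, Q = {Q}, P = {P}, latency = {latency} s")
```

For these inputs the function reproduces K = 4, Q = 2 and P = 2 from the example, with 2 RTT + O/R = 19 s of unavoidable delay plus 3 s of idle time.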


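TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timeline. Client initiates TCP connection and requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, then transmission completes and the object is delivered. Axes: time at client, time at server.]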


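TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.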
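The closed form for K can be cross-checked against the definition it was derived from, by adding up window sizes 1, 2, 4, ... until the object is covered. A minimal sketch with hypothetical function names:

```python
import math


def windows_to_cover(O, S):
    """Smallest k with S * (2^0 + 2^1 + ... + 2^(k-1)) >= O, by direct search."""
    k, sent = 0, 0
    while sent < O:
        sent += 2 ** k * S   # the (k+1)th window carries 2^k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), the last line of the derivation."""
    return math.ceil(math.log2(O / S + 1))


print(windows_to_cover(15, 1), windows_closed_form(15, 1))
```

For O/S = 15 both functions return 4, matching the four windows (1 + 2 + 4 + 8 segments) of the earlier example.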


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
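To compare the three schemes, the transmission and RTT terms of each formula can be evaluated directly. The sketch below deliberately leaves out the "sum of idle times" term, which depends on slow start and differs per scheme, so the results are lower bounds rather than full response times. The function name is hypothetical, and 1 Kbyte is assumed to be 1000 bytes.

```python
def http_response_time_without_idle(O, R, RTT, M, X):
    """Response time in seconds for each scheme, excluding slow-start idle time.

    O: object size in bits, R: link rate in bits per second,
    M: number of images, X: number of parallel connections (M/X an integer).
    """
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }


O_BITS = 5 * 1000 * 8  # 5 Kbytes, assuming 1 Kbyte = 1000 bytes
for rate in (28e3, 100e3, 1e6, 10e6):
    times = http_response_time_without_idle(O_BITS, rate, RTT=0.1, M=10, X=5)
    print(f"{rate:>10.0f} bps", {name: round(t, 2) for name, t in times.items()})
```

At low rates the shared (M+1)O/R term dominates and the schemes differ little; at high rates the RTT terms dominate and the persistent scheme pulls ahead.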


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.


Chapter 3: Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet:
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                        • Chapter 3 Transport Layer last revised 160305
                                                                                        • Chapter 3 outline
                                                                                        • Transport services and protocols
                                                                                        • Transport vs network layer
                                                                                        • Transport-layer protocols
                                                                                        • Chapter 3 outline
                                                                                        • Multiplexingdemultiplexing
                                                                                        • Multiplexingdemultiplexing
                                                                                        • How demultiplexing works
                                                                                        • Connectionless demultiplexing
                                                                                        • Connectionless demux (cont)
                                                                                        • Connection-oriented demux
                                                                                        • Connection-oriented demux (cont)
                                                                                        • Connection-oriented demux Threaded Web Server
                                                                                        • Chapter 3 outline
                                                                                        • UDP User Datagram Protocol [RFC 768]
                                                                                        • UDP more
                                                                                        • UDP checksum
                                                                                        • Chapter 3 outline
                                                                                        • Principles of Reliable data transfer
                                                                                        • Reliable data transfer: getting started
                                                                                        • Reliable data transfer: getting started
                                                                                        • Incremental Improvements
                                                                                        • Rdt1.0: reliable transfer over a reliable channel
                                                                                        • Rdt2.0: channel with bit errors
                                                                                        • rdt2.0: FSM specification
                                                                                        • rdt2.0: operation with no errors
                                                                                        • rdt2.0: error scenario
                                                                                        • rdt2.0 has a fatal flaw
                                                                                        • rdt2.1: sender, handles garbled ACK/NAKs
                                                                                        • rdt2.1: receiver, handles garbled ACK/NAKs
                                                                                        • rdt2.1: discussion
                                                                                        • rdt2.2: a NAK-free protocol
                                                                                        • rdt2.2: sender, receiver fragments
                                                                                        • rdt3.0: channels with errors and loss
                                                                                        • rdt3.0 sender
                                                                                        • rdt3.0 in action
                                                                                        • rdt3.0 in action
                                                                                        • Performance of rdt3.0
                                                                                        • rdt3.0: stop-and-wait operation
                                                                                        • Pipelined protocols
                                                                                        • Pipelined protocols
                                                                                        • Pipelining increased utilization
                                                                                        • Go-Back-N
                                                                                        • GBN Sender
                                                                                        • GBN sender extended FSM
                                                                                        • GBN receiver extended FSM
                                                                                        • More on receiver
                                                                                        • GBN in action
                                                                                        • Selective Repeat
                                                                                        • Selective repeat: sender, receiver windows
                                                                                        • Selective repeat
                                                                                        • Selective repeat in action
                                                                                        • Selective repeat: dilemma
                                                                                        • Chapter 3 outline
                                                                                        • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                        • More TCP Details
                                                                                        • Even More TCP Details
                                                                                        • TCP segment structure
                                                                                        • TCP seq. #'s and ACKs
                                                                                        • TCP Round Trip Time and Timeout
                                                                                        • TCP Round Trip Time and Timeout
                                                                                        • Example RTT estimation
                                                                                        • TCP Round Trip Time and Timeout
                                                                                        • Chapter 3 outline
                                                                                        • TCP reliable data transfer
                                                                                        • TCP sender events
                                                                                        • TCP sender (simplified)
                                                                                        • TCP retransmission scenarios
                                                                                        • TCP retransmission scenarios (more)
                                                                                        • TCP ACK generation [RFC 1122, RFC 2581]
                                                                                        • More on Sender Policies
                                                                                        • Fast Retransmit
                                                                                        • Fast retransmit algorithm
                                                                                        • TCP: GBN or Selective Repeat?
                                                                                        • Chapter 3 outline
                                                                                        • TCP Flow Control
                                                                                        • TCP Flow Control
                                                                                        • TCP segment structure
                                                                                        • TCP Flow control: how it works
                                                                                        • Technical Issue
                                                                                        • Chapter 3 outline
                                                                                        • TCP Connection Management
                                                                                        • TCP Connection Management (cont)
                                                                                        • TCP Connection Management (cont)
                                                                                        • TCP Connection Management (cont)
                                                                                        • TCP Connection Management (cont)
                                                                                        • A few special cases
                                                                                        • Chapter 3 outline
                                                                                        • Principles of Congestion Control
                                                                                        • Causes/costs of congestion: scenario 1
                                                                                        • Causes/costs of congestion: scenario 2
                                                                                        • Causes/costs of congestion: scenario 3
                                                                                        • Causes/costs of congestion: scenario 3
                                                                                        • Approaches towards congestion control
                                                                                        • Case study: ATM ABR congestion control
                                                                                        • Case study: ATM ABR congestion control
                                                                                        • Chapter 3 outline
                                                                                        • TCP Congestion Control
                                                                                        • TCP AIMD
                                                                                        • TCP Slow Start
                                                                                        • TCP Slow Start (more)
                                                                                        • Summary TCP Congestion Control
                                                                                        • The Big Picture
                                                                                        • TCP sender congestion control
                                                                                        • TCP throughput
                                                                                        • TCP Futures
                                                                                        • TCP Fairness
                                                                                        • Why is TCP fair
                                                                                        • Fairness (more)
                                                                                        • TCP Latency Modeling
                                                                                        • Fixed Congestion Window (W)
                                                                                        • Fixed congestion window (1)
                                                                                        • Fixed congestion window (2)
                                                                                        • TCP Latency Modeling Slow Start (1)
                                                                                        • TCP Latency Modeling Slow Start (2)
                                                                                        • TCP Latency Modeling (3)
                                                                                        • TCP Latency Modeling (4)
                                                                                        • HTTP Modeling
                                                                                        • Chapter 3 Summary

Go-Back-N

Sender:
• k-bit seq # in pkt header
• "window" of up to N consecutive unACKed pkts allowed
• ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
  • may receive duplicate ACKs (see receiver)
• only one timer, for the oldest unacknowledged pkt
• timeout(n): retransmit pkt n and all higher seq # pkts in window
• called a sliding-window protocol

GBN Sender

• rdt_send() called: checks whether the window is full
  • No: send out the packet
  • Yes: return the data to the application level
• Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received
  • updates the window accordingly and restarts the timer (or stops it if no unACKed packets remain)
• Timeout: resends ALL packets that have been sent but not yet acknowledged
  • this is the only event that triggers a resend

GBN: sender extended FSM

Single state: Wait

Initial transition (Λ):
    base = 1
    nextseqnum = 1

Event: rdt_send(data)
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

Event: timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

Event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ (no action)
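The sender FSM above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the course's reference code: sequence numbers are unbounded integers (no k-bit wraparound), the timer is reduced to a boolean flag, packets are plain `(seq, data)` tuples with no checksum, and the `udt_send` callback and the `on_ack` / `on_timeout` method names are hypothetical.

```python
class GBNSender:
    """Go-Back-N sender modelled on the extended FSM (illustrative sketch)."""

    def __init__(self, window_size, udt_send):
        self.N = window_size
        self.udt_send = udt_send      # hypothetical callback: hands a packet to the channel
        self.base = 1                 # oldest unACKed sequence number
        self.nextseqnum = 1           # next sequence number to use
        self.sndpkt = {}              # seq -> packet, kept until ACKed
        self.timer_running = False    # stand-in for a real timer

    def rdt_send(self, data):
        """Called from above. Returns False when the window is full (refuse_data)."""
        if self.nextseqnum >= self.base + self.N:
            return False
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        self.udt_send(pkt)
        if self.base == self.nextseqnum:
            self.timer_running = True          # start_timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum):
        """Cumulative ACK: everything up to and including acknum is acknowledged."""
        if acknum < self.base or acknum >= self.nextseqnum:
            return                             # duplicate or out-of-range ACK: ignore
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = acknum + 1
        # stop_timer if nothing is outstanding, otherwise (re)start it
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Resend every packet that has been sent but not yet ACKed."""
        self.timer_running = True              # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(self.sndpkt[seq])
```

The guard at the top of `on_ack` is an addition for safety; the FSM itself simply sets `base = getacknum(rcvpkt)+1`.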

GBN: receiver extended FSM

Single state: Wait

Initial transition (Λ):
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

Event: default
    udt_send(sndpkt)

• If the expected packet is received: send ACK and deliver the packet upstairs
• If an out-of-order packet is received:
  • discard (don't buffer) -> no receiver buffering
  • re-ACK pkt with highest in-order seq #
  • may generate duplicate ACKs

More on receiver

• The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
• Receiver only sends ACKs (no NAKs)
• Can generate duplicate ACKs
• Need only remember expectedseqnum
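A matching receiver can be sketched in a few lines. As before, this is only an illustration under assumptions: sequence numbers are unbounded integers, corruption is passed in as a flag instead of being detected by a checksum, and the `deliver_data` callback is hypothetical. The point it shows is that the only state the receiver keeps is `expectedseqnum` plus the last ACK it sent.

```python
class GBNReceiver:
    """Go-Back-N receiver: no buffering, cumulative ACKs only (illustrative sketch)."""

    def __init__(self, deliver_data):
        self.expectedseqnum = 1
        self.deliver_data = deliver_data   # hypothetical callback to the upper layer
        self.last_ack = 0                  # mirrors sndpkt = make_pkt(0, ACK, chksum)

    def rdt_rcv(self, seq, data, corrupt=False):
        """Process one arriving packet and return the ACK number to send back."""
        if not corrupt and seq == self.expectedseqnum:
            self.deliver_data(data)
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # Default case: out-of-order or corrupt packets are discarded and the
        # ACK for the highest in-order packet is sent again (a duplicate ACK).
        return self.last_ack
```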

GBN in action

• GBN is easy to code but might have performance problems
• In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data
• The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets
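To get a feel for the size of the problem, the bandwidth-delay product can be converted into a packet count. The numbers below (1 Gbps link, 30 ms round-trip time, 1000-byte packets) are assumed purely for illustration, and the function name is hypothetical.

```python
def packets_in_flight(link_bps, rtt_ms, packet_bytes):
    """Approximate number of packets needed to keep the pipe full.

    Uses integer arithmetic: bits that fit in one RTT divided by bits per packet.
    """
    bits_per_rtt = link_bps * rtt_ms // 1000
    return bits_per_rtt // (packet_bytes * 8)


# Assumed example values, not taken from the slides.
window = packets_in_flight(1_000_000_000, 30, 1000)

# If the oldest packet of a full window is lost, GBN resends the whole window,
# whereas Selective Repeat resends only the one missing packet.
gbn_resent = window
sr_resent = 1
```

With these assumed values the window works out to 3750 packets, so a single loss at the front of a full window costs GBN thousands of retransmissions.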

Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
  • compare to GBN, which only had a timer for the base packet
• sender window
  • N consecutive seq #'s
  • again limits seq #s of sent, unACKed pkts
  • Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender
data from above:
• if next available seq # in window, send pkt
timeout(n):
• resend pkt n, restart timer
ACK(n) in [sendbase, sendbase+N-1]:
• mark pkt n as received
• if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]:
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N, rcvbase-1]:
• ACK(n) (note: this is a re-ACK)
otherwise:
• ignore
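The receiver side of these events can be sketched as follows. This is an illustration under assumptions, not a full protocol: sequence numbers are unbounded integers starting at 0 (so the wraparound issue is deliberately left out), and the `deliver_data` callback and `on_packet` method name are hypothetical.

```python
class SRReceiver:
    """Selective Repeat receiver: buffers out-of-order packets (illustrative sketch)."""

    def __init__(self, window_size, deliver_data):
        self.N = window_size
        self.rcvbase = 0                   # smallest not-yet-received sequence number
        self.buffer = {}                   # seq -> data for received, undelivered packets
        self.deliver_data = deliver_data   # hypothetical callback to the upper layer

    def on_packet(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            # Inside the window: buffer it, then deliver any in-order run.
            self.buffer.setdefault(n, data)
            while self.rcvbase in self.buffer:
                self.deliver_data(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1          # advance window to next missing packet
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            # Already delivered: the earlier ACK may have been lost, so re-ACK.
            return n
        return None                        # otherwise: ignore
```

The re-ACK branch matters because the sender cannot advance its window until it has seen an ACK for its base packet, even if the receiver delivered that packet long ago.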

                                                                                          Selective repeat in action

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3

• receiver sees no difference in the two scenarios
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
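The ambiguity can be checked mechanically. The sketch below (function name hypothetical) compares the sequence numbers of one full window, which the sender may still retransmit if all its ACKs are lost, with the receiver's next window after it has accepted that whole window. Any overlap means a retransmission can be mistaken for new data. The commonly stated rule, given here as a known result rather than something derived on the slide, is that the window size must be at most half the sequence-number space.

```python
def new_window_overlaps_old(seq_space, window):
    """True if the receiver's next window reuses a sequence number that the
    sender might still retransmit from the previous window."""
    old = {i % seq_space for i in range(window)}               # first window sent
    new = {i % seq_space for i in range(window, 2 * window)}   # receiver's next window
    return bool(old & new)


# The slide's example: 4 sequence numbers, window size 3.
ambiguous = new_window_overlaps_old(4, 3)   # old = {0, 1, 2}, new = {3, 0, 1}
safe = not new_window_overlaps_old(4, 2)    # old = {0, 1},    new = {2, 3}
```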

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point: one sender, one receiver
• reliable, in-order byte stream: no "message boundaries"
• pipelined: TCP congestion and flow control set window size
• send & receive buffers
• full duplex data: bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled: sender will not overwhelm receiver

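[Figure: on the sending host, the application writes data through the socket door into the TCP send buffer; segments carry the data to the receiving host's TCP receive buffer, from which the application reads data through its socket door.]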

More TCP Details

Maximum Segment Size (MSS)
• depends upon the implementation (can often be set)
• the max amount of application-layer data in a segment
• application data + TCP header = TCP segment

Three-way handshake
• client sends a special TCP segment to the server requesting a connection; no payload (application data) in this segment
• server responds with a second special TCP segment (again, no payload)
• client responds with a third special segment; this one can contain payload
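As a small worked example of "application data + TCP header = TCP segment", the sketch below splits an application message into MSS-sized payloads and reports the resulting segment sizes. The 20-byte header (a header without options) and the MSS of 1460 bytes used in the example are assumptions chosen for illustration, and the function name is hypothetical.

```python
TCP_HEADER_BYTES = 20  # assumption: TCP header with no options


def segment_sizes(app_bytes, mss):
    """Return a list of (payload_bytes, segment_bytes) pairs for app_bytes of data."""
    sizes = []
    remaining = app_bytes
    while remaining > 0:
        payload = min(mss, remaining)          # never more than MSS of application data
        sizes.append((payload, payload + TCP_HEADER_BYTES))
        remaining -= payload
    return sizes


# 3000 bytes of application data with an assumed MSS of 1460 bytes.
example = segment_sizes(3000, 1460)
```

Note that the MSS limits the application data only; the header comes on top of it.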

Even More TCP Details

• A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process
• TCP exists only in the two end machines
  • no buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

Fields in order; each row is 32 bits wide:
    source port #  |  dest port #
    sequence number
    acknowledgement number
    head len  |  not used  |  U A P R S F  |  receive window
    checksum  |  urg data pointer
    options (variable length)
    application data (variable length)

Notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
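The fixed part of this layout can be packed and parsed with Python's standard `struct` module. The sketch below assumes a 20-byte header with no options and leaves the checksum at zero instead of computing it; the helper names `build_header` and `parse_header` are hypothetical.

```python
import struct

# Network byte order: two 16-bit ports, 32-bit seq, 32-bit ack,
# one byte holding the header length, one byte of flags,
# then 16-bit receive window, checksum and urgent data pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def build_header(src_port, dst_port, seq, ack, flags, window, checksum=0, urg_ptr=0):
    """Pack a 20-byte TCP header (no options; checksum is not computed here)."""
    flag_byte = 0
    for name in flags:
        flag_byte |= FLAG_BITS[name]
    # Header length is counted in 32-bit words and stored in the upper 4 bits.
    offset_byte = (TCP_HEADER.size // 4) << 4
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_byte, flag_byte, window, checksum, urg_ptr)


def parse_header(raw):
    """Unpack the first 20 bytes of a TCP segment into a dictionary."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urg_ptr) = TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len_bytes": (offset_byte >> 4) * 4,
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }
```

For example, `parse_header(build_header(12345, 80, 42, 79, {"ACK"}, 65535))` round-trips the field values passed in.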

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data
ACKs:
• seq # of next byte expected from other side
• cumulative ACK
Q: how does the receiver handle out-of-order segments?
• A: TCP spec doesn't say; up to implementer

Simple telnet scenario (time runs downward):
1. User types 'C'. Host A -> Host B: Seq=42, ACK=79, data = 'C'
2. Host B ACKs receipt of 'C', echoes back 'C'. Host B -> Host A: Seq=79, ACK=43, data = 'C'
3. Host A ACKs receipt of echoed 'C'. Host A -> Host B: Seq=43, ACK=80
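The numbers in the telnet scenario follow directly from the two rules above, and a short simulation makes that explicit. The `Endpoint` class below is a hypothetical teaching aid, not a TCP implementation: it tracks only the next byte to send and the next byte expected, and it silently drops anything that is not in order.

```python
class Endpoint:
    """Tracks just enough state to compute Seq and ACK fields (illustrative sketch)."""

    def __init__(self, send_next, recv_next):
        self.send_next = send_next   # seq # of the next byte this side will send
        self.recv_next = recv_next   # seq # of the next byte expected from the peer

    def send(self, data=b""):
        """Build a segment; Seq is the number of the first data byte."""
        segment = {"seq": self.send_next, "ack": self.recv_next, "data": data}
        self.send_next += len(data)
        return segment

    def receive(self, segment):
        """Accept in-order data and advance the cumulative ACK point."""
        if segment["seq"] == self.recv_next:
            self.recv_next += len(segment["data"])


# Starting values taken from the scenario: A starts at 42, B starts at 79.
host_a = Endpoint(send_next=42, recv_next=79)
host_b = Endpoint(send_next=79, recv_next=42)

first = host_a.send(b"C")     # user types 'C'
host_b.receive(first)
second = host_b.send(b"C")    # B ACKs 'C' and echoes it back
host_a.receive(second)
third = host_a.send()         # A ACKs the echoed 'C' (no data)
```

Each ACK value is simply the receiver's `recv_next` at the moment the segment is built, which is why one byte of data moves the ACK forward by exactly one.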

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
• longer than RTT
  • but RTT varies
• too short: premature timeout
  • unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions
• SampleRTT will vary; want estimated RTT "smoother"
  • average several recent measurements, not just current SampleRTT


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

- Exponential weighted moving average
- influence of past sample decreases exponentially fast
- typical value: α = 0.125
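To see why the influence of a past sample decays exponentially, the recurrence can be unrolled. In the sketch below, E_n and S_n are shorthand introduced here (not in the slides) for EstimatedRTT and SampleRTT after the n-th measurement, and E_0 is the initial estimate.

```latex
\begin{align*}
E_n &= (1-\alpha)\,E_{n-1} + \alpha\,S_n \\
    &= \alpha\,S_n + \alpha(1-\alpha)\,S_{n-1} + (1-\alpha)^2\,E_{n-2} \\
    &= \alpha \sum_{k=0}^{n-1} (1-\alpha)^k\,S_{n-k} \;+\; (1-\alpha)^n\,E_0
\end{align*}
```

A sample taken k measurements ago therefore carries weight α(1 - α)^k. With α = 0.125 the decay factor is 0.875 per measurement, so a sample five measurements old counts for roughly 0.875^5 ≈ 0.51 of the weight of the newest one.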


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. SampleRTT and EstimatedRTT in milliseconds (axis from 100 to 350) plotted against time in seconds (axis from 1 to 106).]


TCP Round Trip Time and Timeout

Setting the timeout
- EstimatedRTT plus "safety margin"
  - large variation in EstimatedRTT -> larger safety margin
- first estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

  (typically, β = 0.25)

Then set timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
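The three formulas above can be combined into one small estimator. This is a minimal sketch, not a real TCP stack: the class name `RttEstimator` and its methods are hypothetical. Two choices are assumptions the slides do not state and are borrowed from RFC 6298: the first sample seeds DevRTT with half its value, and DevRTT is updated before EstimatedRTT.

```python
class RttEstimator:
    """Tracks EstimatedRTT, DevRTT and TimeoutInterval (hypothetical helper)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        # Assumption (RFC 6298 convention): seed the deviation with sample / 2.
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        """Fold in one SampleRTT (never one measured on a retransmission)."""
        # Assumption: update DevRTT first, using the old EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        """EstimatedRTT plus a safety margin of four deviations."""
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
est.update(100.0)
est.update(200.0)                        # one slow sample widens the margin
```

A single 200 ms sample moves EstimatedRTT only from 100 to 112.5 ms, but it raises DevRTT enough to push the timeout from 250 to 325 ms, which is the "larger safety margin" behaviour described above.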


TCP reliable data transfer

- TCP creates rdt service on top of IP's unreliable service
- Pipelined segments
- Cumulative acks
- TCP uses single retransmission timer
- Retransmissions are triggered by:
  - timeout events
  - duplicate acks
- Initially consider simplified TCP sender:
  - ignore duplicate acks
  - ignore flow control, congestion control


TCP sender events

data rcvd from app:
- Create segment with seq #
- seq # is byte-stream number of first data byte in segment
- start timer if not already running (think of timer as for oldest unacked segment)
- expiration interval: TimeOutInterval

timeout:
- retransmit segment that caused timeout
- restart timer

Ack rcvd:
- If acknowledges previously unacked segments
  - update what is known to be acked
  - start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
      switch (event)

        event: data received from application above
          create TCP segment with sequence number NextSeqNum
          if (timer currently not running)
            start timer
          pass segment to IP
          NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
          retransmit not-yet-acknowledged segment with smallest sequence number
          start timer

        event: ACK received, with ACK field value of y
          if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
              start timer
          }
    } /* end of loop forever */

Comment:
- SendBase-1: last cumulatively ack'ed byte
Example:
- SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
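The simplified sender above can be exercised as a small event-driven object. This is a sketch under stated assumptions: the class `SimplifiedTcpSender`, its `sent` log (standing in for "pass segment to IP") and the boolean `timer_running` (standing in for a real timer) are all hypothetical names. Like the slide, it ignores duplicate ACKs, flow control and congestion control.

```python
class SimplifiedTcpSender:
    """Event handlers mirroring the three events of the simplified sender."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # -> data still awaiting an ACK
        self.timer_running = False   # stand-in for the single retransmission timer
        self.sent = []               # log of (seq #, data) handed to "IP"

    def on_app_data(self, data):
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout. Resend the oldest unacked segment only."""
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True    # restart timer

    def on_ack(self, y):
        """Event: ACK received with ACK field value y (cumulative)."""
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte is below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Timer keeps running only while something is outstanding.
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"A" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"B" * 20)    # Seq=100, 20 bytes
sender.on_ack(120)               # one cumulative ACK covers both segments
```

The final call shows the cumulative-ACK behaviour: a single ACK=120 advances SendBase past both segments even if an earlier ACK=100 never arrived.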


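TCP retransmission scenarios

[Figure, lost ACK scenario: Host A sends Seq=92, 8 bytes data. Host B replies ACK=100, but the ACK is lost (X). The Seq=92 timer expires, so A retransmits Seq=92, 8 bytes data, and B replies ACK=100 again. SendBase = 100.]

[Figure, premature timeout: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B replies ACK=100 and ACK=120, but the Seq=92 timer expires before they arrive, so A retransmits Seq=92, 8 bytes data, and B replies ACK=120. SendBase = 100, then SendBase = 120.]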


TCP retransmission scenarios (more)

[Figure, cumulative ACK scenario: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted. SendBase = 120.]


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap.
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.
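The four event/action pairs above can be read as a decision procedure. The sketch below is illustrative only: the class `AckPolicy` and the action labels `"delay_ack"`, `"ack_now"` and `"dup_ack"` are hypothetical, the 500 ms wait is represented by a separate `on_delayed_ack_timer` call, and it assumes segments never partially overlap.

```python
class AckPolicy:
    """Decides which ACK action a receiver takes for each arriving segment."""

    def __init__(self, expected_seq):
        self.expected_seq = expected_seq   # seq # of next in-order byte
        self.ack_pending = False           # True while a delayed ACK is waiting
        self.out_of_order = {}             # seq # -> length of buffered segments

    def on_segment(self, seq, length):
        if seq != self.expected_seq:
            # Out-of-order (or stale) segment: duplicate ACK right away.
            if seq > self.expected_seq:
                self.out_of_order[seq] = length   # gap detected, buffer it
            self.ack_pending = False
            return ("dup_ack", self.expected_seq)

        had_gap = bool(self.out_of_order)
        self.expected_seq += length
        # Absorb buffered segments that are now contiguous.
        while self.expected_seq in self.out_of_order:
            self.expected_seq += self.out_of_order.pop(self.expected_seq)

        if had_gap or self.ack_pending:
            # Fills lower end of a gap, or a second in-order segment: ACK now.
            self.ack_pending = False
            return ("ack_now", self.expected_seq)

        # First in-order segment with nothing pending: delay the ACK.
        self.ack_pending = True
        return ("delay_ack", self.expected_seq)

    def on_delayed_ack_timer(self):
        """Called when the (up to 500 ms) delayed-ACK wait expires."""
        if self.ack_pending:
            self.ack_pending = False
            return ("ack_now", self.expected_seq)
        return None


rcvr = AckPolicy(expected_seq=92)
actions = [rcvr.on_segment(92, 8), rcvr.on_segment(100, 20),
           rcvr.on_segment(140, 20), rcvr.on_segment(120, 20)]
```

In this run the four segments trigger, in order, a delayed ACK, an immediate cumulative ACK=120, a duplicate ACK=120 for the out-of-order segment, and an immediate ACK=160 once the gap is filled.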


More on Sender Policies

Doubling the Timeout Interval
- Used by most TCP implementations
- If timeout occurs, then after retransmission the Timeout Interval is doubled
- Intervals grow exponentially with each consecutive timeout
- When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
- Limited form of congestion control
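The doubling rule can be sketched as a tiny state holder. The class `BackoffTimer` and its method names are hypothetical; `base_interval` stands for the value computed from EstimatedRTT and DevRTT, which is assumed to be supplied from elsewhere.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, base_interval):
        self.base_interval = base_interval   # EstimatedRTT + 4 * DevRTT
        self.interval = base_interval

    def on_timeout(self):
        """After retransmitting, double the interval."""
        self.interval *= 2
        return self.interval

    def on_new_data_or_ack(self, base_interval=None):
        """Timer restarted by new data or an ACK: fall back to the computed value."""
        if base_interval is not None:
            self.base_interval = base_interval
        self.interval = self.base_interval
        return self.interval


timer = BackoffTimer(base_interval=1.0)            # seconds
growth = [timer.on_timeout() for _ in range(3)]    # consecutive timeouts
timer.on_new_data_or_ack()                         # back to the computed interval
```

Three consecutive timeouts stretch a 1 s interval to 2, 4 and then 8 s, so a sender facing repeated loss retransmits less and less often. That is why the slide calls it a limited form of congestion control.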


Fast Retransmit

- Time-out period often relatively long:
  - long delay before resending lost packet
- Detect lost segments via duplicate ACKs
  - Sender often sends many segments back-to-back
  - If segment is lost, there will likely be many duplicate ACKs
- If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  - fast retransmit: resend segment before timer expires


Fast retransmit algorithm

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
      else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
          resend segment with sequence number y    /* fast retransmit */
        }
      }
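The ACK handler above translates almost line for line into code. In this sketch the class `FastRetransmitSender`, the returned labels and the `retransmitted` list (standing in for actually resending a segment) are hypothetical. Timer handling is left out to keep the focus on duplicate-ACK counting.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handler that counts duplicate ACKs per ACK value y."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()    # y -> number of duplicate ACKs seen for y
        self.retransmitted = []      # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance the window.
            self.send_base = y
            return "new_ack"
        # Otherwise: a duplicate ACK for an already ACKed segment.
        self.dup_acks[y] += 1
        if self.dup_acks[y] == 3:
            # Resend the segment starting at y before its timer expires.
            self.retransmitted.append(y)
            return "fast_retransmit"
        return "dup_ack"


sender = FastRetransmitSender(send_base=100)
results = [sender.on_ack(120) for _ in range(5)]
```

The first ACK=120 is new; the next three are duplicates, and the third duplicate triggers the retransmission of the segment starting at byte 120. A fourth duplicate does not trigger another resend because the test is for a count of exactly 3.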


TCP: GBN or Selective Repeat?

                                                                                          Basic TCP looks a lot like GBN

                                                                                          Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                          This looks a lot like Selective Repeat

                                                                                          TCP is a hybrid


TCP Flow Control

- Sender should not overwhelm receiver's capacity to receive data
- If necessary, sender should slow down transmission rate to accommodate receiver's rate
- Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

- receive side of TCP connection has a receive buffer
- app process may be slow at reading from buffer
- speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom:
 - source port #, dest port #
 - sequence number
 - acknowledgement number
 - head len, not used, flag bits U A P R S F, Receive window
 - checksum, Urg data pnter
 - Options (variable length)
 - application data (variable length)]

Field notes:
- sequence number, acknowledgement number: counting by bytes of data (not segments)
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
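The fixed 20-byte part of the layout above can be unpacked with Python's standard `struct` module. This is a sketch for illustration: the function `parse_tcp_header` and the dictionary keys are hypothetical names, and options and checksum verification are not handled. The flag bit positions follow the TCP header definition, with FIN as the lowest bit.

```python
import struct

# Flag bits in the order F S R P A U (FIN is the least significant bit).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(raw):
    """Decode the first 20 bytes of a TCP segment into a dict."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,                              # byte-stream number of first data byte
        "ack": ack,                              # next byte expected from the other side
        "header_len": (offset_byte >> 4) * 4,    # head len is counted in 32-bit words
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "rcv_window": window,                    # bytes the receiver is willing to accept
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


# Build a sample header: ports 1234 -> 80, Seq=42, ACK=79, PSH+ACK set.
sample = struct.pack("!HHIIBBHHH", 1234, 80, 42, 79, 5 << 4, 0x18, 4096, 0, 0)
header = parse_tcp_header(sample)
```

The `!` prefix selects network (big-endian) byte order, which is what TCP uses on the wire.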


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

- spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
  - guarantees receive buffer doesn't overflow
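The arithmetic above is short enough to check by hand. In the sketch below, both function names are hypothetical, and the sender-side variables `last_byte_sent` and `last_byte_acked` are assumptions introduced to express "unACKed data"; the slide itself only names the receiver-side variables.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room = RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sendable_bytes(last_byte_sent, last_byte_acked, window):
    """Sender side: how much more may be sent while keeping unACKed data <= RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, window - unacked)


# Receiver: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender: 1000 bytes in flight, so it may send this many more.
room = sendable_bytes(last_byte_sent=5000, last_byte_acked=4000, window=window)
```

Here 2000 bytes sit unread in the buffer, so the advertised window is 2096 bytes; with 1000 bytes already in flight, the sender may transmit 1096 more.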


Technical Issue

- Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer
- Sender does not transmit new packets until it hears RcvWindow > 0
- Receiver never sends RcvWindow > 0, since it has no new ACKs to send to sender
- DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to sender.
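The one-byte probing can be pictured as a loop that keeps asking until the window opens. This is a toy sketch: the function `probe_until_open` is hypothetical, and the list of window values stands in for the RcvWindow field carried in the ACK to each successive probe. A real sender would also space the probes out with a timer.

```python
def probe_until_open(advertised_windows):
    """Send one-byte probes until an ACK advertises RcvWindow > 0.

    Returns (number of probes sent, first non-zero window seen).
    """
    probes = 0
    for window in advertised_windows:
        probes += 1          # one-byte segment sent; its ACK carries `window`
        if window > 0:
            return probes, window
    return probes, 0         # window never opened in the observed ACKs


# The receiving app drains its buffer just before the third probe is ACKed.
result = probe_until_open([0, 0, 512])
```

Without the probes there would be no ACKs at all in this exchange, so the sender would never learn that the window had reopened to 512 bytes.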


Note on UDP

- UDP has no flow control
- UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
- initialize TCP variables:
  - seq #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
    Socket clientSocket = new Socket("hostname", portNumber);
- server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
- specifies client_isn, the initial seq #
- No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
- ACKs received SYN
- allocates buffers
- Replies with client_isn+1 in ACK field to signal synchronization
- Specifies server_isn
- No application data


TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
- Allocates buffers
- Can include application data
- SYN=0 signals that connection is established
- server_isn+1 signals that client is synchronized

[Figure: three-way handshake timeline.
 client -> server: Connection request (SYN=1, seq=client_isn)
 server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
 client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)]
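The sequence and acknowledgement numbers in the three handshake segments follow mechanically from the two initial sequence numbers. The sketch below only models those numbers: the `Segment` dataclass and the function `three_way_handshake` are hypothetical, and no real sockets or packets are involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    syn: int                     # SYN flag: 1 during setup, 0 afterwards
    seq: int                     # sequence number
    ack: Optional[int] = None    # ACK field; None when the segment carries no ACK


def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged, in order."""
    # Step 1: client -> server, connection request.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server -> client, connection granted; ACKs the SYN.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client -> server, ACKs the SYNACK; SYN is now 0.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
```

With client_isn = 1000 and server_isn = 5000, the SYNACK carries ack = 1001 and the final ACK carries seq = 1001, ack = 5001. Each side acknowledges the other's initial sequence number plus one.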


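TCP Connection Management (cont)

Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close timeline. client -> server: FIN (client: close); server -> client: ACK; server -> client: FIN (server: close); client -> server: ACK; client enters timed wait, then closed.]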

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.
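The four closing steps can be replayed as a small script. This is only a sketch: the function name, the state strings, and the log format are made up for illustration, and no real socket API is involved.

```python
# Hypothetical replay of the four-step close described in the slides.
# State names follow the figure labels ("timed wait", "closed").

def close_connection():
    """Return the segments exchanged and the final state of each side."""
    log = []
    client, server = "established", "established"

    # Step 1: client sends FIN.
    log.append(("client", "FIN"))
    client = "closing"

    # Step 2: server ACKs the FIN, then sends its own FIN.
    log.append(("server", "ACK"))
    log.append(("server", "FIN"))
    server = "closing"

    # Step 3: client ACKs the server's FIN and enters timed wait.
    log.append(("client", "ACK"))
    client = "timed wait"

    # Step 4: server receives the ACK and closes.
    server = "closed"

    # Once the timed wait expires, the client closes as well.
    client = "closed"
    return log, client, server


log, client_state, server_state = close_connection()
for sender, segment in log:
    print(f"{sender:>6} -> {segment}")
```

The sketch ignores lost segments; the timed wait exists precisely so the client can re-ACK a retransmitted FIN.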

[Figure: client/server close timeline: FIN, ACK, FIN, ACK; both sides closing, client in timed wait, then both closed.]

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions, link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when packet dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next slide

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded":
    sender should use available bandwidth
  if sender's path congested:
    sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
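The ER and EFCI rules can be sketched in a few lines. The class, the function names, and the rate values below are hypothetical, chosen only to show that the returned ER ends up as the minimum supportable rate on the path and that the destination sets CI when the preceding data cell carried EFCI=1.

```python
class RMCell:
    """Hypothetical stand-in for an ATM resource-management cell."""

    def __init__(self, er):
        self.er = er      # explicit rate field
        self.ci = False   # congestion indication bit
        self.ni = False   # no-increase bit


def pass_through_switch(cell, supportable_rate):
    """A congested switch may lower ER; it never raises it."""
    cell.er = min(cell.er, supportable_rate)
    return cell


def return_from_destination(cell, last_data_cell_efci):
    """Destination sets CI=1 if the preceding data cell had EFCI=1."""
    if last_data_cell_efci:
        cell.ci = True
    return cell


cell = RMCell(er=100.0)            # sender's requested rate (made-up units)
for rate in (80.0, 45.0, 60.0):    # assumed per-switch supportable rates
    pass_through_switch(cell, rate)
return_from_destination(cell, last_data_cell_efci=True)
print(cell.er, cell.ci, cell.ni)
```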


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: window of Congwin segments]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot MSS}{RTT}$ bytes/sec
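As a quick check of the relation between window size and rate (w segments of MSS bytes per RTT), here is a minimal sketch. The function name and the sample values are assumptions for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent each RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical values: 10 segments of 1460 bytes with a 100 ms RTT.
rate = window_throughput(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec")
```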

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$
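The sender-side limit can be read as a budget: the bytes in flight may not exceed CongWin. A minimal sketch, with a hypothetical helper name and made-up byte counts:

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes may be sent without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


print(bytes_allowed(last_byte_sent=5000, last_byte_acked=3000, cong_win=4000))  # 2000
print(bytes_allowed(last_byte_sent=7000, last_byte_acked=3000, cong_win=4000))  # 0
```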

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
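The sawtooth shape of AIMD can be reproduced with a short simulation. This is a sketch under simplifying assumptions: the window is counted in MSS units, one step is one RTT, and the rounds in which a loss occurs are supplied by hand rather than derived from any network model.

```python
def aimd_trace(rounds, loss_rounds, mss=1, start=8):
    """CongWin (in MSS units) per RTT under additive increase, multiplicative decrease."""
    cong_win = start
    trace = []
    for rtt in range(rounds):
        trace.append(cong_win)
        if rtt in loss_rounds:
            cong_win = max(mss, cong_win / 2)   # multiplicative decrease
        else:
            cong_win += mss                     # additive increase
    return trace


trace = aimd_trace(rounds=12, loss_rounds={4, 9})
print(trace)
```

Each loss halves the window, after which it climbs again by one MSS per RTT.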

TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast
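A small sketch shows why incrementing CongWin once per ACK doubles it every RTT. It assumes every segment is ACKed within the RTT and reuses the MSS = 500 bytes, RTT = 200 msec example to convert windows to rates; the function and variable names are illustrative.

```python
MSS_BITS = 500 * 8   # MSS = 500 bytes
RTT_MSEC = 200       # RTT = 200 msec


def slow_start(rtts):
    """Window size (in segments) at the start of each RTT, with no losses."""
    cong_win = 1
    sizes = []
    for _ in range(rtts):
        sizes.append(cong_win)
        acks = cong_win          # one ACK per segment sent in this RTT
        for _ in range(acks):
            cong_win += 1        # increment CongWin by 1 MSS per ACK
    return sizes


windows = slow_start(5)
# bits per millisecond is the same as kbps
rates_kbps = [w * MSS_BITS / RTT_MSEC for w in windows]
print(windows)      # [1, 2, 4, 8, 16]
print(rates_kbps)   # [20.0, 40.0, 80.0, 160.0, 320.0]
```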

[Figure: Host A / Host B timeline: one segment in the first RTT, then two segments, then four segments]

So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
    Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                          The Big Picture

TCP sender congestion control

Event | State | TCP Sender Action | Commentary
------|-------|-------------------|-----------
ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start
Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
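The event table maps fairly directly onto an event-driven class. The sketch below is one possible reading, not a real TCP stack: the class and method names are invented, the initial threshold value is an assumption (the slides only say "very large"), resetting the duplicate-ACK counter on a new ACK is an assumption, and the "Duplicate ACK" and "triple duplicate ACK" rows are folded into a single handler.

```python
class RenoSender:
    """Sketch of the sender-side event table; window sizes are in bytes."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.cong_win = mss          # connection starts at 1 MSS
        self.threshold = threshold   # assumed initial value
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0            # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
after_acks = (s.state, s.cong_win)
for _ in range(3):
    s.on_dup_ack()
after_dups = (s.state, s.cong_win, s.threshold)
s.on_timeout()
after_timeout = (s.state, s.cong_win, s.threshold)
print(after_acks, after_dups, after_timeout)
```

Four ACKs push the window past the threshold into congestion avoidance, three duplicate ACKs halve it without leaving that state, and a timeout drops it back to 1 MSS in slow start.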

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
When window is W, throughput is $W/RTT$
Just after loss, window drops to $W/2$, throughput to $W/(2 \cdot RTT)$
Average throughput: $0.75\,W/RTT$
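The 0.75 factor follows from averaging a window that climbs linearly from W/2 to W. A numeric check, assuming one window value per RTT and an even W (the function name is illustrative):

```python
def average_window(w):
    """Mean window over one sawtooth cycle rising linearly from w/2 to w."""
    steps = list(range(w // 2, w + 1))   # one window value per RTT
    return sum(steps) / len(steps)


W = 100
print(average_window(W) / W)   # 0.75
```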

TCP Futures

Example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow
New versions of TCP for high-speed needed
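Both numbers in this example can be reproduced with a few lines of arithmetic. The sketch assumes the commonly quoted approximation throughput = 1.22 · MSS / (RTT · √L) and solves it for the loss rate L; the variable names are illustrative.

```python
mss_bits = 1500 * 8     # 1500-byte segments
rtt = 0.1               # 100 ms
target_bps = 10e9       # 10 Gbps

# Window needed to keep the pipe full: throughput * RTT / MSS
window_segments = target_bps * rtt / mss_bits

# Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L
loss_rate = (1.22 * mss_bits / (rtt * target_bps)) ** 2

print(int(window_segments))   # 83333
print(f"{loss_rate:.2e}")
```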

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
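The convergence argument can be checked numerically. In this sketch two flows start with very unequal rates; both add 1 per round while the link has room and both halve when the combined rate exceeds capacity. Synchronized losses, the starting rates, and the capacity are simplifying assumptions.

```python
def share_link(x1, x2, capacity, rounds):
    """Two AIMD flows on one bottleneck; both halve when the link overflows."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2     # multiplicative decrease on loss
        else:
            x1, x2 = x1 + 1, x2 + 1     # additive increase, slope 1
    return x1, x2


x1, x2 = share_link(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(abs(x1 - x2))
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts it in half, so the gap shrinks toward zero.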

Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets more than R/2 (11 of the 20 connections)
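The parallel-connection example is plain per-connection sharing. A short sketch, assuming every connection on the link gets an equal share (the function name is made up):

```python
from fractions import Fraction


def app_share(existing, new_connections):
    """Fraction of R the new app gets if every connection gets an equal share."""
    return Fraction(new_connections, existing + new_connections)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```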

TCP Latency Modeling

Notation, assumptions
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Fixed Congestion Window (W)

Two cases:

1. $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent
   $\text{Latency} = 2\,RTT + O/R$

2. $WS/R < RTT + S/R$: ACK for first segment in window returns after window's worth of data sent
   $\text{Latency} = 2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$

Fixed congestion window (1)

First case:
$WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent

$\text{latency} = 2\,RTT + O/R$

                                                                                          3 Transport Layer 121Comp 361 Spring 2005

Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server must wait for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
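
The two fixed-window cases can be checked numerically with a small sketch. It assumes O is the object size in bits, S the segment size in bits, R the link rate in bits per second, and that K is the number of windows covering the object, taken here as ceil(O / (W·S)). The function name and that definition of K are illustrative assumptions, not part of the slides.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds for an O-bit object with a fixed window of W segments."""
    base = 2 * RTT + O / R               # connection setup and request, plus transmission
    if W * S / R >= RTT + S / R:
        return base                      # case 1: ACK returns before the window is used up
    K = math.ceil(O / (W * S))           # assumption: number of windows covering the object
    stall = S / R + RTT - W * S / R      # wait after each of the first K-1 windows
    return base + (K - 1) * stall        # case 2: server stalls K-1 times

# Example: S/R = 1 s, RTT = 2 s, object of 8 segments
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))  # case 1
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))  # case 2
```

With W = 4 the window is large enough to keep the link busy, so only the 2RTT + O/R term remains; with W = 2 the server stalls after each of the first K-1 windows.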


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.


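TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server timeline. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]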
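
The final line of the delay derivation relies on a geometric series. Spelled out, under the assumption that the bracketed idle time is positive for each of the first P windows (so the [·]^+ can be dropped):

$$\sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] = P\left[RTT + \frac{S}{R}\right] - \frac{S}{R}\sum_{k=1}^{P} 2^{k-1} = P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

since $\sum_{k=1}^{P} 2^{k-1} = 2^{P} - 1$.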


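TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, the number of idles for an infinite-size object, is similar.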
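
As a rough check of the slow-start model, the sketch below computes K from the definition above and sums the positive idle times directly instead of using a closed form for Q. The function names are hypothetical, and the values RTT = 2 s and S/R = 1 s are assumptions chosen so that the server idles twice, as in the O/S = 15 example.

```python
import math

def windows_to_cover(O, S):
    """K = smallest k with 2^k - 1 >= O/S, found with integer arithmetic."""
    segments = math.ceil(O / S)
    k = 0
    while 2 ** k - 1 < segments:
        k += 1
    return k

def slow_start_latency(O, S, R, RTT):
    """Return (K, P, latency) under slow start with no threshold and no loss."""
    K = windows_to_cover(O, S)
    # Idle time after window k (k = 1..K-1); the server only idles when this is positive.
    idle = [S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, K)]
    positive = [t for t in idle if t > 0]
    P = len(positive)                    # number of times the server idles
    latency = 2 * RTT + O / R + sum(positive)
    return K, P, latency

K, P, latency = slow_start_latency(O=15000, S=1000, R=1000, RTT=2)
print(K, P, latency)
```

Because the idle times shrink as k grows, the positive ones are exactly the first P, so the summed result agrees with the closed-form expression 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R.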


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
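
The three response-time formulas can be compared with a short sketch. It deliberately leaves out the "sum of idle times" term, which depends on slow start, so the numbers are lower bounds rather than the full model. The function name is hypothetical, and treating 5 Kbytes as 40,000 bits is an assumption.

```python
def http_response_times_no_idle(O, R, RTT, M, X):
    """Return (non_persistent, persistent, parallel) in seconds, idle times excluded."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R                        # base page plus M images
    non_persistent = transmit + (M + 1) * 2 * RTT     # M+1 connections in series
    persistent = transmit + 3 * RTT                   # 2 RTT for base file, 1 RTT for images
    parallel = transmit + (M // X + 1) * 2 * RTT      # base file, then M/X parallel rounds
    return non_persistent, persistent, parallel

# Parameters in the spirit of the slides: RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, R = 1 Mbps
for name, t in zip(("non-persistent", "persistent", "parallel non-persistent"),
                   http_response_times_no_idle(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)):
    print(f"{name}: {t:.2f} s")
```

Raising R shrinks only the shared (M+1)O/R term, so at high link rates the RTT multipliers dominate the differences between the three schemes.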


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time. Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.


Chapter 3: Summary

Principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

Instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



GBN Sender

rdt_send() called: checks to see if window is full.
• No: send out packet
• Yes: return data to application level

Receipt of ACK(n): cumulative acknowledgement that all packets up to and including n have been received. Updates window accordingly and restarts timer.

Timeout: resends ALL packets that have been sent but not yet acknowledged.
• This is the only event that triggers a resend

GBN sender: extended FSM

Initially:
    base = 1
    nextseqnum = 1

State: Wait

rdt_send(data):
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else
        refuse_data(data)

timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
        stop_timer
    else
        start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
    Λ (no action)

GBN receiver: extended FSM

Initially:
    expectedseqnum = 1
    sndpkt = make_pkt(0, ACK, chksum)

State: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
    extract(rcvpkt, data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum, ACK, chksum)
    udt_send(sndpkt)
    expectedseqnum++

default:
    udt_send(sndpkt)

If expected packet received:
    Send ACK and deliver packet upstairs

If out-of-order packet received:
    discard (don't buffer) -> no receiver buffering
    Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
    Receiver only sends ACKs (no NAKs)
    Can generate duplicate ACKs
    Need only remember expectedseqnum
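
The sender and receiver behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the class names, the `timer_running` flag, and the in-memory "channel" are assumptions made for the example, and the sender guards `base` with `max()` so that a stale ACK cannot move the window backwards (the FSM on the slide assigns `base` unconditionally).

```python
class GbnReceiver:
    """Remembers only expectedseqnum; out-of-order packets are discarded."""

    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def receive(self, seqnum, data):
        if seqnum == self.expectedseqnum:
            self.delivered.append(data)       # deliver_data(data)
            self.expectedseqnum += 1
        # Cumulative ACK for the highest in-order packet (0 if none yet).
        return self.expectedseqnum - 1


class GbnSender:
    def __init__(self, window):
        self.N = window
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}                      # seqnum -> data, kept until ACKed
        self.timer_running = False            # stand-in for a real timer

    def rdt_send(self, data):
        """Return the packet to transmit, or None if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return None                       # refuse_data(data)
        self.sndpkt[self.nextseqnum] = data
        if self.base == self.nextseqnum:
            self.timer_running = True         # start_timer
        self.nextseqnum += 1
        return (self.nextseqnum - 1, data)

    def on_ack(self, acknum):
        for seqnum in range(self.base, acknum + 1):
            self.sndpkt.pop(seqnum, None)     # everything up to acknum is ACKed
        self.base = max(self.base, acknum + 1)
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Go back N: resend every packet that is sent but not yet ACKed."""
        self.timer_running = True
        return [(s, self.sndpkt[s]) for s in range(self.base, self.nextseqnum)]


# Demo: window of 4, packet 2 is lost on its first transmission.
sender, receiver = GbnSender(window=4), GbnReceiver()
first_flight = [sender.rdt_send(d) for d in "abcd"]
refused = sender.rdt_send("e")                # window is full, so this is refused
for seqnum, data in first_flight:
    if seqnum != 2:                           # simulate the loss of packet 2
        sender.on_ack(receiver.receive(seqnum, data))
resent = sender.on_timeout()                  # resends packets 2, 3 and 4
for seqnum, data in resent:
    sender.on_ack(receiver.receive(seqnum, data))
```

In the demo, packets 3 and 4 reach the receiver the first time but are thrown away because packet 2 is missing, so a single loss costs three retransmissions.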

GBN in action

GBN is easy to code but might have performance problems

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets


                                                                                            Selective Repeat

                                                                                            receiver individually acknowledges all correctly received pkts

                                                                                            buffers pkts as needed for eventual in-order delivery to upper layer

                                                                                            sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
Compare to GBN, which only had a timer for the base packet

sender window
    N consecutive seq #'s
    again limits seq #s of sent, unACKed pkts
    Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender

data from above:
    if next available seq # is in window, send pkt

timeout(n):
    resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)

otherwise:
    ignore
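
The receiver rules above can be sketched as follows. This is a minimal illustration under simplifying assumptions: sequence numbers are plain increasing integers (no wraparound), and the class and attribute names are hypothetical.

```python
class SrReceiver:
    def __init__(self, window):
        self.N = window
        self.rcvbase = 0
        self.buffer = {}          # out-of-order packets waiting for delivery
        self.delivered = []       # what has been passed to the upper layer

    def receive(self, n, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, if there is one.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n              # already delivered: re-ACK so the sender can advance
        return None               # otherwise: ignore


# Demo: packet 1 arrives late, packet 0 is then duplicated, packet 9 is out of range.
rx = SrReceiver(window=4)
acks = [rx.receive(n, f"pkt{n}") for n in (0, 2, 3, 1, 0, 9)]
```

Packets 2 and 3 are buffered rather than discarded, so only packet 1 would need to be resent if it had been lost. That buffering is the difference from the GBN receiver.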


                                                                                            Selective repeat in action

Selective repeat: dilemma

Example:
    seq #'s: 0, 1, 2, 3
    window size = 3

receiver sees no difference in the two scenarios
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
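
One way to explore the question is to check when the receiver's new window can contain a sequence number that the sender might still retransmit. The sketch below assumes the worst case: the receiver has accepted a full window and advanced, but every ACK was lost, so the sender resends the whole old window. The function name is hypothetical.

```python
def receiver_can_confuse(seq_space, window):
    """True if a retransmitted old packet could be mistaken for a new one."""
    old_window = {n % seq_space for n in range(0, window)}
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


# The slide's example: seq #'s 0..3 with window size 3 is ambiguous.
slide_example = receiver_can_confuse(seq_space=4, window=3)
# Shrinking the window to half the sequence space removes the overlap.
half_window = receiver_can_confuse(seq_space=4, window=2)
```

Under this worst-case assumption the two windows overlap exactly when twice the window size exceeds the number of sequence numbers, which suggests keeping the window size at most half the sequence-number space.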

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview    RFCs: 793, 1122, 1323, 2018, 2581

point-to-point: one sender, one receiver
reliable, in-order byte stream: no "message boundaries"
pipelined: TCP congestion and flow control set window size
send & receive buffers
full duplex data: bi-directional data flow in same connection; MSS: maximum segment size
connection-oriented: handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
flow controlled: sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
    Depends upon implementation (can often be set)
    The max amount of application-layer data in a segment
    Application Data + TCP Header = TCP Segment

Three-way Handshake
    Client sends special TCP segment to server requesting connection
        No payload (application data) in this segment
    Server responds with second special TCP segment (again no payload)
    Client responds with third special segment
        This can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
    (i) buffers
    (ii) variables and
    (iii) a socket connection to process

TCP only exists in the two end machines
    No buffers and variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Each row is 32 bits wide:

    source port #                         | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F     | Receive window
    checksum                              | Urg data pointer
    Options (variable length)
    application data (variable length)

URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence number, acknowledgement number: counting by bytes of data (not segments)
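
As a rough illustration of the layout above, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. The function name, the returned dictionary keys, and the sample segment are assumptions made for this sketch; the checksum is only extracted, not verified.

```python
import struct

# Bit positions of the six flags in the 16-bit "head len / not used / flags" field.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment):
    """Split a raw TCP segment into its header fields, options and data."""
    (src_port, dst_port, seq, ack,
     len_and_flags, window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (len_and_flags >> 12) * 4    # head len is counted in 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if len_and_flags & bit}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# A hand-built segment: no options (head len = 5 words), ACK and PSH set,
# one byte of data. The checksum is left at 0 as a placeholder.
raw = struct.pack("!HHIIHHHH", 1234, 23, 42, 79, (5 << 12) | 0x18, 4096, 0, 0) + b"C"
fields = parse_tcp_header(raw)
```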

TCP seq. #'s and ACKs

Seq. #'s:
    byte stream "number" of first byte in segment's data

ACKs:
    seq # of next byte expected from other side
    cumulative ACK

Q: how receiver handles out-of-order segments
A: TCP spec doesn't say - up to implementer

Simple telnet scenario (time runs downward):
    Host A -> Host B: Seq=42, ACK=79, data = 'C'    (user types 'C')
    Host B -> Host A: Seq=79, ACK=43, data = 'C'    (host ACKs receipt of 'C', echoes back 'C')
    Host A -> Host B: Seq=43, ACK=80                (host ACKs receipt of echoed 'C')
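
The numbers in the telnet scenario follow from two rules: a reply's seq # is the ACK # it was just sent, and its ACK # is the received seq # plus the number of data bytes received. A small sketch, with a hypothetical `reply` helper and segments represented as dictionaries:

```python
def reply(received, data=b""):
    """Build the seq/ACK numbers of a segment sent in response to `received`."""
    return {
        "seq": received["ack"],                          # next byte we will send
        "ack": received["seq"] + len(received["data"]),  # next byte we expect
        "data": data,
    }


a_to_b = {"seq": 42, "ack": 79, "data": b"C"}   # user types 'C'
b_to_a = reply(a_to_b, data=b"C")               # B ACKs 'C' and echoes it back
a_ack = reply(b_to_a)                           # A ACKs the echoed 'C'
```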

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions
    SampleRTT will vary, want estimated RTT "smoother"
        average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
    longer than RTT
        but RTT varies
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
    influence of past sample decreases exponentially fast
    typical value: α = 0.125
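
The update rule can be tried on a few made-up samples. In this sketch the function name and the choice to seed the estimate with the first sample are assumptions; the slide does not say how EstimatedRTT is initialised.

```python
def estimated_rtt_history(samples, alpha=0.125):
    """Apply EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT."""
    estimate = samples[0]            # assumption: start from the first sample
    history = []
    for sample in samples:
        estimate = (1 - alpha) * estimate + alpha * sample
        history.append(estimate)
    return history


# A single 200 ms spike among 100 ms samples moves the estimate only slightly,
# and its influence then fades with each later sample.
history = estimated_rtt_history([100.0, 200.0, 100.0])
```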

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
    EstimatedRTT plus "safety margin"
        large variation in EstimatedRTT -> larger safety margin
    first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
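
Putting the three formulas together gives a small estimator. This is a sketch under stated assumptions: the class name and the initial values are hypothetical, and DevRTT is updated before EstimatedRTT so that the deviation is measured against the previous estimate (the slide does not fix the order).

```python
class RttEstimator:
    def __init__(self, initial_rtt, initial_dev=0.0, alpha=0.125, beta=0.25):
        self.estimated_rtt = initial_rtt
        self.dev_rtt = initial_dev
        self.alpha = alpha
        self.beta = beta

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation is taken against the estimate before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(initial_rtt=100.0)
after_spike = estimator.update(200.0)    # a slow sample widens the safety margin
after_calm = estimator.update(100.0)     # the margin shrinks again as samples settle
```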

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate acks

Initially consider simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control

TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

Ack rcvd:
    If acknowledges previously unacked segments
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }

} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
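
The simplified sender can be turned into a small event-driven Python class. This is only a sketch: the class and method names, the `wire` list standing in for "pass segment to IP", and the boolean `timer_running` are assumptions, and the timer is treated as stopped once nothing is outstanding, which the pseudocode leaves implicit.

```python
class SimplifiedTcpSender:
    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = []            # (seq, data) pairs in the order they were sent
        self.timer_running = False
        self.wire = []               # every segment handed to IP, in order

    def on_data(self, data):
        segment = (self.next_seq_num, data)
        self.unacked.append(segment)
        if not self.timer_running:
            self.timer_running = True
        self.wire.append(segment)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            # Retransmit only the not-yet-acknowledged segment with the smallest seq #.
            self.wire.append(self.unacked[0])
            self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Keep only segments that still contain bytes at or beyond y.
            self.unacked = [(s, d) for s, d in self.unacked if s + len(d) > y]
            self.timer_running = bool(self.unacked)


# Two segments are sent, the first ACK (100) is lost, and ACK=120 arrives
# before the timer expires, so nothing is retransmitted.
cumulative = SimplifiedTcpSender(initial_seq_num=92)
cumulative.on_data(b"x" * 8)
cumulative.on_data(b"y" * 20)
cumulative.on_ack(120)

# One segment is sent, its ACK is lost, and the timer expires.
lost_ack = SimplifiedTcpSender(initial_seq_num=92)
lost_ack.on_data(b"x" * 8)
lost_ack.on_timeout()
lost_ack.on_ack(100)
```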

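TCP retransmission scenarios

Lost ACK scenario (time runs downward):
    Host A -> Host B: Seq=92, 8 bytes data
    Host B -> Host A: ACK=100 (lost: X)
    Host A: Seq=92 timeout, resends Seq=92, 8 bytes data
    Host B -> Host A: ACK=100
    Host A: SendBase = 100

Premature timeout scenario (time runs downward):
    Host A -> Host B: Seq=92, 8 bytes data
    Host A -> Host B: Seq=100, 20 bytes data
    Host B -> Host A: ACK=100, then ACK=120
    Host A: Seq=92 timeout fires before ACK=100 arrives, resends Seq=92, 8 bytes data
    Host A: SendBase = 100, then SendBase = 120 as the two ACKs arrive
    Host B -> Host A: ACK=120 (in response to the duplicate segment)
    Host A: SendBase = 120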

TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
    Host A -> Host B: Seq=92, 8 bytes data
    Host A -> Host B: Seq=100, 20 bytes data
    Host B -> Host A: ACK=100 (lost: X)
    Host B -> Host A: ACK=120, arrives before the timeout
    Host A: SendBase = 120, nothing is retransmitted

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed.
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment; if no next segment arrives, send ACK.

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending.
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected.
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap.
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.
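The four receiver rules can be sketched as a small decision function. This is only an illustration under simplifying assumptions: the class name `AckPolicyReceiver` and its fields are hypothetical, the 500 ms delayed-ACK timer is not modeled (only the decision to delay is reported), and out-of-order data is buffered rather than discarded.

```python
from dataclasses import dataclass, field


@dataclass
class AckPolicyReceiver:
    """Hypothetical receiver that only decides which kind of ACK to send."""
    expected: int                                   # seq # of next in-order byte
    ack_pending: bool = False                       # True while a delayed ACK is waiting
    buffered: dict = field(default_factory=dict)    # out-of-order data: seq -> length

    def on_segment(self, seq: int, length: int) -> str:
        if seq != self.expected:
            # Out-of-order segment: remember it if it lies beyond the gap,
            # then immediately repeat the ACK for the next expected byte.
            if seq > self.expected:
                self.buffered[seq] = length
            self.ack_pending = False
            return f"duplicate ACK {self.expected}"

        had_gap = bool(self.buffered)
        self.expected += length
        # Absorb any buffered segments that are now contiguous.
        while self.expected in self.buffered:
            self.expected += self.buffered.pop(self.expected)

        if had_gap:
            # Segment started at the lower end of the gap: ACK immediately.
            self.ack_pending = False
            return f"immediate ACK {self.expected}"
        if self.ack_pending:
            # Second in-order segment: one cumulative ACK covers both.
            self.ack_pending = False
            return f"cumulative ACK {self.expected}"
        self.ack_pending = True
        return "delayed ACK (wait up to 500 ms)"
```

Feeding it segments 92 (8 bytes), 100 (20 bytes), 140 (20 bytes) and then 120 (20 bytes) walks through all four table rows in order.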

More on Sender Policies

Doubling the Timeout Interval
    Used by most TCP implementations
    If a timeout occurs, then after the retransmission the Timeout Interval is doubled
    Intervals grow exponentially with each consecutive timeout
    When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    Limited form of congestion control
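A minimal sketch of the doubling policy follows. It assumes the reset value is the usual EstimatedRTT + 4 * DevRTT formula from the RTT-estimation material; the class name `RetransmitTimer` and the use of integer milliseconds are choices made for this example only.

```python
class RetransmitTimer:
    """Hypothetical timer model: doubles on timeout, resets on new data or ACK."""

    def __init__(self, estimated_rtt_ms: int, dev_rtt_ms: int):
        self.estimated_rtt_ms = estimated_rtt_ms
        self.dev_rtt_ms = dev_rtt_ms
        self.interval_ms = self.base_interval_ms()

    def base_interval_ms(self) -> int:
        # Assumed formula: TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt_ms + 4 * self.dev_rtt_ms

    def on_timeout(self) -> int:
        # After retransmitting, wait twice as long before the next timeout.
        self.interval_ms *= 2
        return self.interval_ms

    def on_restart(self) -> int:
        # New data from above or an ACK received: fall back to the RTT-based value.
        self.interval_ms = self.base_interval_ms()
        return self.interval_ms
```

With EstimatedRTT = 100 ms and DevRTT = 25 ms the interval starts at 200 ms, grows to 400 ms and 800 ms after two consecutive timeouts, and returns to 200 ms once an ACK restarts the timer.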

Fast Retransmit

Time-out period is often relatively long
    long delay before resending a lost packet
Detect lost segments via duplicate ACKs
    Sender often sends many segments back-to-back
    If a segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
    fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for an already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3)
            resend segment with sequence number y    /* fast retransmit */
    }
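The ACK-handling event above can be turned into a small runnable sketch. The class name `FastRetransmitSender`, the `next_seq` field, and the `resent` list (which records retransmissions instead of sending anything) are assumptions made for illustration. A single duplicate counter is kept and cleared whenever SendBase advances, which is a simplification of the per-y count in the pseudocode.

```python
class FastRetransmitSender:
    """Hypothetical sender state for the ACK-received event."""

    def __init__(self, send_base: int, next_seq: int):
        self.send_base = send_base        # oldest unACKed byte
        self.next_seq = next_seq          # next byte that has not been sent yet
        self.dup_acks = 0                 # duplicate ACKs seen for send_base
        self.timer_running = send_base < next_seq
        self.resent = []                  # sequence numbers fast-retransmitted

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            # New data ACKed: slide the window and clear the duplicate count.
            self.send_base = y
            self.dup_acks = 0
            # Restart the timer only if something is still unACKed.
            self.timer_running = self.send_base < self.next_seq
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                self.resent.append(y)     # fast retransmit of segment y
```

For example, with SendBase = 92 and data sent up to byte 160, one ACK for 100 followed by three more ACKs for 100 triggers exactly one retransmission of the segment starting at byte 100.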

TCP: GBN or Selective Repeat?

                                                                                            Basic TCP looks a lot like GBN

                                                                                            Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                            This looks a lot like Selective Repeat

                                                                                            TCP is a hybrid


TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate the receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network
    (But both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]
    source port #, dest port #
    sequence number
    acknowledgement number
    head len, not used, flag bits U A P R S F, Receive window
    checksum, Urg data pointer
    Options (variable length)
    application data (variable length)

Annotations in the figure:
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection establishment (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
    sequence and acknowledgement numbers: counting by bytes of data (not segments)
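To make the field layout concrete, here is a hedged sketch that unpacks the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name `parse_tcp_header` and the dictionary keys are made up for this example; options are skipped rather than decoded, and the checksum is not verified.

```python
import struct

# Flag bits in the order U A P R S F, from high to low.
FLAG_BITS = [("URG", 0x20), ("ACK", 0x10), ("PSH", 0x08),
             ("RST", 0x04), ("SYN", 0x02), ("FIN", 0x01)]


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields and application data."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    # Upper 4 bits of offset_byte give the header length in 32-bit words.
    header_len = (offset_byte >> 4) * 4
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": {name: bool(flag_byte & bit) for name, bit in FLAG_BITS},
        "rcv_window": rcv_window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],   # everything after header and options
    }
```

The `!` prefix in the format string selects network (big-endian) byte order, which is how the header travels on the wire.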

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
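The window arithmetic can be checked with a few lines of code. The receiver-side formula is the one given above; the sender-side helper and its parameter names `last_byte_sent` and `last_byte_acked` are assumptions introduced here to express "unACKed data".

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent: int, last_byte_acked: int, window: int) -> int:
    """Bytes the sender may still transmit while keeping unACKed data <= window."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, window - unacked)
```

For instance, a 4096-byte buffer holding 2000 unread bytes advertises a window of 2096; a sender with 1000 bytes already in flight may then send 1096 more.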

Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
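The probing idea can be illustrated with a toy simulation. Everything here is a simplifying assumption: time advances in discrete ticks, `drains` lists how many bytes the application reads in each tick, the sender sends exactly one probe per tick, and the probe's own data byte is ignored.

```python
def probes_until_window_opens(rcv_buffer: int, buffered: int, drains: list) -> tuple:
    """Return (probes sent, first nonzero window seen), or (probes, 0) if never opened."""
    probes = 0
    for drained in drains:
        buffered = max(0, buffered - drained)   # app reads some bytes this tick
        probes += 1                             # sender's one-byte probe
        window = rcv_buffer - buffered          # value carried in the probe's ACK
        if window > 0:
            return probes, window
    return probes, 0
```

With a full 4096-byte buffer and an application that reads nothing for two ticks and then 512 bytes, the third probe's ACK reports a 512-byte window, and the sender can resume. Without the probes, that update would never reach the sender.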

Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
    initialize TCP variables:
        seq #s
        buffers, flow control info (e.g. RcvWindow)
    client: connection initiator
        Socket clientSocket = new Socket("hostname", "port number");
    server: contacted by client
        Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
    Allocates buffers
    Can include application data
    SYN=0 signals that connection is established
    server_isn+1 signals that it is synchronized

[Figure: three-way handshake]
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
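The sequence and acknowledgement numbers of the three handshake segments can be sketched as follows. The function name and the dictionary representation of a segment are hypothetical; the modulo reflects the fact that TCP sequence numbers are 32-bit values that wrap around.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32 bits wide


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the SYN, SYNACK and ACK segments as simple dictionaries."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn,
              "ack": (syn["seq"] + 1) % SEQ_SPACE}       # ACKs the client's SYN
    ack = {"SYN": 0, "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (synack["seq"] + 1) % SEQ_SPACE}       # ACKs the server's SYN
    return [syn, synack, ack]
```

Calling `three_way_handshake(1000, 5000)` yields a SYNACK with ack=1001 and a final ACK with seq=1001, ack=5001, matching the three arrows in the handshake diagram.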

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: closing a connection. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait before it is closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
    Enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost)
    Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: same FIN/ACK exchange with states. client: closing, timed wait, closed; server: closing, closed.]
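The client side of the four closing steps can be sketched as a small state table. The state names (`FIN_WAIT_1`, `FIN_WAIT_2`, `TIME_WAIT`) follow the usual TCP state diagram and are not taken from the slides; the event strings and the `run_client_close` helper are hypothetical.

```python
# (state, event) -> (next state, segment to send or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "close"):    ("FIN_WAIT_1", "send FIN"),   # step 1
    ("FIN_WAIT_1", "recv ACK"):  ("FIN_WAIT_2", None),         # server ACKed our FIN
    ("FIN_WAIT_2", "recv FIN"):  ("TIME_WAIT", "send ACK"),    # step 3
    ("TIME_WAIT", "recv FIN"):   ("TIME_WAIT", "send ACK"),    # our ACK was lost, resend
    ("TIME_WAIT", "timeout"):    ("CLOSED", None),             # timed wait is over
}


def run_client_close(events, state="ESTABLISHED"):
    """Apply events in order; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, action = CLIENT_CLOSE[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent
```

The repeated `recv FIN` entry in `TIME_WAIT` is the reason for the timed wait: if the client's last ACK is lost, the server resends its FIN and the client can still answer it.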

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                            A few special cases

                                                                                            Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                            It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for network to handle"
    different from flow control
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.

"costs" of congestion:
    (b) and (c): more work (retransmissions) for given "goodput"
    (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three throughput plots, panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
    when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
    "elastic service"
    if sender's path "underloaded":
        sender should use available bandwidth
    if sender's path congested:
        sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    Signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
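The effect of the ER field can be sketched in a few lines: each switch may only lower the value, so the RM cell returns carrying the minimum rate supportable along the path. The function name, the list of per-switch rates, and the rate units are hypothetical and chosen only for illustration.

```python
def returned_explicit_rate(initial_er: float, switch_rates: list) -> float:
    """ER value in the RM cell after it has passed every switch on the path."""
    er = initial_er
    for supportable_rate in switch_rates:
        # A switch may lower ER to what it can support, but never raises it.
        er = min(er, supportable_rate)
    return er
```

If the sender requests 100 units and the switches on the path can support 80, 120 and 45, the cell comes back with ER = 45, which becomes the sender's send rate.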


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion
• Congwin: w segments, each with MSS bytes, sent in one RTT

throughput = (w × MSS) / RTT bytes/sec
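As a quick check of the relation throughput = (w × MSS) / RTT, here is a minimal Python sketch. The function name and the sample numbers are illustrative assumptions, not values from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent in each RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical example: 10 segments of 1460 bytes with a 100 ms RTT.
print(window_throughput(10, 1460, 0.1))
```

Doubling w doubles the throughput for a fixed RTT, which is why the sender controls its rate by adjusting the window.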


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control. Sender limits transmission using:

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative after timeout events
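The sender-side limit LastByteSent - LastByteAcked ≤ CongWin can be sketched as a simple check made before each transmission. This is only an illustration: the function name and the idea of testing one segment at a time are assumptions, and RcvBuffer is ignored, as in the simplification above.

```python
def may_send(last_byte_sent, last_byte_acked, cong_win, segment_bytes):
    """True if sending segment_bytes more keeps unacked data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked  # bytes sent but not yet ACKed
    return in_flight + segment_bytes <= cong_win


# Hypothetical numbers: 2000 bytes in flight and a 4000-byte CongWin.
print(may_send(3000, 1000, 4000, 1000))
```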


TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
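The AIMD rule (add 1 MSS per RTT, halve on loss) can be traced with a short Python sketch. The window is measured in MSS units, and a loss is assumed to occur whenever the window reaches a fixed size `loss_at`. That loss model is a simplification chosen for illustration, not something the slides specify.

```python
def aimd_trace(initial_window, loss_at, rounds):
    """Return the congestion window (in MSS) at the start of each RTT."""
    w = initial_window
    trace = []
    for _ in range(rounds):
        trace.append(w)
        if w >= loss_at:
            w = max(1, w / 2)   # multiplicative decrease after a loss event
        else:
            w += 1              # additive increase: 1 MSS per RTT
    return trace


print(aimd_trace(4, 8, 10))
```

The window climbs linearly and then drops by half, repeating with the same period.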


TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes and RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event


TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments, one burst per RTT]
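Both slow-start claims can be checked with a few lines of Python: the initial rate of 20 kbps for MSS = 500 bytes and RTT = 200 msec, and the doubling per RTT that results from adding one MSS per ACK. The function names are illustrative, and the sketch assumes every segment sent in an RTT is ACKed within that RTT.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Rate in bits/sec when one MSS is sent per RTT."""
    return mss_bytes * 8 / rtt_seconds


def slow_start_windows(rtts):
    """Window (in MSS) at the start of each RTT, grown by 1 MSS per ACK."""
    w = 1
    windows = []
    for _ in range(rtts):
        windows.append(w)
        acks = w                # one ACK per segment sent in this RTT
        for _ in range(acks):
            w += 1              # increment CongWin for every ACK received
    return windows


print(initial_rate_bps(500, 0.2))
print(slow_start_windows(4))
```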


So far:
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin, the halved value (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with Congwin = 1 MSS after a loss event.

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                            The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
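The event/state/action rows for the TCP sender translate fairly directly into a small state machine. The Python sketch below is a simplified illustration rather than a real TCP implementation. The class and method names are made up, the window is kept in units of MSS, and duplicate-ACK counting is left to the caller, who is assumed to invoke `on_triple_dup_ack` once three duplicates have arrived.

```python
class RenoSender:
    """Toy model of the TCP Reno congestion-control rules listed above."""

    def __init__(self, mss=1.0, threshold=float("inf")):
        self.mss = mss
        self.cong_win = mss          # slow start begins at 1 MSS
        self.threshold = threshold   # "initially very large"
        self.state = "SS"

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_triple_dup_ack(self):
        """Loss detected by triple duplicate ACK: fast recovery."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
        self.state = "CA"

    def on_timeout(self):
        """Loss detected by timeout: back to slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"


sender = RenoSender()
for _ in range(7):
    sender.on_new_ack()
sender.on_triple_dup_ack()
print(sender.state, sender.cong_win, sender.threshold)
```

A Tahoe-style sender could be modeled by having `on_triple_dup_ack` call `on_timeout` instead.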


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2·RTT)
• Average throughput: 0.75 W/RTT
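The 0.75 factor follows because the window ramps linearly from W/2 up to W between losses, so its average is the midpoint, 0.75 W. A small Python check, assuming an even W and one window value per RTT (the function names are illustrative):

```python
def average_window(w_max):
    """Mean window over one sawtooth cycle, from W/2 up to W (W even)."""
    windows = range(w_max // 2, w_max + 1)
    return sum(windows) / len(windows)


def average_throughput(w_max, mss_bytes, rtt_seconds):
    """Average throughput in bytes/sec using the 0.75 * W / RTT result."""
    return 0.75 * w_max * mss_bytes / rtt_seconds


print(average_window(20) / 20)
```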


TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• ⇒ L = 2·10^-10. Wow!
• New versions of TCP for high-speed needed
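The two numbers in this example can be reproduced from W = throughput · RTT / MSS and from solving throughput = 1.22 · MSS / (RTT · √L) for L. The Python sketch below uses illustrative function names and assumes the MSS is converted to bits so that units match the bit rate.

```python
def required_window(rate_bps, mss_bytes, rtt_seconds):
    """In-flight segments needed to sustain rate_bps."""
    return rate_bps * rtt_seconds / (mss_bytes * 8)


def required_loss_rate(rate_bps, mss_bytes, rtt_seconds):
    """Loss rate L that yields rate_bps under 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_seconds * rate_bps)) ** 2


print(int(required_window(10e9, 1500, 0.1)))
print(required_loss_rate(10e9, 1500, 0.1))
```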


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
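The convergence argument can be illustrated numerically: additive increase leaves the gap between the two flows unchanged, while each multiplicative decrease halves it. The Python sketch below assumes both flows see a loss in the same round whenever their combined rate exceeds the capacity. That synchronized-loss model and all names and numbers are simplifying assumptions.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2      # both flows halve on loss
        else:
            x1, x2 = x1 + 1, x2 + 1      # both flows add one unit per round
    return x1, x2


a, b = two_flow_aimd(50, 10, 100, 300)
print(a, b, abs(a - b))
```

Starting 40 units apart, the two rates end up nearly equal, which matches the movement toward the equal bandwidth share line.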


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
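The arithmetic behind this example is simply the app's fraction of all connections on the link, assuming every TCP connection receives an equal share of R. A minimal Python sketch with an illustrative function name:

```python
def app_share(app_connections, other_connections):
    """Fraction of link rate R obtained by an app, given equal per-connection shares."""
    return app_connections / (app_connections + other_connections)


print(app_share(1, 9))    # one new connection next to 9 existing ones
print(app_share(11, 9))   # eleven new connections next to 9 existing ones
```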


TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
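The two fixed-window cases can be folded into one function: the stall per window is S/R + RTT - WS/R when that quantity is positive, and zero otherwise. The Python sketch below assumes K is the number of windows that cover the object, computed here as ceil(O / (W·S)). That formula for K and the function name are assumptions made for illustration.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency of one object with a fixed congestion window of W segments."""
    stall = S / R + RTT - W * S / R      # idle time after each full window
    base = 2 * RTT + O / R               # connection setup + request + transmission
    if stall <= 0:
        return base                      # case 1: ACKs return before the window is exhausted
    K = math.ceil(O / (W * S))           # assumed: number of windows covering the object
    return base + (K - 1) * stall        # case 2: server stalls between windows


# Hypothetical units: S/R = 1 time unit, RTT = 2, object of 8 segments.
print(fixed_window_latency(8, 1, 1, 2, 2))
print(fixed_window_latency(8, 1, 1, 2, 4))
```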


Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Sender waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
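The slow-start latency formula can be evaluated with a short Python sketch. Here K is found as the smallest k with 2^k - 1 ≥ O/S, and Q is found by counting the windows k for which the idle time S/R + RTT - 2^(k-1)·S/R is still positive. The function name and this way of computing Q are assumptions made for illustration, not a closed form given on the slides.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for an object of O bits under slow start."""
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of idle periods if the object were infinitely large.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# Hypothetical units: S/R = 1 time unit, RTT = 2, object of 15 segments.
print(slow_start_latency(15, 1, 1, 2))
```

With O/S = 15 and RTT equal to two segment transmission times, this yields K = 4, Q = 2 and P = 2.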


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]


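TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]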


                                                                                            TCP Latency Modeling (4)Recall K = number of windows that cover object

                                                                                            How do we calculate K

                                                                                            ⎥⎥⎤

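\[
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
\]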

Calculation of Q, the number of idle periods for an infinite-size object, is similar.
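The closed form K = ceil(log2(O/S + 1)) can be checked numerically against its definition, the smallest k for which windows of 1, 2, 4, ... segments cover the object. The sketch below assumes that O is the object size and S the segment size, both in bits; the function names are illustrative and do not come from the slides.

```python
import math


def windows_by_search(O, S):
    """Smallest K such that S * (2^0 + 2^1 + ... + 2^(K-1)) >= O."""
    k = 0
    covered = 0
    while covered < O:
        covered += (2 ** k) * S  # window k+1 carries 2^k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


if __name__ == "__main__":
    for O in (1, 3, 4, 100):
        print(O, windows_by_search(O, 1), windows_closed_form(O, 1))
```

The two functions should agree for every positive O and S; the search version simply replays slow start window by window.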

HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
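The three response-time formulas can be compared with a short calculation. This is a minimal sketch that deliberately leaves out the "sum of idle times" term, so it shows only the transmission and RTT components; R is assumed to be the link rate in bits per second, and the function name is illustrative.

```python
def http_response_times(O, R, RTT, M, X):
    """Return (non_persistent, persistent, parallel) in seconds.

    O: object size in bits, R: link rate in bits/s, RTT in seconds,
    M: number of images, X: number of parallel connections.
    The sum of idle times (slow start stalls) is NOT included.
    """
    transmit = (M + 1) * O / R                    # all M+1 objects on the link
    non_persistent = transmit + (M + 1) * 2 * RTT
    persistent = transmit + 3 * RTT
    parallel = transmit + (M / X + 1) * 2 * RTT
    return non_persistent, persistent, parallel


if __name__ == "__main__":
    O = 5 * 1000 * 8  # 5 Kbytes, assuming 1 Kbyte = 1000 bytes
    for R in (28e3, 100e3, 1e6, 10e6):
        print(R, http_response_times(O, R, RTT=0.1, M=10, X=5))
```

At low link rates the shared (M+1)O/R term dominates all three results, while at high rates the RTT multipliers separate them.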

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.

Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                            • Chapter 3 Transport Layer last revised 160305
                                                                                            • Chapter 3 outline
                                                                                            • Transport services and protocols
                                                                                            • Transport vs network layer
                                                                                            • Transport-layer protocols
                                                                                            • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                            • How demultiplexing works
                                                                                            • Connectionless demultiplexing
                                                                                            • Connectionless demux (cont)
                                                                                            • Connection-oriented demux
                                                                                            • Connection-oriented demux (cont)
                                                                                            • Connection-oriented demux Threaded Web Server
                                                                                            • Chapter 3 outline
                                                                                            • UDP User Datagram Protocol [RFC 768]
                                                                                            • UDP more
                                                                                            • UDP checksum
                                                                                            • Chapter 3 outline
                                                                                            • Principles of Reliable data transfer
                                                                                            • Reliable data transfer getting started
                                                                                            • Reliable data transfer getting started
                                                                                            • Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                            • Pipelined protocols
                                                                                            • Pipelined protocols
                                                                                            • Pipelining increased utilization
                                                                                            • Go-Back-N
                                                                                            • GBN Sender
                                                                                            • GBN sender extended FSM
                                                                                            • GBN receiver extended FSM
                                                                                            • More on receiver
• GBN in action
                                                                                            • Selective Repeat
                                                                                            • Selective repeat sender receiver windows
                                                                                            • Selective repeat
                                                                                            • Selective repeat in action
                                                                                            • Selective repeat dilemma
                                                                                            • Chapter 3 outline
                                                                                            • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                            • More TCP Details
                                                                                            • Even More TCP Details
                                                                                            • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                            • TCP Round Trip Time and Timeout
                                                                                            • TCP Round Trip Time and Timeout
                                                                                            • Example RTT estimation
                                                                                            • TCP Round Trip Time and Timeout
                                                                                            • Chapter 3 outline
                                                                                            • TCP reliable data transfer
                                                                                            • TCP sender events
                                                                                            • TCP sender(simplified)
                                                                                            • TCP retransmission scenarios
                                                                                            • TCP retransmission scenarios (more)
                                                                                            • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                            • More on Sender Policies
                                                                                            • Fast Retransmit
                                                                                            • Fast retransmit algorithm
                                                                                            • TCP GBN or Selective Repeat
                                                                                            • Chapter 3 outline
                                                                                            • TCP Flow Control
                                                                                            • TCP Flow Control
                                                                                            • TCP segment structure
                                                                                            • TCP Flow control how it works
                                                                                            • Technical Issue
                                                                                            • Chapter 3 outline
                                                                                            • TCP Connection Management
                                                                                            • TCP Connection Management (cont)
                                                                                            • TCP Connection Management (cont)
                                                                                            • TCP Connection Management (cont)
                                                                                            • TCP Connection Management (cont)
                                                                                            • A few special cases
                                                                                            • Chapter 3 outline
                                                                                            • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                            • Approaches towards congestion control
                                                                                            • Case study ATM ABR congestion control
                                                                                            • Case study ATM ABR congestion control
                                                                                            • Chapter 3 outline
                                                                                            • TCP Congestion Control
                                                                                            • TCP AIMD
                                                                                            • TCP Slow Start
                                                                                            • TCP Slow Start (more)
                                                                                            • Summary TCP Congestion Control
                                                                                            • The Big Picture
                                                                                            • TCP sender congestion control
                                                                                            • TCP throughput
                                                                                            • TCP Futures
                                                                                            • TCP Fairness
                                                                                            • Why is TCP fair
                                                                                            • Fairness (more)
                                                                                            • TCP Latency Modeling
                                                                                            • Fixed Congestion Window (W)
                                                                                            • Fixed congestion window (1)
                                                                                            • Fixed congestion window (2)
                                                                                            • TCP Latency Modeling Slow Start (1)
                                                                                            • TCP Latency Modeling Slow Start (2)
                                                                                            • TCP Latency Modeling (3)
                                                                                            • TCP Latency Modeling (4)
                                                                                            • HTTP Modeling
                                                                                            • Chapter 3 Summary

GBN: sender extended FSM

State: Wait
Initially: base=1, nextseqnum=1

rdt_send(data):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  ...
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  Λ
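The sender FSM can be sketched as a small Python class. This is an illustrative sketch, not the course's reference code: the class name, the `send` callback standing in for udt_send, and the boolean `timer_running` standing in for a real timer are all assumptions, and checksums are replaced by a `corrupt` flag.

```python
class GBNSender:
    """Go-Back-N sender with window size N and cumulative ACKs."""

    def __init__(self, window_size, send):
        self.N = window_size
        self.send = send            # hypothetical stand-in for udt_send
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq -> packet, kept until ACKed
        self.timer_running = False  # stand-in for start_timer/stop_timer

    def rdt_send(self, data):
        """Send data if the window has room; return False to refuse it."""
        if self.nextseqnum < self.base + self.N:
            pkt = (self.nextseqnum, data)
            self.sndpkt[self.nextseqnum] = pkt
            self.send(pkt)
            if self.base == self.nextseqnum:
                self.timer_running = True
            self.nextseqnum += 1
            return True
        return False

    def timeout(self):
        """Restart the timer and resend every sent-but-unACKed packet."""
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            self.send(self.sndpkt[seq])

    def rdt_rcv(self, acknum, corrupt=False):
        """Handle an ACK; corrupt ACKs are ignored."""
        if corrupt:
            return
        # max() is a defensive addition so a stale ACK cannot move base back
        new_base = max(self.base, acknum + 1)
        for seq in range(self.base, new_base):
            self.sndpkt.pop(seq, None)
        self.base = new_base
        self.timer_running = self.base != self.nextseqnum
```

With a window of 3, a fourth call to `rdt_send` is refused until an ACK advances `base`, and a timeout resends everything from `base` up to `nextseqnum - 1`.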

GBN: receiver extended FSM

State: Wait
Initially: expectedseqnum=1, sndpkt = make_pkt(0,ACK,chksum)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum):
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

If expected packet received:
  Send ACK and deliver packet upstairs

If out-of-order packet received:
  discard (don't buffer) -> no receiver buffering
  Re-ACK pkt with highest in-order seq #; may generate duplicate ACKs
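The receiver side is simpler, since its only state is expectedseqnum and the last ACK it sent. The sketch below is illustrative: the class name and the `send_ack` and `deliver` callbacks are assumptions, and corruption is modeled by a flag rather than a checksum.

```python
class GBNReceiver:
    """Go-Back-N receiver: accepts only the expected packet, re-ACKs otherwise."""

    def __init__(self, send_ack, deliver):
        self.send_ack = send_ack   # hypothetical stand-in for udt_send(sndpkt)
        self.deliver = deliver     # hypothetical stand-in for deliver_data
        self.expectedseqnum = 1
        self.last_ack = 0          # mirrors the initial make_pkt(0,ACK,chksum)

    def rdt_rcv(self, seq, data, corrupt=False):
        if not corrupt and seq == self.expectedseqnum:
            self.deliver(data)
            self.last_ack = self.expectedseqnum
            self.expectedseqnum += 1
        # default action: (re)send the ACK for the highest in-order packet;
        # out-of-order and corrupt packets are discarded, never buffered
        self.send_ack(self.last_ack)
```

Feeding it packets 1, 3, 2 in that order delivers only packets 1 and 2 and produces a duplicate ACK for 1 when packet 3 arrives early.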

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
  Receiver only sends ACKs (no NAKs)
  Can generate duplicate ACKs
  Need only remember expectedseqnum

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission of only the required packets.

                                                                                              Selective Repeat

                                                                                              receiver individually acknowledges all correctly received pkts

                                                                                              buffers pkts as needed for eventual in-order delivery to upper layer

                                                                                              sender only resends pkts for which ACK not received

sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #'s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender

data from above:
  if next available seq # in window, send pkt

timeout(n):
  resend pkt n, restart timer

ACK(n) in [sendbase,sendbase+N]:
  mark pkt n as received
  if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N,rcvbase-1]:
  ACK(n) (note: this is a reACK)

otherwise:
  ignore
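The receiver rules for Selective Repeat can be sketched as follows. This is a minimal illustration that assumes unbounded sequence numbers starting at 0 (no wraparound); the class name and the `send_ack` and `deliver` callbacks are hypothetical.

```python
class SRReceiver:
    """Selective Repeat receiver: ACKs each packet, buffers out-of-order data."""

    def __init__(self, window_size, send_ack, deliver):
        self.N = window_size
        self.send_ack = send_ack   # hypothetical callback taking a seq #
        self.deliver = deliver     # hypothetical callback taking the data
        self.rcvbase = 0
        self.buffer = {}           # seq -> data received but not yet delivered

    def on_packet(self, n, data):
        if self.rcvbase <= n < self.rcvbase + self.N:
            self.send_ack(n)
            self.buffer.setdefault(n, data)
            # deliver the in-order run starting at rcvbase, advancing the window
            while self.rcvbase in self.buffer:
                self.deliver(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
        elif self.rcvbase - self.N <= n < self.rcvbase:
            self.send_ack(n)       # reACK: the earlier ACK may have been lost
        # otherwise: ignore the packet
```

The reACK branch matters because the sender only advances its window when it sees the ACK, so a packet below rcvbase signals that an earlier ACK was lost.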

                                                                                              Selective repeat in action

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in the two scenarios
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
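One way to explore the question is to model the worst case directly: the receiver has accepted a full window and advanced, but every ACK was lost, so the sender still retransmits its old window. The sketch below checks whether those old sequence numbers fall inside the receiver's new window; the function name is illustrative.

```python
def window_is_ambiguous(seq_space, window):
    """True if a retransmitted old packet can look new to the receiver.

    Worst case: sender window is still [0, window) because all ACKs were
    lost, while the receiver window has advanced to [window, 2*window).
    Sequence numbers wrap modulo seq_space.
    """
    sender_window = {n % seq_space for n in range(0, window)}
    receiver_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(sender_window & receiver_window)


if __name__ == "__main__":
    print(window_is_ambiguous(4, 3))  # the slide's example
    print(window_is_ambiguous(4, 2))
```

Under this model the slide's example (4 sequence numbers, window 3) is ambiguous, and the overlap disappears once the window is at most half the sequence number space.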

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]

More TCP Details

Maximum Segment Size (MSS)
- depends upon the implementation (can often be set)
- the maximum amount of application-layer data in a segment
- application data + TCP header = TCP segment

Three-way handshake
- client sends a special TCP segment to the server requesting a connection; no payload (application data) in this segment
- server responds with a second special TCP segment (again no payload)
- client responds with a third special segment; this one can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
(i) buffers
(ii) variables, and
(iii) a socket connection to the process

TCP exists only in the two end machines
- no buffers or variables are allocated to the connection in any of the network elements between the host and the server

TCP segment structure

Header layout (each row is 32 bits wide):

    source port                          | dest port
    sequence number
    acknowledgement number
    head len | not used | U A P R S F    | receive window
    checksum                             | urg data pointer
    options (variable length)
    application data (variable length)

Field notes:
- sequence number, acknowledgement number: counting by bytes of data (not segments)
- URG: urgent data (generally not used)
- ACK: ACK number valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection establishment (setup, teardown commands)
- receive window: number of bytes the receiver is willing to accept
- checksum: Internet checksum (as in UDP)
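A minimal sketch of how the fixed part of this header could be decoded, assuming the standard 20-byte layout without options. The function name, the dictionary keys, and the example segment are hypothetical; the checksum is read but not verified.

```python
import struct

# Flag names from the least significant bit upward; read left to right
# on the wire, the six flag bits are U A P R S F.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header; options are skipped, not parsed."""
    (src_port, dst_port, seq, ack,
     len_and_flags, rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (len_and_flags >> 12) * 4   # head len is counted in 32-bit words
    flag_bits = len_and_flags & 0x3F         # low six bits hold the flags
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)],
        "rcv_window": rcv_window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "data": segment[header_len:],
    }

# Hypothetical segment: ports 1234 -> 23, Seq=42, ACK=79, PSH and ACK set, one data byte.
example = struct.pack("!HHIIHHHH", 1234, 23, 42, 79, (5 << 12) | 0x18, 4096, 0, 0) + b"C"
header = parse_tcp_header(example)
```

Because head len gives the header size in 32-bit words, the application data starts at byte offset head len times 4, which is how the sketch skips any options.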

TCP seq. #'s and ACKs

Seq. #'s:
- byte-stream "number" of the first byte in the segment's data

ACKs:
- seq # of the next byte expected from the other side
- cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: the TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):

    Host A                                             Host B
    user types 'C'
              --- Seq=42, ACK=79, data = 'C' --->
                                                       host ACKs receipt of 'C', echoes back 'C'
              <--- Seq=79, ACK=43, data = 'C' ---
    host ACKs receipt of echoed 'C'
              --- Seq=43, ACK=80 --->
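The numbers in the telnet scenario can be reproduced with a small sketch. It assumes in-order delivery and no loss; the Endpoint class and its field names are hypothetical and only track the two counters each side needs.

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    next_seq: int       # sequence number of the next byte this side will send
    next_expected: int  # next byte expected from the other side (sent in the ACK field)

    def send(self, data: bytes = b"") -> dict:
        segment = {"seq": self.next_seq, "ack": self.next_expected, "data": data}
        self.next_seq += len(data)   # sequence numbers count bytes, not segments
        return segment

    def receive(self, segment: dict) -> None:
        # Only in-order data advances the expected byte in this sketch.
        if segment["seq"] == self.next_expected:
            self.next_expected += len(segment["data"])

host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

first = host_a.send(b"C")    # user types 'C'
host_b.receive(first)
echo = host_b.send(b"C")     # B ACKs 'C' and echoes it back
host_a.receive(echo)
final = host_a.send()        # A ACKs the echoed 'C', no data
host_b.receive(final)
```

The third segment carries no data, so Host A's sequence number stays at 43 afterwards.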

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
- SampleRTT: measured time from segment transmission until ACK receipt
    - ignore retransmissions
- SampleRTT will vary; we want the estimated RTT to be "smoother"
    - average several recent measurements, not just the current SampleRTT

Q: how to set the TCP timeout value?
- longer than RTT, but RTT varies
- too short: premature timeout, unnecessary retransmissions
- too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) · EstimatedRTT + α · SampleRTT

- exponential weighted moving average
- influence of a past sample decreases exponentially fast
- typical value: α = 0.125
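To see why the influence of a past sample decays exponentially, the recursion can be unrolled. The notation here is an assumption made for the derivation: E_n is EstimatedRTT after the n-th sample, S_n is the n-th SampleRTT, and E_0 is the initial estimate.

```latex
\begin{align*}
E_n &= (1-\alpha)\,E_{n-1} + \alpha\,S_n \\
    &= \alpha\,S_n + \alpha(1-\alpha)\,S_{n-1} + (1-\alpha)^2\,E_{n-2} \\
    &= \alpha \sum_{k=0}^{n-1} (1-\alpha)^k\,S_{n-k} + (1-\alpha)^n\,E_0
\end{align*}
```

A sample that is k measurements old therefore carries weight α(1 - α)^k; with α = 0.125 that is 0.125 · 0.875^k, so each older sample counts 12.5% less than the one after it.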

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. SampleRTT and EstimatedRTT are plotted as RTT (milliseconds, axis from 100 to 350) against time (seconds, axis from 1 to 106).]

TCP Round Trip Time and Timeout

Setting the timeout
- EstimatedRTT plus a "safety margin"
    - large variation in EstimatedRTT -> larger safety margin
- first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) · DevRTT + β · |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set the timeout interval:

TimeoutInterval = EstimatedRTT + 4 · DevRTT
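A minimal sketch of the three formulas working together. Several choices are assumptions not stated on the slides: the estimate is seeded with the first sample, DevRTT starts at zero, and the deviation is measured against the estimate held before the new sample is folded in. The class and method names are hypothetical.

```python
class RttEstimator:
    """Tracks EstimatedRTT, DevRTT and TimeoutInterval from SampleRTT values."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample  # assumption: seed with the first sample
        self.dev_rtt = 0.0                 # assumption: no deviation observed yet

    def update(self, sample_rtt: float) -> float:
        # Assumption: deviation is taken against the estimate before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt

estimator = RttEstimator(first_sample=100.0)   # milliseconds
timeout = estimator.update(200.0)              # one unusually slow sample
```

After the single 200 ms sample the estimate moves only from 100 to 112.5 ms, while DevRTT jumps to 25 ms, so the safety margin (100 ms) dominates the new timeout of 212.5 ms.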


TCP reliable data transfer

- TCP creates an rdt service on top of IP's unreliable service
- pipelined segments
- cumulative ACKs
- TCP uses a single retransmission timer
- retransmissions are triggered by:
    - timeout events
    - duplicate ACKs
- initially consider a simplified TCP sender:
    - ignore duplicate ACKs
    - ignore flow control, congestion control

TCP sender events

data rcvd from app:
- create segment with seq #
- seq # is the byte-stream number of the first data byte in the segment
- start timer if not already running (think of the timer as being for the oldest unACKed segment)
- expiration interval: TimeOutInterval

timeout:
- retransmit the segment that caused the timeout
- restart timer

ACK rcvd:
- if it acknowledges previously unACKed segments:
    - update what is known to be ACKed
    - start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch (event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
    } /* end of loop forever */

Comment:
- SendBase-1: last cumulatively ACKed byte

Example:
- SendBase-1 = 71; y = 73, so the receiver wants 73+; y > SendBase, so new data is ACKed
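The simplified sender can be sketched as runnable Python. This is an illustration under stated assumptions, not a real TCP implementation: the retransmission timer is modeled as a boolean flag that some outside driver would turn into timeout events, pass_to_ip is a hypothetical callback standing in for the network layer, and the unacked dictionary is bookkeeping the pseudocode leaves implicit.

```python
class SimplifiedTcpSender:
    """Event handlers of the simplified sender; the timer is modeled as a flag."""

    def __init__(self, initial_seq_num: int, pass_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq number of first byte -> segment data
        self.timer_running = False   # assumption: a real timer lives outside this class
        self.pass_to_ip = pass_to_ip

    def on_app_data(self, data: bytes) -> None:
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.pass_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self) -> None:
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)      # not-yet-ACKed segment with smallest seq number
        self.pass_to_ip(seq, self.unacked[seq])
        self.timer_running = True    # restart the timer

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment whose bytes all lie below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)

sent = []
sender = SimplifiedTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
sender.on_app_data(b"x" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)    # Seq=100, 20 bytes
sender.on_timeout()              # retransmits Seq=92 only
sender.on_ack(120)               # cumulative ACK covers both segments
```

Only the oldest unACKed segment is retransmitted on a timeout, and a single ACK=120 clears both outstanding segments and stops the timer.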

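TCP retransmission scenarios

Lost ACK scenario (time runs downward):
1. Host A sends Seq=92, 8 bytes data
2. Host B replies ACK=100, but the ACK is lost (X)
3. Host A's timer expires (timeout); A retransmits Seq=92, 8 bytes data
4. Host B replies ACK=100 again; at Host A, SendBase = 100

Premature timeout scenario (time runs downward):
1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
2. Host B replies ACK=100 and ACK=120
3. The Seq=92 timer expires before the ACKs reach Host A; A retransmits Seq=92, 8 bytes data
4. The ACKs arrive at Host A: SendBase = 100, then SendBase = 120
5. Host B answers the retransmission with ACK=120; SendBase stays at 120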

TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
2. Host B replies ACK=100, which is lost (X), and then ACK=120
3. ACK=120 reaches Host A before the timeout; SendBase = 120, so nothing is retransmitted

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
   -> Delayed ACK: wait up to 500 ms for the next segment; if no next segment arrives, send ACK
2. Arrival of in-order segment with expected seq #; one other segment has an ACK pending
   -> Immediately send a single cumulative ACK, ACKing both in-order segments
3. Arrival of out-of-order segment with higher-than-expected seq #; gap detected
   -> Immediately send a duplicate ACK, indicating the seq # of the next expected byte
4. Arrival of segment that partially or completely fills the gap
   -> Immediately send ACK, provided that the segment starts at the lower end of the gap
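The four rows of the table can be read as a decision function. The sketch below is only that reading: the function name, parameter names, and returned strings are hypothetical, and a segment that starts below the expected sequence number is not covered by the table, so the sketch says so rather than guessing.

```python
def receiver_action(seg_seq: int, expected_seq: int,
                    ack_pending: bool, gap_exists: bool) -> str:
    """Map an arriving segment to the receiver action listed in the table."""
    if seg_seq > expected_seq:
        # Row 3: out-of-order arrival, a gap is detected (or remains open).
        return "send duplicate ACK now"
    if seg_seq == expected_seq:
        if gap_exists:
            # Row 4: segment starts at the lower end of the gap.
            return "send ACK now"
        if ack_pending:
            # Row 2: one earlier in-order segment is still waiting for its ACK.
            return "send cumulative ACK now"
        # Row 1: everything before this segment is already ACKed.
        return "delay ACK up to 500 ms"
    return "not covered by the table"

row1 = receiver_action(100, 100, ack_pending=False, gap_exists=False)
row2 = receiver_action(100, 100, ack_pending=True, gap_exists=False)
row3 = receiver_action(120, 100, ack_pending=False, gap_exists=False)
row4 = receiver_action(100, 100, ack_pending=False, gap_exists=True)
```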

More on Sender Policies

Doubling the timeout interval
- used by most TCP implementations
- if a timeout occurs, then after the retransmission the timeout interval is doubled
- intervals grow exponentially with each consecutive timeout
- when the timer is restarted because of (i) new data from above or (ii) an ACK received, the timeout interval is reset as described previously, using EstimatedRTT and DevRTT
- a limited form of congestion control
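A short sketch of the doubling policy, assuming EstimatedRTT and DevRTT are supplied from elsewhere and held fixed for the example. The class and method names are hypothetical.

```python
class RetransmissionTimer:
    """Timeout interval with doubling on timeout and reset on new data or ACK."""

    def __init__(self, estimated_rtt: float, dev_rtt: float):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.computed_interval()

    def computed_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self) -> None:
        self.interval *= 2           # each consecutive timeout doubles the wait

    def on_new_data_or_ack(self) -> None:
        self.interval = self.computed_interval()   # back to the RTT-based value

timer = RetransmissionTimer(estimated_rtt=100.0, dev_rtt=25.0)   # milliseconds
history = [timer.interval]
for _ in range(3):
    timer.on_timeout()
    history.append(timer.interval)
timer.on_new_data_or_ack()
```

Three consecutive timeouts stretch the interval from 200 ms to 1600 ms, which is why the policy acts as a limited form of congestion control: a sender that keeps timing out retransmits less and less often.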

Fast Retransmit

- the time-out period is often relatively long:
    - long delay before resending a lost packet
- detect lost segments via duplicate ACKs
    - sender often sends many segments back-to-back
    - if a segment is lost, there will likely be many duplicate ACKs
- if the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    - fast retransmit: resend the segment before the timer expires

Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {    /* a duplicate ACK for an already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y    /* fast retransmit */
            }
        }
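The ACK handler above can be sketched in Python. Timer handling is left out, the resend callback is a hypothetical stand-in for retransmitting the segment that starts at y, and clearing the duplicate counts when new data is ACKed is an assumption the pseudocode does not spell out.

```python
from collections import Counter

class FastRetransmitSender:
    """ACK handler that triggers a retransmission on the third duplicate ACK."""

    def __init__(self, send_base: int, resend):
        self.send_base = send_base
        self.dup_acks = Counter()    # ACK value y -> duplicate ACKs seen for y
        self.resend = resend

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            self.send_base = y
            self.dup_acks.clear()    # assumption: new data ACKed, start counting afresh
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3:
                self.resend(y)       # fast retransmit, before the timer expires

resent = []
sender = FastRetransmitSender(send_base=100, resend=resent.append)
for _ in range(4):
    sender.on_ack(100)               # four duplicate ACKs asking for byte 100
sender.on_ack(120)                   # the retransmission got through
```

The equality test fires exactly once, on the third duplicate, so the fourth duplicate ACK in the example does not cause a second retransmission.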

TCP: GBN or Selective Repeat?

- basic TCP looks a lot like GBN
- many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
    - this looks a lot like Selective Repeat
- TCP is a hybrid


TCP Flow Control

- sender should not overwhelm the receiver's capacity to receive data
- if necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
- different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast

- receive side of a TCP connection has a receive buffer
- app process may be slow at reading from the buffer
- speed-matching service: matching the send rate to the receiving app's drain rate

                                                                                              3 Transport Layer 81Comp 361 Spring 2005

TCP segment structure

Header layout (each row is 32 bits wide):

source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | Receive window
checksum | Urg data pointer
Options (variable length)
application data (variable length)

Notes on the fields:

sequence and acknowledgement numbers: counting by bytes of data (not segments)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection establishment (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
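The header layout above can be decoded with a few shifts and masks. The following is only a sketch: the class name TcpHeaderSketch is hypothetical, it reads just the fixed part of the header up to the receive window, and it assumes the array holds at least 16 bytes. Options, checksum verification and the urgent pointer are left out.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (not from the slides): decodes the fixed part of a TCP header. */
final class TcpHeaderSketch {
    final int sourcePort, destPort, headerLenBytes, receiveWindow;
    final long sequenceNumber, ackNumber;
    final boolean urg, ack, psh, rst, syn, fin;

    /** Assumes segment.length >= 16; shorter input throws BufferUnderflowException. */
    TcpHeaderSketch(byte[] segment) {
        ByteBuffer buf = ByteBuffer.wrap(segment);     // big-endian (network order) by default
        sourcePort = buf.getShort() & 0xFFFF;          // 16-bit source port #
        destPort = buf.getShort() & 0xFFFF;            // 16-bit dest port #
        sequenceNumber = buf.getInt() & 0xFFFFFFFFL;   // 32 bits, counts bytes of data
        ackNumber = buf.getInt() & 0xFFFFFFFFL;        // 32 bits, valid only if ACK is set
        int lenAndFlags = buf.getShort() & 0xFFFF;     // head len, unused bits, flag bits
        headerLenBytes = (lenAndFlags >>> 12) * 4;     // head len is given in 32-bit words
        urg = (lenAndFlags & 0x20) != 0;
        ack = (lenAndFlags & 0x10) != 0;
        psh = (lenAndFlags & 0x08) != 0;
        rst = (lenAndFlags & 0x04) != 0;
        syn = (lenAndFlags & 0x02) != 0;
        fin = (lenAndFlags & 0x01) != 0;
        receiveWindow = buf.getShort() & 0xFFFF;       // # bytes rcvr willing to accept
    }
}
```

Masking with 0xFFFF and 0xFFFFFFFFL is needed because Java has no unsigned short or int types.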


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
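As a rough illustration of the bookkeeping above, here is a minimal sketch. The class and method names are hypothetical, while the variable names follow the slide. It only does the arithmetic and does not model a real socket buffer.

```java
/** Hypothetical model of the flow-control arithmetic; names follow the slide. */
final class FlowControlSketch {
    /** Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]. */
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Bytes the sender may still transmit while keeping unACKed data within RcvWindow. */
    static long senderMaySend(long lastByteSent, long lastByteAcked, long rcvWindow) {
        long unacked = lastByteSent - lastByteAcked;   // data in flight
        return Math.max(0, rcvWindow - unacked);
    }
}
```

For example, with a 4096-byte RcvBuffer, LastByteRcvd = 3000 and LastByteRead = 1000, the receiver holds 2000 unread bytes and advertises RcvWindow = 2096.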


                                                                                              Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0 since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender


                                                                                              Note on UDP

                                                                                              UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full then packets are lost


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

Segments exchanged:

client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
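The three segments of the handshake can be written out as plain data to check the sequence and ack numbers. This is a sketch only: the class, the Segment type and the use of -1 for "no ack field" are assumptions made for illustration, not part of TCP or of the Java socket API.

```java
/** Hypothetical record of the three handshake segments; field names follow the slides. */
final class HandshakeSketch {
    /** One control segment: SYN flag, seq number, and ack number (-1 means ACK not set). */
    static final class Segment {
        final boolean syn;
        final long seq;
        final long ack;

        Segment(boolean syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }
    }

    static final long MOD = 1L << 32;   // sequence numbers are 32 bits and wrap around

    static Segment[] threeWayHandshake(long clientIsn, long serverIsn) {
        // Step 1: client SYN carries client_isn, no ack yet.
        Segment syn = new Segment(true, clientIsn, -1);
        // Step 2: server SYNACK carries server_isn and acks client_isn+1.
        Segment synAck = new Segment(true, serverIsn, (clientIsn + 1) % MOD);
        // Step 3: client ACK has SYN=0 and acks server_isn+1.
        Segment ack = new Segment(false, (clientIsn + 1) % MOD, (serverIsn + 1) % MOD);
        return new Segment[] { syn, synAck, ack };
    }
}
```

With client_isn = 1000 and server_isn = 5000, the SYNACK carries ack = 1001 and the final ACK carries seq = 1001, ack = 5001.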


TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client sends FIN; server replies with ACK, then sends its own FIN; client replies with ACK and enters timed wait before it is closed]


TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: client/server timeline. Client: FIN sent, closing, timed wait, closed. Server: ACK and FIN sent, closing, closed]


TCP Connection Management (cont)

[Figure: Example TCP server lifecycle]

[Figure: Example TCP client lifecycle]


                                                                                              A few special cases

                                                                                              Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                              It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

Notation: $\lambda_{in}$ is the rate of original data sent, $\lambda'_{in}$ is the offered load (original data plus retransmissions), and $\lambda_{out}$ is the throughput at the receiver

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots labelled (a), (b) and (c), one per case above]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
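The effect of the ER field and the EFCI rule can be sketched as two small functions. The class name, method names and the idea of passing per-switch supportable rates as an array are assumptions made for illustration. Real ATM switches do not expose such an interface.

```java
/** Hypothetical model of an RM cell crossing a path of switches. */
final class AbrSketch {
    /**
     * Each switch may lower the ER field to the rate it can support, so the
     * value that comes back to the sender is the minimum over the path.
     */
    static double explicitRateAfterPath(double initialEr, double[] supportableRates) {
        double er = initialEr;
        for (double rate : supportableRates) {
            er = Math.min(er, rate);   // a switch only ever lowers ER in this sketch
        }
        return er;
    }

    /** Destination sets CI=1 if the data cell preceding the RM cell had EFCI=1. */
    static boolean ciBitOnReturn(boolean ciSetBySwitch, boolean precedingDataCellEfci) {
        return ciSetBySwitch || precedingDataCellEfci;
    }
}
```

For a path whose switches can support 80, 40 and 60 units of rate, an RM cell that starts with ER = 100 returns with ER = 40.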


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

w segments, each with MSS bytes, sent in one RTT:

throughput = $\frac{w \cdot MSS}{RTT}$ bytes/sec


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
  also known as congestion avoidance

[Figure: Long-lived TCP connection. Congestion window (ticks at 8, 16 and 24 Kbytes) plotted against time]


TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
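The 20 kbps figure in the example follows from sending one MSS per RTT. A short worked version of that arithmetic, using only the numbers given on the slide:

```latex
\[
\text{initial rate} = \frac{MSS}{RTT}
= \frac{500 \text{ bytes} \times 8 \text{ bits/byte}}{0.2 \text{ s}}
= \frac{4000 \text{ bits}}{0.2 \text{ s}}
= 20\,000 \text{ bits/s}
= 20 \text{ kbps}
\]
```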


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. Host A sends one segment in the first RTT, two segments in the second, four segments in the third]
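To see why incrementing CongWin on every ACK doubles it every RTT, here is a minimal round-by-round sketch. It is an idealized model under stated assumptions: CongWin is counted in whole MSS, every segment sent in an RTT is ACKed in that same RTT, and there are no losses. The class and method names are hypothetical.

```java
/** Hypothetical round-by-round model of slow start, with CongWin counted in MSS. */
final class SlowStartSketch {
    /** CongWin (in MSS) after the given number of loss-free RTTs, starting from 1 MSS. */
    static long congWinAfter(int rtts) {
        long congWin = 1;                     // connection begins with CongWin = 1 MSS
        for (int r = 0; r < rtts; r++) {
            long acks = congWin;              // assumed: one ACK per segment sent this RTT
            for (long a = 0; a < acks; a++) {
                congWin += 1;                 // +1 MSS for every ACK received
            }
        }
        return congWin;
    }
}
```

Each RTT delivers as many ACKs as there were segments in the window, so the window grows 1, 2, 4, 8, ... as in the Host A / Host B figure.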


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
    Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)

[Figure: The Big Picture]

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS. If (CongWin > Threshold), set state to "Congestion Avoidance".
Commentary: Results in a doubling of CongWin every RTT.

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT.

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2, CongWin = Threshold. Set state to "Congestion Avoidance".
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2, CongWin = 1 MSS. Set state to "Slow Start".
Commentary: Enter slow start.

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked.
Commentary: CongWin and Threshold not changed.
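The event/action table above can be sketched as a small state machine. This is only an illustrative model, not a real TCP implementation: the class name `RenoSender`, the choice to measure windows in units of one MSS, the initial Threshold of 64, and the `max(..., MSS)` floor (one reading of "CongWin will not drop below 1 MSS") are all assumptions made for the sketch.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: windows are measured in units of one MSS


@dataclass
class RenoSender:
    """Toy model of the sender table above (hypothetical helper class)."""
    cong_win: float = 1.0    # CongWin
    threshold: float = 64.0  # Threshold; the initial value is an arbitrary assumption
    state: str = "SS"        # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0        # duplicate ACK count for the segment being acked

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS  # one MSS per ACK: doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # One MSS per window's worth of ACKs: about +1 MSS every RTT.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self) -> None:
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, MSS)
            self.cong_win = self.threshold  # multiplicative decrease (Reno)
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self) -> None:
        """Timeout: halve Threshold, restart from 1 MSS in slow start."""
        self.threshold = max(self.cong_win / 2, MSS)
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

A TCP Tahoe variant would, per the earlier slide, treat the third duplicate ACK the same way as `on_timeout`.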


                                                                                              TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT.
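One way to see where the factor 0.75 comes from, assuming the window climbs linearly from W/2 back to W between loss events (an idealization of additive increase), is to average the two endpoint throughputs:

```latex
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\,\frac{W}{RTT}
```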


                                                                                              TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{Throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This requires L = 2·10^-10. Wow!
New versions of TCP for high-speed networks are needed.
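The numbers in this example can be checked with a few lines of arithmetic. The sketch below assumes 1 byte = 8 bits and simply inverts the throughput formula for L; the function names are made up for illustration.

```python
def window_for_throughput(throughput_bps, rtt_s, mss_bits):
    """Number of segments that must be in flight to sustain throughput_bps."""
    return throughput_bps * rtt_s / mss_bits


def loss_rate_for_throughput(throughput_bps, rtt_s, mss_bits):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


mss_bits = 1500 * 8  # 1500-byte segments
w = window_for_throughput(10e9, 0.100, mss_bits)
loss = loss_rate_for_throughput(10e9, 0.100, mss_bits)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

The computed loss rate is slightly above the rounded 2·10^-10 quoted on the slide.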


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running from 0 to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
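The convergence argument can be illustrated with a small simulation. It is a sketch under strong assumptions that the slides do not state: both flows have the same RTT, act in lockstep, and both see a loss whenever their combined rate exceeds the capacity. The function name and the starting rates are arbitrary.

```python
def aimd_two_flows(x1, x2, capacity, rounds, increase=1.0):
    """Simulate two synchronized AIMD flows sharing one bottleneck.

    In each round, if the combined rate exceeds `capacity` both flows halve
    their rate; otherwise both add `increase`.
    """
    history = [(x1, x2)]
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2                # multiplicative decrease
        else:
            x1, x2 = x1 + increase, x2 + increase  # additive increase
        history.append((x1, x2))
    return history


history = aimd_two_flows(x1=70.0, x2=10.0, capacity=100.0, rounds=1000)
final_x1, final_x2 = history[-1]
print(f"gap at start: {70.0 - 10.0:.1f}, gap at end: {abs(final_x1 - final_x2):.6f}")
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in half, so the rates drift toward the equal-share line.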


Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
- Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
- Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts.
- Web browsers do this.
- Example: link of rate R supporting 9 connections:
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11R/20)
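The parallel-connection example is simple arithmetic once one assumes that every TCP connection on the link gets an equal share of R. A minimal sketch (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    `new_connections`, assuming all connections share the link equally."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```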


TCP Latency Modeling

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- No retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent.
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent.
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
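The two fixed-window cases can be folded into one small function. This is a sketch: it assumes K = ceil(O / (W·S)), i.e. the number of W-segment windows needed to cover the object, which the slides use but do not spell out here, and the sample numbers are invented for illustration.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds for an object of O bits with a fixed window of W segments.

    S is the segment size in bits and R the link rate in bits/s.
    """
    base = 2 * RTT + O / R
    if W * S / R >= RTT + S / R:
        return base                   # case 1: the sender never stalls
    K = math.ceil(O / (W * S))        # assumed: windows that cover the object
    stall = S / R + RTT - W * S / R   # idle time after each window
    return base + (K - 1) * stall     # case 2


# Hypothetical numbers: 15 segments of 10,000 bits over a 1 Mbps link, RTT = 100 ms.
print(fixed_window_latency(O=150_000, S=10_000, R=1_000_000, RTT=0.1, W=4))
print(fixed_window_latency(O=150_000, S=10_000, R=1_000_000, RTT=0.1, W=20))
```

With W = 4 the sender stalls after each of the first three windows; with W = 20 the first ACK returns before the window is exhausted, so only the 2RTT + O/R term remains.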


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram, time at client versus time at server: initiate TCP connection (one RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.


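TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timing diagram of slow start at client and server: initiate TCP connection (one RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]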


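TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, the number of idles for an infinite-size object, is similar.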
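The slow-start latency model can be checked numerically by summing the idle times window by window and comparing the result with the closed form. The sketch below is illustrative only: the function names are invented, and the link rate, segment size and RTT are assumptions chosen so that the slide's example (O/S = 15, K = 4, Q = 2, P = 2) is reproduced. Rather than computing Q explicitly, it counts the windows among the first K-1 whose idle time is positive, which gives P directly.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, P) for an object of O bits under slow start.

    The k-th window carries 2**(k-1) segments of S bits over a link of
    rate R bits/s; no threshold and no loss, as in the model above.
    """
    segments = math.ceil(O / S)
    K = 1
    while 2 ** K - 1 < segments:  # smallest K with 2^K - 1 >= O/S
        K += 1
    # Idle time after window k, for the K-1 windows that are followed by another.
    stalls = [S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, K)]
    idle = [t for t in stalls if t > 0]
    P = len(idle)                 # number of times the server actually idles
    latency = 2 * RTT + O / R + sum(idle)
    return latency, K, P


def closed_form(O, S, R, RTT, P):
    """Closed-form latency: 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Assumed values: S/R = 10 ms and RTT = 20 ms, so the server idles twice.
latency, K, P = slow_start_latency(O=150_000, S=10_000, R=1_000_000, RTT=0.02)
print(K, P, latency, closed_form(150_000, 10_000, 1_000_000, 0.02, P))
```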


HTTP Modeling

Assume Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X is an integer
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
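The three response-time formulas can be compared side by side. The sketch below deliberately leaves out the "sum of idle times" term, which depends on the slow-start behaviour of each connection, so its results are lower bounds rather than the full model. The function name is invented, and O = 40,000 bits assumes that 5 Kbytes means 5000 bytes.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds for the three HTTP models, idle times omitted.

    O: object size in bits, R: link rate in bits/s, M: number of images,
    X: number of parallel connections (M / X assumed to be an integer).
    """
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }


times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```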


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection & response time is dominated by transmission time. Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.


Chapter 3: Summary

Principles behind transport layer services:
- multiplexing, demultiplexing
- reliable data transfer
- flow control
- congestion control

Instantiation and implementation in the Internet:
- UDP
- TCP

Next:
- leaving the network "edge" (application, transport layers)
- into the network "core"

• Chapter 3 Transport Layer last revised 160305
• Chapter 3 outline
• Transport services and protocols
• Transport vs network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
                                                                                              • Chapter 3 outline
                                                                                              • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                              • More TCP Details
                                                                                              • Even More TCP Details
                                                                                              • TCP segment structure
                                                                                              • TCP seq rsquos and ACKs
                                                                                              • TCP Round Trip Time and Timeout
                                                                                              • TCP Round Trip Time and Timeout
                                                                                              • Example RTT estimation
                                                                                              • TCP Round Trip Time and Timeout
                                                                                              • Chapter 3 outline
                                                                                              • TCP reliable data transfer
                                                                                              • TCP sender events
                                                                                              • TCP sender(simplified)
                                                                                              • TCP retransmission scenarios
                                                                                              • TCP retransmission scenarios (more)
                                                                                              • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                              • More on Sender Policies
                                                                                              • Fast Retransmit
                                                                                              • Fast retransmit algorithm
                                                                                              • TCP GBN or Selective Repeat
                                                                                              • Chapter 3 outline
                                                                                              • TCP Flow Control
                                                                                              • TCP Flow Control
                                                                                              • TCP segment structure
                                                                                              • TCP Flow control how it works
                                                                                              • Technical Issue
                                                                                              • Chapter 3 outline
                                                                                              • TCP Connection Management
                                                                                              • TCP Connection Management (cont)
                                                                                              • TCP Connection Management (cont)
                                                                                              • TCP Connection Management (cont)
                                                                                              • TCP Connection Management (cont)
                                                                                              • A few special cases
                                                                                              • Chapter 3 outline
                                                                                              • Principles of Congestion Control
                                                                                              • Causescosts of congestion scenario 1
                                                                                              • Causescosts of congestion scenario 2
                                                                                              • Causescosts of congestion scenario 3
                                                                                              • Causescosts of congestion scenario 3
                                                                                              • Approaches towards congestion control
                                                                                              • Case study ATM ABR congestion control
                                                                                              • Case study ATM ABR congestion control
                                                                                              • Chapter 3 outline
                                                                                              • TCP Congestion Control
                                                                                              • TCP AIMD
                                                                                              • TCP Slow Start
                                                                                              • TCP Slow Start (more)
                                                                                              • Summary TCP Congestion Control
                                                                                              • The Big Picture
                                                                                              • TCP sender congestion control
                                                                                              • TCP throughput
                                                                                              • TCP Futures
                                                                                              • TCP Fairness
                                                                                              • Why is TCP fair
                                                                                              • Fairness (more)
                                                                                              • TCP Latency Modeling
                                                                                              • Fixed Congestion Window (W)
                                                                                              • Fixed congestion window (1)
                                                                                              • Fixed congestion window (2)
                                                                                              • TCP Latency Modeling Slow Start (1)
                                                                                              • TCP Latency Modeling Slow Start (2)
                                                                                              • TCP Latency Modeling (3)
                                                                                              • TCP Latency Modeling (4)
                                                                                              • HTTP Modeling
                                                                                              • Chapter 3 Summary

GBN: receiver extended FSM

State: Wait

Initial transition (Λ):
  expectedseqnum = 1
  sndpkt = make_pkt(0, ACK, chksum)

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum)
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

Event: default
  udt_send(sndpkt)

- If expected packet received: send ACK and deliver packet upstairs
- If out-of-order packet received: discard (don't buffer) -> no receiver buffering!
  - Re-ACK pkt with highest in-order seq #
  - may generate duplicate ACKs

More on receiver

- The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #
- Receiver only sends ACKs (no NAKs)
- Can generate duplicate ACKs
- Need only remember expectedseqnum
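The receiver behaviour above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the class name `GBNReceiver`, the `corrupt` flag standing in for a checksum test, and the packet names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class GBNReceiver:
    """Go-Back-N receiver: no buffering, cumulative ACKs only (sketch)."""
    expectedseqnum: int = 1
    last_ack: int = 0  # mirrors the initial sndpkt = make_pkt(0, ACK, chksum)
    delivered: list = field(default_factory=list)

    def rdt_rcv(self, seqnum: int, data: str, corrupt: bool = False) -> int:
        """Process one arriving packet and return the ACK number sent back."""
        if not corrupt and seqnum == self.expectedseqnum:
            self.delivered.append(data)          # deliver_data(data)
            self.last_ack = self.expectedseqnum  # ACK the packet just accepted
            self.expectedseqnum += 1
        # Default case: a corrupt or out-of-order packet is discarded and the
        # most recent ACK is resent, which shows up as a duplicate ACK.
        return self.last_ack


receiver = GBNReceiver()
# Hypothetical arrival order: packet 2 is delayed, so 3 and 4 arrive early.
acks = [receiver.rdt_rcv(n, f"pkt{n}") for n in (1, 3, 4, 2, 3)]
print(acks)                # [1, 1, 1, 2, 3]
print(receiver.delivered)  # ['pkt1', 'pkt2', 'pkt3']
```

Note that the only state carried between packets is `expectedseqnum` and the last ACK, which is why the receiver needs no buffer.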

GBN in action

- GBN is easy to code but might have performance problems.
- In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.
- The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.
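To get a feel for how much data "huge amounts" can mean, the following sketch computes how many packets fit in the pipeline for a given bandwidth-delay product. The link rate, round-trip time, and packet size are illustrative assumptions, not figures taken from this slide.

```python
def packets_in_flight(rate_bps: int, rtt_ms: int, packet_bits: int) -> int:
    """Number of whole packets that fit in one bandwidth-delay product."""
    bits_in_flight = rate_bps * rtt_ms // 1000  # rate * RTT, in bits
    return bits_in_flight // packet_bits


# Assumed example: 1 Gbps link, 30 ms RTT, 1000-byte (8000-bit) packets.
n = packets_in_flight(10**9, 30, 8000)
print(n)  # 3750
```

With a window this large, a single lost packet at the front of the window can make a GBN sender resend thousands of packets that the receiver had already seen but discarded.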


Selective Repeat

- receiver individually acknowledges all correctly received pkts
  - buffers pkts, as needed, for eventual in-order delivery to upper layer
- sender only resends pkts for which ACK not received
  - sender timer for each unACKed pkt
  - compare to GBN, which only had a timer for the base packet
- sender window
  - N consecutive seq #'s
  - again limits seq #s of sent, unACKed pkts
  - Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender

- data from above:
  - if next available seq # is in window, send pkt
- timeout(n):
  - resend pkt n, restart timer
- ACK(n) in [sendbase, sendbase+N-1]:
  - mark pkt n as received
  - if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

- pkt n in [rcvbase, rcvbase+N-1]:
  - send ACK(n)
  - out-of-order: buffer
  - in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
- pkt n in [rcvbase-N, rcvbase-1]:
  - ACK(n) (note: this is a re-ACK)
- otherwise:
  - ignore
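The receiver-side rules can be sketched as follows. This is a simplified illustration under stated assumptions: sequence numbers are unbounded integers (no wraparound), and the class name `SRReceiver` and method `on_packet` are hypothetical names chosen for the example.

```python
class SRReceiver:
    """Selective Repeat receiver with an unbounded sequence space (sketch)."""

    def __init__(self, window_size: int):
        self.N = window_size
        self.rcvbase = 0
        self.buffer = {}     # seq # -> data for packets not yet delivered
        self.delivered = []  # data handed to the upper layer, in order

    def on_packet(self, n: int, data: str):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcvbase, then slide the
            # window to the next not-yet-received packet.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n         # already delivered: re-ACK so the sender can advance
        return None          # otherwise: ignore


rcv = SRReceiver(window_size=4)
arrivals = [0, 2, 3, 1, 1, 9]  # 1 is delayed, then duplicated; 9 is out of range
sr_acks = [rcv.on_packet(n, f"p{n}") for n in arrivals]
print(sr_acks)        # [0, 2, 3, 1, 1, None]
print(rcv.delivered)  # ['p0', 'p1', 'p2', 'p3']
```

Unlike the GBN receiver, packets 2 and 3 are kept in the buffer while packet 1 is missing, so only packet 1 has to be retransmitted.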

Selective repeat in action

Selective repeat: dilemma

Example:
- seq #'s: 0, 1, 2, 3
- window size = 3
- receiver sees no difference in the two scenarios!
- incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
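The dilemma can be checked mechanically. The sketch below considers the worst case in which the receiver has accepted a full window and advanced, while every ACK was lost and the sender still holds its old window. The helper names are hypothetical. The check reproduces the standard textbook answer: the window size must be at most half the size of the sequence number space.

```python
def window(base: int, window_size: int, seq_space: int) -> set:
    """Sequence numbers covered by a window starting at base (with wraparound)."""
    return {(base + i) % seq_space for i in range(window_size)}


def ambiguous(window_size: int, seq_space: int) -> bool:
    """True if a retransmitted old packet can fall inside the receiver's new window."""
    sender_window = window(0, window_size, seq_space)              # may be retransmitted
    receiver_window = window(window_size, window_size, seq_space)  # after a full advance
    return bool(sender_window & receiver_window)


print(ambiguous(3, 4))  # True: the slide's example, seq #'s 0..3 with window size 3
print(ambiguous(2, 4))  # False: window size is half the sequence space
```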


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

- point-to-point:
  - one sender, one receiver
- reliable, in-order byte stream:
  - no "message boundaries"
- pipelined:
  - TCP congestion and flow control set window size
- send & receive buffers
- full duplex data:
  - bi-directional data flow in same connection
  - MSS: maximum segment size
- connection-oriented:
  - handshaking (exchange of control msgs) init's sender, receiver state before data exchange
- flow controlled:
  - sender will not overwhelm receiver

[Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads data through its socket door.]

More TCP Details

Maximum Segment Size (MSS)
- Depends upon implementation (can often be set)
- The max amount of application-layer data in a segment
- Application data + TCP header = TCP segment

Three-way handshake
- Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
- Server responds with a second special TCP segment (again, no payload).
- Client responds with a third special segment. This can contain payload.

Even More TCP Details

- A TCP connection between client and server creates, in both client and server:
  (i) buffers,
  (ii) variables, and
  (iii) a socket connection to the process
- TCP only exists in the two end machines
  - No buffers or variables are allocated to the connection in any of the network elements between the host and the server

TCP segment structure

Segment layout (each row is 32 bits wide):

  source port #                        | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F    | Receive window
  checksum                             | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
- sequence number, acknowledgement number: counting by bytes of data (not segments!)
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
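As a rough illustration of the layout, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. The function name `parse_tcp_header` and the example values are assumptions made for this sketch. The checksum is left at 0 as a placeholder rather than computed, and options are skipped rather than decoded.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 up to bit 5


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header; any options are skipped, not decoded."""
    (src, dst, seq, ack, off_flags,
     rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4  # head len is counted in 32-bit words
    flag_bits = off_flags & 0x3F        # low six bits: U A P R S F
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "rcv_window": rcv_window,
        "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],
    }


# Hypothetical segment: port 12345 -> 80, Seq=42, ACK=79, PSH+ACK set, 1 data byte.
raw = struct.pack("!HHIIHHHH", 12345, 80, 42, 79, (5 << 12) | 0x18, 4096, 0, 0) + b"C"
hdr = parse_tcp_header(raw)
print(hdr["seq"], hdr["ack"], hdr["flags"], hdr["data"])
```

The `!` prefix selects network (big-endian) byte order, which is how multi-byte header fields are transmitted.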

TCP seq. #'s and ACKs

Seq. #'s:
- byte stream "number" of first byte in segment's data

ACKs:
- seq # of next byte expected from other side
- cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):

  Host A -> Host B   Seq=42, ACK=79, data = 'C'   user types 'C'
  Host B -> Host A   Seq=79, ACK=43, data = 'C'   host ACKs receipt of 'C', echoes back 'C'
  Host A -> Host B   Seq=43, ACK=80               host ACKs receipt of echoed 'C'
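The numbers in the telnet scenario follow from one rule: the ACK field carries the sequence number of the next byte expected. A small sketch, with the helper name `ack_for` chosen for illustration:

```python
def ack_for(seq: int, data: bytes) -> int:
    """Cumulative ACK: sequence number of the next byte expected."""
    return seq + len(data)


# Host A sends 'C' starting at Seq=42; Host B's byte stream is at Seq=79.
a_seq, b_seq = 42, 79
b_ack = ack_for(a_seq, b"C")  # B acknowledges A's byte
a_ack = ack_for(b_seq, b"C")  # A acknowledges the echoed byte
print(b_ack, a_ack)  # 43 80
```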

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
- longer than RTT
  - but RTT varies
- too short: premature timeout
  - unnecessary retransmissions
- too long: slow reaction to segment loss

Q: how to estimate RTT?
- SampleRTT: measured time from segment transmission until ACK receipt
  - ignore retransmissions
- SampleRTT will vary, want estimated RTT "smoother"
  - average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

- Exponential weighted moving average
- influence of past sample decreases exponentially fast
- typical value: α = 0.125
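A short sketch of the moving average, using the typical weight α = 0.125. The function name and the sample values (in milliseconds) are made up for illustration.

```python
def update_estimated_rtt(estimated_rtt: float, sample_rtt: float,
                         alpha: float = 0.125) -> float:
    """One EWMA step: blend the old estimate with the newest sample."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


est = 100.0  # assumed starting estimate, in ms
history = []
for sample in (100.0, 180.0, 100.0):  # one spike in the middle
    est = update_estimated_rtt(est, sample)
    history.append(est)
print(history)  # [100.0, 110.0, 108.75]
```

The 180 ms spike moves the estimate by only 10 ms, and its influence shrinks by a factor of (1 - α) with every later sample.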

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), ranging from 100 to 350; curves: SampleRTT and Estimated RTT.]

TCP Round Trip Time and Timeout

Setting the timeout
- EstimatedRTT plus "safety margin"
  - large variation in EstimatedRTT -> larger safety margin
- first estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

  (typically, β = 0.25)

Then set timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
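The two estimators and the timeout can be combined into one update step. This is a sketch under one explicit assumption: the slide does not say which estimator is updated first, so the code follows the order used in RFC 6298, where the deviation is updated with the old EstimatedRTT before EstimatedRTT itself is updated. The function name and the numbers are illustrative.

```python
def update_timeout(estimated_rtt: float, dev_rtt: float, sample_rtt: float,
                   alpha: float = 0.125, beta: float = 0.25):
    """Return the new (EstimatedRTT, DevRTT, TimeoutInterval) after one sample."""
    # Assumption: DevRTT is updated first, using the previous EstimatedRTT.
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout_interval


# Assumed state: EstimatedRTT = 100 ms, DevRTT = 10 ms; new SampleRTT = 140 ms.
est_rtt, dev, timeout = update_timeout(100.0, 10.0, 140.0)
print(est_rtt, dev, timeout)  # 105.0 17.5 175.0
```

A single late sample raises the timeout from 140 ms (100 + 4 * 10) to 175 ms, mostly through the deviation term, which is the "larger safety margin" at work.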


                                                                                                TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
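The simplified sender can also be written as a small event-driven class. This is a sketch under stated assumptions, not a real TCP implementation. The class name `SimplifiedTcpSender`, the `send_to_ip` callback and the boolean `timer_running` flag (standing in for the single retransmission timer) are hypothetical names introduced for illustration. As on the slide, duplicate ACKs, flow control and congestion control are ignored.

```python
class SimplifiedTcpSender:
    def __init__(self, initial_seq_num, send_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}               # seq number -> data of segments not yet ACKed
        self.timer_running = False      # stand-in for the single retransmission timer
        self.send_to_ip = send_to_ip    # hypothetical callback: send_to_ip(seq, data)

    def on_data(self, data):
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True   # start timer
        self.send_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout."""
        if not self.unacked:            # nothing outstanding: leave the timer stopped
            self.timer_running = False
            return
        seq = min(self.unacked)         # not-yet-ACKed segment with smallest seq number
        self.send_to_ip(seq, self.unacked[seq])
        self.timer_running = True       # restart timer

    def on_ack(self, y):
        """Event: ACK received, with ACK field value of y."""
        if y > self.send_base:
            self.send_base = y
            # keep only segments that still contain bytes at or beyond y
            self.unacked = {s: d for s, d in self.unacked.items() if s + len(d) > y}
            self.timer_running = bool(self.unacked)


# Lost-ACK example: Seq=92 with 8 bytes is sent, times out, and is sent again.
sent = []
sender = SimplifiedTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
sender.on_data(b"x" * 8)
sender.on_timeout()
sender.on_ack(100)
print(sent, sender.send_base)
```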


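TCP retransmission scenarios

[Figure: two timelines between Host A and Host B]

Lost ACK scenario:
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100 (lost, X)
  Seq=92 timeout at A
  A -> B: Seq=92, 8 bytes data (retransmission)
  B -> A: ACK=100
  SendBase = 100

Premature timeout:
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100
  B -> A: ACK=120
  Seq=92 timeout at A (before the ACKs arrive)
  A -> B: Seq=92, 8 bytes data (retransmission)
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  B -> A: ACK=120
  SendBase = 120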


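TCP retransmission scenarios (more)

[Figure: cumulative ACK scenario, timeline between Host A and Host B]
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100 (lost, X)
  B -> A: ACK=120 (arrives before the timeout)
  SendBase = 120; nothing is retransmitted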


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK; wait up to 500 ms for next segment; if no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
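The four event/action pairs can be read as a decision function. The sketch below is a simplification for illustration. The function name `receiver_action`, its parameters and the action constants are hypothetical, a segment is identified only by its starting sequence number, and the branch for a segment that starts below the expected sequence number is an assumption, because the slide does not cover that case.

```python
DELAYED_ACK = "delayed ACK: wait up to 500 ms for next segment, then send ACK"
CUMULATIVE_ACK = "immediately send single cumulative ACK"
DUPLICATE_ACK = "immediately send duplicate ACK for next expected byte"
IMMEDIATE_ACK = "immediately send ACK"


def receiver_action(seg_seq, expected_seq, ack_pending=False, gap_exists=False):
    """Pick the receiver's ACK action for one arriving segment.

    seg_seq      -- sequence number of the first byte of the arriving segment
    expected_seq -- next in-order byte the receiver is waiting for
    ack_pending  -- True if an earlier in-order segment still has its ACK delayed
    gap_exists   -- True if out-of-order data is already buffered above expected_seq
    """
    if seg_seq > expected_seq:
        return DUPLICATE_ACK      # out-of-order segment: gap detected
    if seg_seq < expected_seq:
        return DUPLICATE_ACK      # assumption: already-received data is simply re-ACKed
    if gap_exists:
        return IMMEDIATE_ACK      # segment starts at the lower end of the gap
    if ack_pending:
        return CUMULATIVE_ACK     # one ACK covers both in-order segments
    return DELAYED_ACK


print(receiver_action(seg_seq=100, expected_seq=100))
print(receiver_action(seg_seq=120, expected_seq=100))
```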


                                                                                                More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
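A minimal sketch of this policy follows. The function name `next_timeout_interval` and the event labels are hypothetical, and the reset uses the formula TimeoutInterval = EstimatedRTT + 4*DevRTT from the earlier slide.

```python
def next_timeout_interval(current_interval, event, estimated_rtt, dev_rtt):
    """Return the timeout interval to use after the given event."""
    if event == "timeout":
        # consecutive timeouts double the interval each time
        return 2 * current_interval
    if event in ("new data", "ack"):
        # timer restarted for another reason: recompute from the RTT estimates
        return estimated_rtt + 4 * dev_rtt
    raise ValueError(f"unknown event: {event}")


interval = 200.0  # milliseconds
for _ in range(3):
    interval = next_timeout_interval(interval, "timeout", 100.0, 25.0)
print(interval)   # three consecutive timeouts: 200 -> 400 -> 800 -> 1600
interval = next_timeout_interval(interval, "ack", 100.0, 25.0)
print(interval)   # reset to 100 + 4*25 = 200
```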


                                                                                                Fast Retransmit

                                                                                                Time-out period often relatively long

                                                                                                long delay before resending lost packet

                                                                                                Detect lost segments via duplicate ACKs

Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost

fast retransmit: resend segment before timer expires


                                                                                                Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {    /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3) {
      resend segment with sequence number y    /* fast retransmit */
    }
  }
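The ACK handler with fast retransmit can be sketched in the same event-driven style. The class name `FastRetransmitSender` and the `resend` callback are hypothetical, the timer is again reduced to a boolean flag, and only the ACK event is modelled.

```python
from collections import Counter


class FastRetransmitSender:
    def __init__(self, send_base, unacked, resend):
        self.send_base = send_base
        self.unacked = dict(unacked)    # seq number -> data of segments not yet ACKed
        self.dup_acks = Counter()       # ACK value y -> number of duplicate ACKs seen
        self.resend = resend            # hypothetical callback: resend(seq, data)
        self.timer_running = bool(self.unacked)

    def on_ack(self, y):
        if y > self.send_base:
            # new data is ACKed: advance SendBase and drop fully ACKed segments
            self.send_base = y
            self.unacked = {s: d for s, d in self.unacked.items() if s + len(d) > y}
            self.timer_running = bool(self.unacked)
        else:
            # a duplicate ACK for an already ACKed segment
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3 and y in self.unacked:
                self.resend(y, self.unacked[y])   # fast retransmit


resent = []
sender = FastRetransmitSender(
    send_base=92,
    unacked={92: b"a" * 8, 100: b"b" * 20, 120: b"c" * 15},
    resend=lambda seq, data: resent.append((seq, len(data))),
)
sender.on_ack(100)          # ACKs the first segment
for _ in range(4):          # four duplicate ACKs asking for byte 100
    sender.on_ack(100)
print(resent)               # the segment at 100 is resent once, on the third duplicate
```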


TCP: GBN or Selective Repeat?

                                                                                                Basic TCP looks a lot like GBN

                                                                                                Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                This looks a lot like Selective Repeat

                                                                                                TCP is a hybrid


                                                                                                TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network
(But both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
speed-matching service: matching the send rate to the receiving app's drain rate
app process may be slow at reading from buffer


TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flag bits U A P R S F, Receive window
  checksum, Urg data pnter
  Options (variable length)
  application data (variable length)

Annotations:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
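The window arithmetic can be checked with a short sketch. The function names are hypothetical, and `last_byte_sent` and `last_byte_acked` are assumed sender-side counters used here to express "unACKed data"; they do not appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps the unACKed data within RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window


window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)                                    # 4096 - 2000 = 2096
print(sender_may_send(5000, 4000, window, 1000)) # 1000 unACKed + 1000 new fits
print(sender_may_send(5000, 4000, window, 1200)) # 1000 unACKed + 1200 new does not fit
```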


                                                                                                Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
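A toy simulation of the one-byte probing idea follows. Everything in it is an assumption made for illustration: the `Receiver` class, the tick-based loop, and the rule that the application drains a fixed number of bytes between probes. It only shows why probing breaks the deadlock, not how a real TCP stack schedules its probes.

```python
class Receiver:
    def __init__(self, rcv_buffer, buffered):
        self.rcv_buffer = rcv_buffer   # total buffer size in bytes
        self.buffered = buffered       # bytes waiting to be read by the app

    def window(self):
        return self.rcv_buffer - self.buffered

    def app_read(self, nbytes):
        self.buffered = max(0, self.buffered - nbytes)

    def on_probe(self):
        """Handle a one-byte segment; the ACK carries the current RcvWindow."""
        if self.window() > 0:
            self.buffered += 1         # there is room, so the probe byte is accepted
        return self.window()


def probe_until_open(receiver, drain_per_tick, max_probes=100):
    """Send one-byte probes until an ACK advertises RcvWindow > 0."""
    for probes in range(1, max_probes + 1):
        window = receiver.on_probe()
        if window > 0:
            return probes, window
        receiver.app_read(drain_per_tick)   # app reads some data before the next probe
    return max_probes, 0


# Full buffer of 10 bytes; the app reads 4 bytes between probes.
print(probe_until_open(Receiver(rcv_buffer=10, buffered=10), drain_per_tick=4))
```

In this run the first probe is answered with RcvWindow = 0. The second probe arrives after the application has read 4 bytes, is accepted, and its ACK advertises the remaining 3 bytes of space.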


                                                                                                Note on UDP

                                                                                                UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


                                                                                                TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data


                                                                                                TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake timeline]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
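The sequence and acknowledgement numbers of the three handshake segments can be generated with a small sketch. The `Segment` dataclass and the function `three_way_handshake` are hypothetical names, and the sketch models only the header fields shown in the exchange, with no sockets or timers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    sender: str
    syn: int
    seq: int
    ack: Optional[int] = None   # None: the ACK field carries nothing meaningful


def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged when a connection is opened."""
    # Step 1: client sends SYN with its initial sequence number
    syn = Segment("client", syn=1, seq=client_isn)
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1
    synack = Segment("server", syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1
    ack = Segment("client", syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```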


                                                                                                TCP Connection Management (cont)

                                                                                                Closing a connection

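client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection teardown timeline]
  client (close) -> server: FIN
  server -> client: ACK
  server (close) -> client: FIN
  client -> server: ACK
  client: timed wait, then closed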


                                                                                                TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: teardown timeline: client FIN, server ACK, server FIN, client ACK; both sides closing, client in timed wait, then both closed]

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Principles of Congestion Control

Congestion
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

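(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load (original data plus retransmissions).

"Costs" of congestion
(b) and (c): more work (retransmissions) for given "goodput"
(c): unneeded retransmissions, link carries multiple copies of packet

[Figure: three panels labelled (a), (b), (c)]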

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
    "elastic service"
    if sender's path "underloaded": sender should use available bandwidth
    if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
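
The ER and EFCI rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ATM switch code: the function names `explicit_rate_on_path` and `destination_ci_bit` are hypothetical, and each switch is modelled as a single number, the rate it can currently support.

```python
def explicit_rate_on_path(requested_rate, switch_rates):
    """Return the ER value left in an RM cell after it crosses every switch.

    Assumption: each switch lowers ER to its own supportable rate if that
    rate is smaller, and never raises it.
    """
    er = requested_rate
    for supportable in switch_rates:
        er = min(er, supportable)
    return er


def destination_ci_bit(ci_bit, preceding_data_cell_efci):
    """CI bit the destination writes before returning the RM cell.

    If the data cell just before the RM cell carried EFCI=1, CI is forced
    to 1; otherwise the bit is returned as received.
    """
    return 1 if preceding_data_cell_efci == 1 else ci_bit


# Hypothetical path of three switches; the middle one is the bottleneck.
print(explicit_rate_on_path(100, [80, 50, 120]))  # 50
print(destination_ci_bit(0, 1))                   # 1
```

The sender ends up with the minimum supportable rate on the path because every switch can only lower the field.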

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window of size CongWin laid over the segment sequence]

w segments, each with MSS bytes, sent in one RTT:

$$\text{throughput} = \frac{w \cdot MSS}{RTT} \ \text{bytes/sec}$$
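
As a quick numeric check of the throughput relation above, the sketch below evaluates w × MSS / RTT. The function name `window_throughput` and the sample values (10 segments of 1460 bytes, 100 ms RTT) are assumptions chosen for illustration only.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes each are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical values: w = 10 segments, MSS = 1460 bytes, RTT = 0.1 s.
rate = window_throughput(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec")
```

Doubling the window or halving the RTT doubles the rate, which is why TCP controls its sending rate by adjusting the window.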

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

$$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$$

How does sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after loss event

three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 and then grows exponentially
    conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
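
The sawtooth that AIMD produces can be reproduced with a small per-RTT simulation. This is a minimal sketch, not TCP code: the function `aimd_trace` is hypothetical, the window is measured in MSS units, and the RTTs in which a loss occurs are supplied by hand rather than derived from a network model.

```python
def aimd_trace(start_mss, rounds, loss_rounds):
    """Return CongWin (in MSS) at the start of each RTT.

    loss_rounds is the set of RTT indices in which a loss event is seen
    (an assumption of this sketch; real losses depend on the network).
    """
    window = float(start_mss)
    trace = []
    for rtt in range(rounds):
        trace.append(window)
        if rtt in loss_rounds:
            window = max(1.0, window / 2)  # multiplicative decrease
        else:
            window += 1.0                  # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(8, 6, {3}))  # [8.0, 9.0, 10.0, 11.0, 5.5, 6.5]
```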

TCP Slow Start

When connection begins, CongWin = 1 MSS
    Example: MSS = 500 bytes & RTT = 200 msec
    initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. Host A sends one segment, then two segments, then four segments, one RTT apart.]
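
The following sketch illustrates why incrementing CongWin once per ACK doubles it every RTT, and reproduces the 20 kbps initial rate from the MSS = 500 bytes, RTT = 200 msec example. The function names are hypothetical, and the model assumes every segment sent in an RTT is ACKed within that RTT, with no loss.

```python
def slow_start_windows(rtts):
    """CongWin (in MSS) at the start of each RTT during slow start."""
    cong_win = 1
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks = cong_win            # one ACK per segment sent this RTT
        for _ in range(acks):
            cong_win += 1          # +1 MSS for every ACK received
    return windows


def initial_rate_bps(mss_bytes, rtt_ms):
    """Sending rate with CongWin = 1 MSS, in bits per second."""
    return mss_bytes * 8 * 1000 / rtt_ms


print(slow_start_windows(4))       # [1, 2, 4, 8]
print(initial_rate_bps(500, 200))  # 20000.0, i.e. 20 kbps
```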

So far
    Slow-Start ramps up exponentially
    Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
    Introduce new variable threshold
    threshold initially very large
    Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
    Two different types of loss events:
        • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
        • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
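
The four rules in this summary can be compared side by side for Reno and Tahoe with a per-RTT simulation. This is a sketch under assumptions: `simulate_cong_win` is a hypothetical name, windows are in MSS units, loss events are given as input, and slow-start growth is capped at Threshold (a modelling choice of this sketch, not something stated on the slide).

```python
def simulate_cong_win(events, variant="reno", threshold=float("inf")):
    """Return CongWin (in MSS) at the start of each RTT.

    events: one entry per RTT, either None (no loss), "3dup" or "timeout".
    variant: "reno" or "tahoe".
    """
    cong_win = 1.0
    trace = []
    for event in events:
        trace.append(cong_win)
        if event is None:
            if cong_win < threshold:
                # slow start: double, capped at threshold (assumption)
                cong_win = min(cong_win * 2, threshold)
            else:
                cong_win += 1.0        # congestion avoidance: +1 MSS per RTT
        else:
            threshold = cong_win / 2
            if event == "timeout" or variant == "tahoe":
                cong_win = 1.0         # restart slow start
            else:
                cong_win = threshold   # Reno on 3 dup ACKs: skip slow start
    return trace


events = [None] * 7 + ["3dup"] + [None] * 3
print(simulate_cong_win(events, "reno", threshold=8))
print(simulate_cong_win(events, "tahoe", threshold=8))
```

With the same triple-duplicate-ACK event, the Reno trace resumes from half the window while the Tahoe trace drops to 1 MSS and climbs again.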

                                                                                                The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
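
The event table can be read as a small state machine. The class below is a hedged sketch of that reading, not a real TCP implementation: the name `RenoSender` and its methods are hypothetical, the floor of 1 MSS on Threshold is one interpretation of the "will not drop below 1 MSS" comment, and resetting the duplicate-ACK counter on a new ACK is an assumption.

```python
class RenoSender:
    """Per-ACK congestion control following the event/state/action table."""

    def __init__(self, mss=1000, threshold=float("inf")):
        self.mss = mss
        self.cong_win = float(mss)   # CongWin in bytes, starts at 1 MSS
        self.threshold = threshold
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0            # assumption: a new ACK resets the counter
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicate ACKs; the third one is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve Threshold and restart slow start from 1 MSS."""
        self.threshold = max(self.cong_win / 2, self.mss)
        self.cong_win = float(self.mss)
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)  # CA 5000.0
```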

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
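
The 0.75 factor follows from averaging a window that climbs linearly from W/2 to W. A minimal numeric check, assuming an even W measured in segments and a hypothetical helper name `average_sawtooth_window`:

```python
def average_sawtooth_window(w_max):
    """Mean window over one cycle rising from w_max/2 to w_max by 1 per RTT.

    Assumes w_max is an even number of segments.
    """
    low = w_max // 2
    cycle = list(range(low, w_max + 1))  # W/2, W/2 + 1, ..., W
    return sum(cycle) / len(cycle)


w = 20
print(average_sawtooth_window(w) / w)  # 0.75
```

Dividing the average window by RTT gives the average throughput of 0.75 W/RTT.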

TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed
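
The two numbers in this example can be reproduced directly. The sketch assumes the loss-rate relation throughput = 1.22 × MSS / (RTT × √L), solved here for L; the function names are hypothetical.

```python
def required_window(throughput_bps, mss_bytes, rtt_s):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, mss_bytes, rtt_s):
    """Loss rate L from throughput = 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 1500, 0.1)
loss = required_loss_rate(10e9, 1500, 0.1)
print(int(w))          # 83333
print(f"{loss:.1e}")   # about 2e-10
```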

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
    Additive increase gives slope of 1, as throughput increases
    multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
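
The convergence argument can be checked numerically: additive increase keeps the gap between the two rates unchanged, while each multiplicative decrease halves it. The sketch below is a simplified model, assuming both sessions see a loss in the same round whenever their combined rate exceeds the capacity; `aimd_two_flows` is a hypothetical name.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # loss seen by both flows (assumption): multiplicative decrease
            x1, x2 = x1 / 2, x2 / 2
        else:
            # congestion avoidance: both add one unit per round
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = aimd_two_flows(1.0, 50.0, capacity=100.0, rounds=500)
print(abs(a - b) < 0.01)  # True: the rates have converged to an equal share
```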

Fairness (more)

Fairness and UDP
    Multimedia apps often do not use TCP
        do not want rate throttled by congestion control
    Instead use UDP:
        pump audio/video at constant rate, tolerate packet loss
    Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
    nothing prevents app from opening parallel connections between 2 hosts
    Web browsers do this
    Example: link of rate R supporting 9 connections
        new app asks for 1 TCP, gets rate R/10
        new app asks for 11 TCPs, gets about R/2 (11R/20)
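
The parallel-connection example is simple arithmetic if every TCP connection on the link gets an equal share. A small sketch under that assumption, with a hypothetical function name `app_share`:

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming all connections on the link share R equally."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```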

TCP Latency Modeling

Notation, assumptions
    Assume one link between client and server of rate R
    S: MSS (bits)
    O: object size (bits)
    no retransmissions (no loss, no corruption)

Window size
    First assume: fixed congestion window, W segments
    Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
    TCP connection establishment
    data transmission delay
    slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
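
Both cases can be folded into one function. This is a sketch under assumptions: the name `fixed_window_latency` is hypothetical, and K is taken to be the number of windows needed to cover the object, computed as the ceiling of O/(WS).

```python
import math


def fixed_window_latency(obj_bits, mss_bits, rate_bps, rtt, window):
    """Latency for a fixed congestion window of `window` segments."""
    base = 2 * rtt + obj_bits / rate_bps
    # idle time per window: S/R + RTT - W*S/R (positive only in case 2)
    stall = mss_bits / rate_bps + rtt - window * mss_bits / rate_bps
    if stall <= 0:
        return base                   # case 1: server never waits for ACKs
    k = math.ceil(obj_bits / (window * mss_bits))  # windows covering the object
    return base + (k - 1) * stall     # case 2: one stall after each window but the last


# Hypothetical values: O = 8000 bits, S = 1000 bits, R = 1000 bps, RTT = 2 s.
print(fixed_window_latency(8000, 1000, 1000, 2, window=4))  # 12.0 (case 1)
print(fixed_window_latency(8000, 1000, 1000, 2, window=2))  # 15.0 (case 2)
```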

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: slow-start timing diagram with time at client and time at server. Labels: initiate TCP connection (RTT); request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

The server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit the object
• time the server idles due to slow start

The server idles P = min{K-1, Q} times.
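The example can be checked numerically. In this sketch the slide's K = 4 is taken as given, while S, R, and RTT are assumed values (the slide does not state them), chosen so that S/R = 1 s and RTT = 2 s, which yields Q = 2. The helper name `idle_times` is hypothetical.

```python
def idle_times(S, R, RTT, K):
    """Idle time after each of the first K-1 windows (none after the last window)."""
    # After window k (2**(k-1) segments) the server waits for the first ACK,
    # which arrives S/R + RTT after the window started; negative means no wait.
    return [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)]

S, R, RTT = 1000, 1000, 2.0   # assumed values: S/R = 1 s, RTT = 2 s
K = 4                         # from the example: O/S = 15 segments
stalls = idle_times(S, R, RTT, K)
P = sum(1 for t in stalls if t > 0)   # number of times the server actually idles
print(stalls, P)
```

Under these assumptions the server idles after the first and second windows only; the third window (4 segments) takes longer to send than the ACK takes to return.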

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the $k$th window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the $k$th window

$$
\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: slow-start timing diagram with time at client and time at server. Labels: initiate TCP connection (RTT); request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]
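A quick numerical check that summing the per-window idle times gives the same delay as the closed form. This is only a sketch with hypothetical function names; P is passed in directly, and the sample values (S/R = 1 s, RTT = 2 s, O/S = 15, P = 2) are assumptions for illustration.

```python
import math

def latency_by_sum(O, S, R, RTT, P):
    """2 RTT + O/R plus the idle time after each of the first P windows."""
    idle = sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))
    return 2 * RTT + O / R + idle

def latency_closed_form(O, S, R, RTT, P):
    """Same delay using the geometric-series closed form."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R

args = dict(O=15000, S=1000, R=1000, RTT=2.0, P=2)
print(latency_by_sum(**args), latency_closed_form(**args))
assert math.isclose(latency_by_sum(**args), latency_closed_form(**args))
```

The two agree because the powers of two in the sum add up to 2^P - 1, which is the last step of the derivation.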

TCP Latency Modeling (4)

Recall $K$ = number of windows that cover the object.

How do we calculate $K$?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

The calculation of $Q$, the number of idles for an infinite-size object, is similar.
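The closed form for K can be compared against a direct search that adds up window sizes 1, 2, 4, ... segments until the object is covered. A minimal sketch with hypothetical function names; O and S are in bits and O is assumed positive.

```python
import math

def k_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))

def k_by_search(O, S):
    """Smallest k such that windows of 2**0, ..., 2**(k-1) segments cover O bits."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S   # next window carries 2**k segments
        k += 1
    return k

for segments in (1, 3, 8, 15, 16, 100):
    O, S = segments * 1000, 1000
    assert k_closed_form(O, S) == k_by_search(O, S)
print(k_closed_form(15000, 1000))
```

For the O/S = 15 example, four windows (1 + 2 + 4 + 8 segments) exactly cover the object.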

HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive the base HTML file
• 1 RTT to request and receive the M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for the base file
• M/X sets of parallel connections for the images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
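The three response-time formulas can be compared side by side. This sketch deliberately leaves out the "sum of idle times" term, which depends on slow-start behaviour for each connection, so it shows only the transmission and RTT components. The function name is hypothetical, and the sample inputs assume 1 Kbyte = 1000 bytes and a 1 Mbps link.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, excluding the slow-start idle-time term."""
    transmit = (M + 1) * O / R   # base page plus M images, each of O bits
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT,
    }

# Assumed inputs: O = 5 Kbytes = 40,000 bits, R = 1 Mbps, RTT = 100 msec.
times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

The transmission term is identical in all three cases; only the number of round trips differs, which is why the gap between the schemes widens as RTT grows.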

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (y-axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (y-axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.

Chapter 3 Summary

Principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

Instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                • Chapter 3 Transport Layer last revised 160305
                                                                                                • Chapter 3 outline
                                                                                                • Transport services and protocols
                                                                                                • Transport vs network layer
                                                                                                • Transport-layer protocols
                                                                                                • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                • How demultiplexing works
                                                                                                • Connectionless demultiplexing
                                                                                                • Connectionless demux (cont)
                                                                                                • Connection-oriented demux
                                                                                                • Connection-oriented demux (cont)
                                                                                                • Connection-oriented demux Threaded Web Server
                                                                                                • Chapter 3 outline
                                                                                                • UDP User Datagram Protocol [RFC 768]
                                                                                                • UDP more
                                                                                                • UDP checksum
                                                                                                • Chapter 3 outline
                                                                                                • Principles of Reliable data transfer
                                                                                                • Reliable data transfer getting started
                                                                                                • Reliable data transfer getting started
                                                                                                • Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                                • Pipelined protocols
                                                                                                • Pipelined protocols
                                                                                                • Pipelining increased utilization
                                                                                                • Go-Back-N
                                                                                                • GBN Sender
                                                                                                • GBN sender extended FSM
                                                                                                • GBN receiver extended FSM
                                                                                                • More on receiver
                                                                                                • GBN inaction
                                                                                                • Selective Repeat
                                                                                                • Selective repeat sender receiver windows
                                                                                                • Selective repeat
                                                                                                • Selective repeat in action
                                                                                                • Selective repeat dilemma
                                                                                                • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                • More TCP Details
                                                                                                • Even More TCP Details
                                                                                                • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                • TCP Round Trip Time and Timeout
                                                                                                • TCP Round Trip Time and Timeout
                                                                                                • Example RTT estimation
                                                                                                • TCP Round Trip Time and Timeout
                                                                                                • Chapter 3 outline
                                                                                                • TCP reliable data transfer
                                                                                                • TCP sender events
                                                                                                • TCP sender(simplified)
                                                                                                • TCP retransmission scenarios
                                                                                                • TCP retransmission scenarios (more)
                                                                                                • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                • More on Sender Policies
                                                                                                • Fast Retransmit
                                                                                                • Fast retransmit algorithm
                                                                                                • TCP GBN or Selective Repeat
                                                                                                • Chapter 3 outline
                                                                                                • TCP Flow Control
                                                                                                • TCP Flow Control
                                                                                                • TCP segment structure
                                                                                                • TCP Flow control how it works
                                                                                                • Technical Issue
                                                                                                • Chapter 3 outline
                                                                                                • TCP Connection Management
                                                                                                • TCP Connection Management (cont)
                                                                                                • TCP Connection Management (cont)
                                                                                                • TCP Connection Management (cont)
                                                                                                • TCP Connection Management (cont)
                                                                                                • A few special cases
                                                                                                • Chapter 3 outline
                                                                                                • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                • Approaches towards congestion control
                                                                                                • Case study ATM ABR congestion control
                                                                                                • Case study ATM ABR congestion control
                                                                                                • Chapter 3 outline
                                                                                                • TCP Congestion Control
                                                                                                • TCP AIMD
                                                                                                • TCP Slow Start
                                                                                                • TCP Slow Start (more)
                                                                                                • Summary TCP Congestion Control
                                                                                                • The Big Picture
                                                                                                • TCP sender congestion control
                                                                                                • TCP throughput
                                                                                                • TCP Futures
                                                                                                • TCP Fairness
                                                                                                • Why is TCP fair
                                                                                                • Fairness (more)
                                                                                                • TCP Latency Modeling
                                                                                                • Fixed Congestion Window (W)
                                                                                                • Fixed congestion window (1)
                                                                                                • Fixed congestion window (2)
                                                                                                • TCP Latency Modeling Slow Start (1)
                                                                                                • TCP Latency Modeling Slow Start (2)
                                                                                                • TCP Latency Modeling (3)
                                                                                                • TCP Latency Modeling (4)
                                                                                                • HTTP Modeling
                                                                                                • Chapter 3 Summary

More on receiver

The receiver always sends an ACK for the last correctly received packet with the highest in-order seq #.
• Receiver only sends ACKs (no NAKs)
• Can generate duplicate ACKs
• Need only remember expectedseqnum
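The receiver behaviour described above fits in a few lines. This is an illustrative sketch rather than the FSM from the slides: the class name and the `corrupt` flag are hypothetical, sequence numbers start at 0 and never wrap, and `None` stands for "nothing received in order yet, so there is nothing to ACK".

```python
class GBNReceiver:
    """Go-Back-N receiver: the only state is expectedseqnum (plus delivered data)."""

    def __init__(self):
        self.expectedseqnum = 0   # assumption: numbering starts at 0, no wraparound
        self.delivered = []       # data passed up to the application, in order

    def receive(self, seqnum, data, corrupt=False):
        """Process one packet and return the sequence number to ACK (or None)."""
        if not corrupt and seqnum == self.expectedseqnum:
            self.delivered.append(data)   # in-order packet: deliver it
            self.expectedseqnum += 1
        # Otherwise the packet is discarded (no buffering, no NAK).
        last_in_order = self.expectedseqnum - 1
        return last_in_order if last_in_order >= 0 else None

rx = GBNReceiver()
print(rx.receive(0, "a"))   # ACK 0
print(rx.receive(2, "c"))   # out of order: discarded, duplicate ACK 0
print(rx.receive(1, "b"))   # ACK 1
```

The out-of-order packet 2 is dropped and answered with a duplicate ACK for packet 0, which is what eventually makes the sender go back and resend from packet 1.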

GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (bandwidth-delay product large), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and only forces retransmission of the required packets.

Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt
  • compare to GBN, which only had a timer for the base packet
• sender window
  • N consecutive seq #'s
  • again limits seq #'s of sent, unACKed pkts
  • Important: window size < seq # range

[Figure: Selective repeat: sender, receiver windows]

Selective repeat

sender

data from above:
• if next available seq # in window, send pkt

timeout(n):
• resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N]:
• mark pkt n as received
• if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
• ACK(n) (note: this is a reACK)

otherwise:
• ignore
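The receiver-side rules can be sketched as follows. This is an illustrative sketch under stated assumptions: the class name `SrReceiver` is hypothetical, and sequence numbers are plain unbounded integers (no wraparound), which sidesteps the window-size question raised later in the slides.

```python
class SrReceiver:
    """Selective Repeat receiver with window size N (hypothetical sketch)."""

    def __init__(self, window: int):
        self.N = window
        self.rcvbase = 0        # assumed 0-based, unbounded seq #s
        self.buffer = {}        # out-of-order pkts waiting for delivery
        self.delivered = []     # data handed to the upper layer, in order

    def on_packet(self, n: int, data: str):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            # Inside the window: buffer, then deliver any in-order run.
            self.buffer[n] = data
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            # Already delivered: reACK so the sender can advance its window.
            return n
        return None             # otherwise: ignore


receiver = SrReceiver(window=4)
acks = [receiver.on_packet(n, f"p{n}") for n in [0, 2, 3, 1, 0, 9]]
print(acks)
print(receiver.delivered)
```

Packets 2 and 3 are buffered until packet 1 arrives, at which point all three are delivered together. The second copy of packet 0 is reACKed, and packet 9 lies outside both ranges and is ignored.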

[Figure: Selective repeat in action]

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3
• receiver sees no difference in two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
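The slide leaves the question open. The standard answer, stated here as a known result rather than taken from the slide, is that the window size must be at most half the sequence-number space. The check below is a small sketch with a hypothetical helper, `windows_overlap`, that tests whether the receiver's advanced window can contain a sequence number the sender may still retransmit.

```python
def windows_overlap(seq_space: int, window: int) -> bool:
    """True if old and new windows share a seq # (the ambiguous case).

    Worst case: the receiver got the whole first window and advanced,
    but every ACK was lost, so the sender still holds the first window.
    """
    old = {n % seq_space for n in range(0, window)}            # sender may resend these
    new = {n % seq_space for n in range(window, 2 * window)}   # receiver now expects these
    return bool(old & new)


# Slide example: seq #'s 0..3 with window size 3 is ambiguous.
print(windows_overlap(seq_space=4, window=3))
# Halving the window relative to the seq # space removes the ambiguity.
print(windows_overlap(seq_space=4, window=2))
```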

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

• point-to-point: one sender, one receiver
• reliable, in-order byte stream: no "message boundaries"
• pipelined: TCP congestion and flow control set window size
• send & receive buffers
• full duplex data: bi-directional data flow in same connection; MSS: maximum segment size
• connection-oriented: handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled: sender will not overwhelm receiver

[Figure: the application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the other application reads data through its socket door]

More TCP Details

Maximum Segment Size (MSS)
• Depends upon implementation (can often be set)
• The max amount of application-layer data in a segment
• Application Data + TCP Header = TCP Segment

Three-way handshake
• Client sends special TCP segment to server requesting connection; no payload (application data) in this segment
• Server responds with second special TCP segment (again no payload)
• Client responds with third special segment; this can contain payload

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
• (i) buffers
• (ii) variables, and
• (iii) a socket connection to the process

TCP only exists in the two end machines:
• No buffers or variables are allocated to the connection in any of the network elements between the client and server

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom]
• source port #, dest port #
• sequence number
• acknowledgement number
• head len, not used, flag bits U A P R S F, Receive window
• checksum, Urg data pointer
• Options (variable length)
• application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
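To make the layout concrete, here is a short sketch that unpacks the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name `parse_tcp_header` and the returned dictionary keys are choices made for this example, and only the six flag bits shown on the slide are decoded.

```python
import struct

# Flag bits from least significant upward, as in the classic header.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte TCP header (options are not parsed)."""
    # "!" = network byte order; H = 16-bit field, I = 32-bit field.
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", raw[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,   # head len is in 32-bit words
        "flags": [name for bit, name in enumerate(FLAG_NAMES)
                  if off_flags & (1 << bit)],
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }


# Build a sample SYN+ACK header by hand (all values are made up).
sample = struct.pack("!HHIIHHHH", 1234, 80, 42, 79, (5 << 12) | 0x12, 65535, 0, 0)
print(parse_tcp_header(sample))
```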

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data

ACKs:
• seq # of next byte expected from other side
• cumulative ACK

Q: how does the receiver handle out-of-order segments?
• A: TCP spec doesn't say - up to implementer

[Figure: simple telnet scenario between Host A and Host B, time running downward]
• User at Host A types 'C': A sends Seq=42, ACK=79, data = 'C'
• Host B ACKs receipt of 'C', echoes back 'C': B sends Seq=79, ACK=43, data = 'C'
• Host A ACKs receipt of echoed 'C': A sends Seq=43, ACK=80
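The numbers in the telnet scenario follow from one rule: the ACK # is the seq # of the first byte in the received segment plus the number of data bytes it carried. A minimal sketch, with a hypothetical helper `next_ack`:

```python
def next_ack(seq: int, payload: bytes) -> int:
    """ACK # = seq # of the next byte expected from the other side."""
    return seq + len(payload)


# Host A -> Host B: Seq=42, ACK=79, data='C'
a_seq, a_ack = 42, 79

# Host B -> Host A: B's seq # is what A said it expects; B ACKs A's byte.
b_seq, b_ack = a_ack, next_ack(a_seq, b"C")

# Host A -> Host B: ACK for the echoed 'C' (no data in this segment).
a2_seq, a2_ack = b_ack, next_ack(b_seq, b"C")

print(f"B sends Seq={b_seq} ACK={b_ack}")
print(f"A sends Seq={a2_seq} ACK={a2_ack}")
```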

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
• longer than RTT, but RTT varies
• too short: premature timeout, unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions
• SampleRTT will vary, want estimated RTT "smoother"
  • average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

• Exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), from 100 to 350; curves: SampleRTT and EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
  • large variation in EstimatedRTT -> larger safety margin
• first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 * DevRTT
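The estimator can be sketched as a small class. The slides give only the update formulas, so two details here are assumptions borrowed from RFC 6298: the first sample initializes EstimatedRTT to SampleRTT and DevRTT to SampleRTT/2, and DevRTT is updated before EstimatedRTT, using the old estimate. The class name `RttEstimator` is hypothetical.

```python
class RttEstimator:
    """EWMA-based RTT and timeout estimator (sketch; times in ms)."""

    ALPHA = 0.125   # weight of the newest SampleRTT
    BETA = 0.25     # weight of the newest deviation

    def __init__(self):
        self.estimated_rtt = None
        self.dev_rtt = None

    def update(self, sample_rtt: float) -> float:
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        if self.estimated_rtt is None:
            # Assumed initialization (as in RFC 6298); not on the slides.
            self.estimated_rtt = sample_rtt
            self.dev_rtt = sample_rtt / 2
        else:
            # Deviation first, measured against the previous estimate.
            self.dev_rtt = ((1 - self.BETA) * self.dev_rtt
                            + self.BETA * abs(sample_rtt - self.estimated_rtt))
            self.estimated_rtt = ((1 - self.ALPHA) * self.estimated_rtt
                                  + self.ALPHA * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator()
for sample in [100.0, 200.0]:
    print(sample, estimator.update(sample))
```

A single slow sample (200 ms after 100 ms) moves EstimatedRTT only slightly but increases DevRTT, so the timeout grows mainly through the safety margin.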

TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  • timeout events
  • duplicate acks
• Initially consider simplified TCP sender:
  • ignore duplicate acks
  • ignore flow control, congestion control

TCP sender events

data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

Ack rcvd:
• If it acknowledges previously unacked segments:
  • update what is known to be acked
  • start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever)
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase)
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer

end of loop forever

Comment:
• SendBase-1: last cumulatively ack'ed byte

Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
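The pseudocode can be turned into a runnable sketch by modelling each event as a method call. Everything beyond the pseudocode is an assumption for illustration: the class name `SimplifiedTcpSender`, the `wire` list standing in for "pass segment to IP", and the boolean `timer_running` standing in for a real timer. The sketch also stops the timer once nothing is outstanding, which the pseudocode leaves implicit.

```python
class SimplifiedTcpSender:
    """Event-driven sketch of the simplified TCP sender (no real timer)."""

    def __init__(self, initial_seq: int = 0):
        self.next_seq = initial_seq      # NextSeqNum
        self.send_base = initial_seq     # SendBase
        self.unacked = {}                # seq # -> payload, not yet ACKed
        self.timer_running = False
        self.wire = []                   # (seq, payload) handed to IP

    def on_data(self, data: bytes) -> None:
        """Event: data received from application above."""
        self.unacked[self.next_seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((self.next_seq, data))
        self.next_seq += len(data)

    def on_timeout(self) -> None:
        """Event: timer timeout."""
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)          # smallest not-yet-ACKed seq #
        self.wire.append((seq, self.unacked[seq]))
        self.timer_running = True        # restart timer

    def on_ack(self, y: int) -> None:
        """Event: ACK received with ACK field value y (cumulative)."""
        if y > self.send_base:
            self.send_base = y
            acked = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in acked:
                del self.unacked[s]
            # Keep the timer only while segments are still outstanding.
            self.timer_running = bool(self.unacked)


# Cumulative ACK case: ACK=100 is lost, ACK=120 covers both segments.
sender = SimplifiedTcpSender(initial_seq=92)
sender.on_data(b"x" * 8)     # Seq=92, 8 bytes data
sender.on_data(b"y" * 20)    # Seq=100, 20 bytes data
sender.on_ack(120)
print(sender.send_base, sender.unacked, sender.timer_running)
```

Because ACKs are cumulative, the single ACK=120 advances SendBase past both segments and nothing is retransmitted.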

TCP: retransmission scenarios

[Figure: lost ACK scenario. Host A sends Seq=92, 8 bytes data; Host B's ACK=100 is lost (X); the timer for Seq=92 times out and A resends Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100]

[Figure: premature timeout. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; Host B replies ACK=100 and ACK=120, but the Seq=92 timer expires first, so A resends Seq=92, 8 bytes data; B replies ACK=120; SendBase goes from 100 to 120 and stays at 120]

TCP retransmission scenarios (more)

[Figure: cumulative ACK scenario. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted; SendBase = 120]

TCP ACK generation [RFC 1122, RFC 2581]

1. Event at receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   TCP receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Event at receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
   TCP receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

3. Event at receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   TCP receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

4. Event at receiver: Arrival of segment that partially or completely fills gap.
   TCP receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
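The four rules above can be sketched as a toy receiver. This is only an illustration: the class name `Receiver` and its fields are hypothetical, the 500 ms delayed-ACK timer is not modeled (only the decision is returned), and segments are assumed not to overlap.

```python
class Receiver:
    """Toy receiver that applies the four ACK-generation rules (timer not modeled)."""

    def __init__(self, expected_seq):
        self.expected = expected_seq   # next in-order byte expected
        self.ack_pending = False       # True while a delayed ACK is waiting
        self.buffered = {}             # out-of-order segments: start seq -> length

    def on_segment(self, seq, length):
        """Return (action, ack_number) for an arriving segment."""
        if seq != self.expected:
            # Rule 3: out-of-order segment. Buffer data beyond the gap and
            # immediately re-ACK the next expected byte.
            if seq > self.expected:
                self.buffered[seq] = length
            self.ack_pending = False
            return ("duplicate ACK", self.expected)

        had_gap = bool(self.buffered)
        self.expected += length
        # Absorb buffered segments that are now contiguous.
        while self.expected in self.buffered:
            self.expected += self.buffered.pop(self.expected)

        if had_gap:
            # Rule 4: segment starts at the lower end of the gap.
            self.ack_pending = False
            return ("immediate ACK", self.expected)
        if self.ack_pending:
            # Rule 2: one cumulative ACK covers both in-order segments.
            self.ack_pending = False
            return ("cumulative ACK", self.expected)
        # Rule 1: hold the ACK, hoping a second segment arrives soon.
        self.ack_pending = True
        return ("delayed ACK", self.expected)
```

Feeding it segments starting at 100, 120, 160 and then 140 (20 bytes each) exercises rules 1, 2, 3 and 4 in that order.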

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of Congestion Control
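A small numeric sketch of the doubling rule. The base formula `EstimatedRTT + 4 * DevRTT` is assumed here to be the one "described previously"; the function names are hypothetical.

```python
def base_timeout(estimated_rtt, dev_rtt):
    """Timeout interval used when the timer is (re)started normally.
    Assumes the usual EstimatedRTT + 4 * DevRTT rule."""
    return estimated_rtt + 4 * dev_rtt


def backoff_schedule(base, consecutive_timeouts):
    """Timeout intervals used for the 1st, 2nd, ... consecutive timeout:
    each one is double the previous."""
    return [base * 2 ** k for k in range(consecutive_timeouts)]


# With a 1 s base interval, four consecutive timeouts wait 1, 2, 4, 8 seconds.
schedule = backoff_schedule(1.0, 4)
```

Once new data is sent or an ACK arrives, the interval would fall back to `base_timeout(...)` rather than continuing from the doubled value.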

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet
Detect lost segments via duplicate ACKs:
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {
            /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y == 3) {
                /* fast retransmit */
                resend segment with sequence number y
            }
        }
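The ACK-handling event above can be made executable as a rough sketch. The class `FastRetransmitSender` and its fields are hypothetical; retransmission is only recorded in a list, and the timer is reduced to a boolean.

```python
class FastRetransmitSender:
    """Sketch of the sender's ACK handler with fast retransmit."""

    def __init__(self, send_base, next_seq_num):
        self.send_base = send_base        # oldest unACKed byte
        self.next_seq_num = next_seq_num  # next byte to be sent
        self.dup_acks = 0                 # duplicate ACKs seen for send_base
        self.timer_running = send_base < next_seq_num
        self.retransmitted = []           # seq numbers resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data ACKed: advance the window, reset the duplicate count.
            self.send_base = y
            self.dup_acks = 0
            # (Re)start the timer only if unACKed segments remain.
            self.timer_running = self.send_base < self.next_seq_num
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                # Fast retransmit: resend before the timer expires.
                self.retransmitted.append(y)
```

Note that the segment is resent exactly once, on the third duplicate; further duplicates for the same `y` only increase the counter.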

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  This looks a lot like Selective Repeat

TCP is a hybrid

TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose was to handle congestion in the network
  (But both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
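A worked sketch of the two sides of this rule. The function names and the byte counts are illustrative assumptions; the variable names follow the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises:
    RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, n_bytes, window):
    """Sender keeps unACKed data (including the new bytes) within RcvWindow."""
    return (last_byte_sent - last_byte_acked) + n_bytes <= window


# 4096-byte buffer, 3000 bytes received so far, app has read 1000 of them:
# 2000 bytes are still buffered, so 2096 bytes of room are advertised.
window = rcv_window(4096, 3000, 1000)
```

With 1000 bytes already in flight, the sender in this example could send another 1000 bytes but not another 1200.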

Technical Issue

Suppose RcvWindow=0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow>0
Receiver never sends RcvWindow>0, since it has no new ACKs to send to Sender
DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from receiver B. At some point the receiver's buffer will empty and RcvWindow>0 will be transmitted back to sender.
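A toy model of why the one-byte probes break the deadlock. Everything here is an assumption made for illustration: time is split into ticks, the sender sends one probe per tick, and `reads_per_tick` lists how many bytes the receiving app drains before each probe arrives.

```python
def ticks_until_window_opens(rcv_buffer, buffered, reads_per_tick):
    """Return (tick, advertised_window) for the first probe whose ACK
    reports RcvWindow > 0, or None if the window never opens."""
    for tick, nread in enumerate(reads_per_tick, start=1):
        # The application drains some bytes from the receive buffer.
        buffered = max(0, buffered - nread)
        # The probe's ACK carries the current spare room.
        window = rcv_buffer - buffered
        if window > 0:
            return tick, window
    return None
```

Without the probes the receiver would have no segment to ACK, so the non-zero window computed on the third tick of `ticks_until_window_opens(1000, 1000, [0, 0, 300])` would never reach the sender.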

Note on UDP

UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost.

TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that client is synchronized

Message sequence:
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
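The sequence and acknowledgement numbers of the three segments can be traced with a short sketch. Segments are modeled as plain dictionaries and the function name is hypothetical; no real sockets are involved.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as dictionaries."""
    # Step 1: client picks its initial sequence number.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server ACKs the SYN and announces its own initial sequence number.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client ACKs the server's SYN; SYN=0 from here on.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]


# Example with arbitrary initial sequence numbers.
segments = three_way_handshake(client_isn=1000, server_isn=5000)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm they are synchronized.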

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server closing timeline. Messages: client FIN, server ACK, server FIN, client ACK. Labels: close (client), close (server), timed wait, closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed-wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

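[Figure: client/server closing timeline. Messages: client FIN, server ACK, server FIN, client ACK. Labels: closing (client), closing (server), timed wait, closed (client), closed (server).]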
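The four closing steps can be read as two small state machines. The sketch below is an assumption-laden illustration: state names follow the standard TCP state names (FIN_WAIT_1, TIME_WAIT, and so on) rather than the labels in the figure, the event strings are made up for this example, and simultaneous FINs are not handled.

```python
# (state, event) -> next state, for the side that closes first (client here).
CLIENT = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",    # Step 1: send FIN
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",  # server ACKed our FIN
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",   # Step 3: send ACK, timed wait
    ("TIME_WAIT", "timeout"): "CLOSED",        # closes down after timed wait
}

# Transitions for the side that receives the first FIN (server here).
SERVER = {
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # Step 2: send ACK
    ("CLOSE_WAIT", "close"): "LAST_ACK",        # Step 2: send own FIN
    ("LAST_ACK", "recv ACK"): "CLOSED",         # Step 4
}


def run(table, state, events):
    """Apply events in order and return the final state."""
    for event in events:
        state = table[(state, event)]
    return state
```

For example, `run(CLIENT, "ESTABLISHED", ["close", "recv ACK", "recv FIN", "timeout"])` walks the client through to `"CLOSED"`.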

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                                                  A few special cases

                                                                                                  Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                  It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(λin = original data rate; λ'in = offered load, i.e. original data plus retransmissions; λout = goodput)

(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when packet dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded":
    sender should use available bandwidth
  if sender's path congested:
    sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
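The throughput relation above can be checked with a few lines of Python. This is only a sketch: the function name `window_throughput` and the sample values are illustrative assumptions, not part of the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes each are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Illustrative values: 10 segments of 500 bytes, RTT of 200 msec
rate = window_throughput(10, 500, 0.2)
print(f"{rate:.0f} bytes/sec = {rate * 8 / 1000:.0f} kbps")
# prints: 25000 bytes/sec = 200 kbps
```

Doubling w doubles the rate, which is why adjusting CongWin is enough to control the sending rate.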

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$
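A minimal sketch of how a sender could apply this limit. The helper name `bytes_allowed` is hypothetical; real TCP implementations also take the receive window into account, which the slide sets aside by assuming RcvBuffer is large.

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """Bytes that may still be sent while keeping
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    # If CongWin was just reduced, in_flight can exceed it: send nothing.
    return max(0, cong_win - in_flight)


print(bytes_allowed(3000, 1000, 4000))  # 2000 bytes in flight, 2000 more allowed
print(bytes_allowed(6000, 1000, 4000))  # window already exceeded, 0 allowed
```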

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: Long-lived TCP connection. Congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time.]
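The sawtooth of a long-lived connection can be reproduced with a toy simulation. It is a sketch under simplifying assumptions that are not in the slides: the window is measured in Kbytes with MSS = 1 Kbyte, and a loss is assumed to occur every time the window reaches a fixed ceiling.

```python
def aimd_trace(start, ceiling, rtts, mss=1.0):
    """CongWin at the start of each RTT under pure AIMD.

    Assumption: a loss event happens whenever CongWin reaches `ceiling`.
    """
    cong_win = float(start)
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= ceiling:
            cong_win /= 2        # multiplicative decrease after a loss event
        else:
            cong_win += mss      # additive increase: 1 MSS per RTT
    return trace


trace = aimd_trace(start=12, ceiling=24, rtts=30)
print(trace)
```

The window climbs linearly from 12 to 24, drops back to 12, and repeats.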

                                                                                                  TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                                                  TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. Host A sends one segment, then two segments, then four segments, one burst per RTT.]
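The following sketch shows why adding one MSS per ACK doubles CongWin every RTT. It assumes every segment is acknowledged individually (no delayed ACKs), no loss and no threshold; the function name and the 500-byte default MSS are illustrative.

```python
def slow_start_rounds(rounds, mss=500):
    """CongWin in bytes at the start of each RTT during slow start."""
    cong_win = mss
    history = []
    for _ in range(rounds):
        history.append(cong_win)
        acks = cong_win // mss       # one ACK per segment sent in this RTT
        for _ in range(acks):
            cong_win += mss          # CongWin grows by 1 MSS per ACK
    return history


windows = slow_start_rounds(4)
print(windows)                       # [500, 1000, 2000, 4000]
print([w // 500 for w in windows])   # segments per RTT: [1, 2, 4, 8]
```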

So far:
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                  The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
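The event/state/action rows above map naturally onto a small state machine. The class below is a teaching sketch, not a real TCP stack: the class and method names are hypothetical, CongWin is kept in units of one MSS, and the initial Threshold of 64 stands in for "initially very large".

```python
from dataclasses import dataclass

MSS = 1  # work in units of one MSS


@dataclass
class RenoSender:
    cong_win: float = 1.0
    threshold: float = 64.0   # placeholder for "initially very large"
    state: str = "SS"         # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(cong_win=8.0)
s.on_timeout()
print(s.state, s.cong_win, s.threshold)   # SS 1 4.0
```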

                                                                                                  TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2·RTT)
  Average throughput: 0.75 W/RTT
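The 0.75 factor is simply the mean of a window that climbs linearly from W/2 to W. A small numerical check, assuming W is even and the window grows by one segment per RTT (the function name is illustrative):

```python
def average_sawtooth_window(w_max):
    """Mean window over one AIMD cycle rising from W/2 to W in steps of 1."""
    windows = range(w_max // 2, w_max + 1)
    return sum(windows) / len(windows)


W = 100
print(average_sawtooth_window(W) / W)   # 0.75
```

Dividing the mean window by RTT gives the average throughput of 0.75 W/RTT in segments per second.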

                                                                                                  TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot \text{MSS}}{\text{RTT}\,\sqrt{L}}$$

➜ L = 2·10^-10. Wow!
New versions of TCP for high-speed needed
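Both numbers in this example follow from the formulas already given: W = throughput · RTT / MSS, and L obtained by inverting throughput = 1.22 · MSS / (RTT · √L). A short sketch (function names are illustrative):

```python
def window_for_rate(rate_bps, mss_bytes, rtt_s):
    """In-flight segments needed to sustain rate_bps."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def loss_rate_for(rate_bps, mss_bytes, rtt_s):
    """Loss rate L solving rate = 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


print(int(window_for_rate(10e9, 1500, 0.1)))      # 83333
print(f"{loss_rate_for(10e9, 1500, 0.1):.1e}")    # 2.1e-10
```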

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]

Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal-share line.]
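The convergence argument can be simulated directly. This sketch assumes, beyond what the slide states, that both connections have the same RTT and both see a loss whenever their combined rate exceeds the link capacity. Additive increase leaves the gap between them unchanged, while every halving cuts the gap in half.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Throughputs of two AIMD connections sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2      # loss: both decrease by factor of 2
        else:
            x1, x2 = x1 + 1, x2 + 1      # congestion avoidance: both add 1
    return x1, x2


x1, x2 = two_flow_aimd(10.0, 70.0, capacity=100.0, rounds=1000)
print(abs(x1 - x2))   # the initial gap of 60 has shrunk to nearly zero
```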

Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets R/2
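The arithmetic behind this example, assuming the link is split equally among all TCP connections (the helper name is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    `new_connections` next to `existing_connections` others."""
    return Fraction(new_connections, existing_connections + new_connections)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 connections the app holds 11 of 20 equal shares, slightly more than the R/2 quoted on the slide.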

TCP Latency Modeling

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

                                                                                                  Fixed congestion window (1)

First case: WS/R > RTT + S/R
ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

                                                                                                  Fixed congestion window (2)

Second case: WS/R < RTT + S/R
wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
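Both fixed-window cases fit in one function. This is a sketch: K is taken to be the number of windows that cover the object, computed here as ceil(O / (W·S)), which is an assumption since the slides do not spell out K at this point.

```python
import math


def fixed_window_latency(O, S, R, W, RTT):
    """Latency for an object of O bits with a fixed window of W segments.

    O, S in bits; R in bits/sec; RTT in seconds.
    """
    K = math.ceil(O / (W * S))            # assumed: windows covering the object
    stall = S / R + RTT - W * S / R       # wait per window before the ACK arrives
    if stall <= 0:
        return 2 * RTT + O / R            # case 1: sender never stalls
    return 2 * RTT + O / R + (K - 1) * stall   # case 2


# S/R = 1 sec, RTT = 2 sec, object of 8 segments
print(fixed_window_latency(O=8000, S=1000, R=1000, W=4, RTT=2.0))  # 12.0
print(fixed_window_latency(O=8000, S=1000, R=1000, W=2, RTT=2.0))  # 15.0
```

With W = 4 the window takes 4 sec to send, longer than RTT + S/R = 3 sec, so the sender never waits. With W = 2 it stalls 1 sec after each of the first K - 1 = 3 windows.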

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,\mathrm{RTT} + \frac{O}{R} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline (time at client, time at server). The client initiates the TCP connection and requests the object, each taking one RTT. The server then sends the first window = S/R, second window = 2S/R, third window = 4S/R and fourth window = 8S/R, until transmission is complete and the object is delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

TCP Latency Modeling (3)

$\frac{S}{R} + \mathrm{RTT}$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{k=1}^{P}\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: the same client/server timeline as on the previous slide, with windows of S/R, 2S/R, 4S/R and 8S/R.]
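The derivation can be checked by walking through the windows one at a time and adding up the positive idle times. The sketch below assumes integer O and S in bits; the choice RTT = 2·S/R is an assumption made only so that the result reproduces the K = 4, P = 2 example with O/S = 15, since the slides do not give an RTT for it.

```python
def slow_start_latency(O, S, R, RTT):
    """Latency of one object under slow start (no threshold, no loss).

    Returns (latency, K, P): K windows cover the object and the server
    idles P times.
    """
    segments_left = -(-O // S)            # ceil(O / S) for integer inputs
    latency = 2 * RTT + O / R             # connection setup + request + transmission
    k, idles = 0, 0
    while segments_left > 0:
        k += 1
        window = 2 ** (k - 1)             # the kth window holds 2^(k-1) segments
        segments_left -= min(window, segments_left)
        if segments_left > 0:             # no idle period after the last window
            idle = S / R + RTT - window * S / R
            if idle > 0:
                latency += idle
                idles += 1
    return latency, k, idles


S, R, RTT = 1000, 1000, 2.0               # S/R = 1 sec, RTT = 2 sec (assumed)
latency, K, P = slow_start_latency(15 * S, S, R, RTT)
closed_form = 2 * RTT + 15 * S / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
print(latency, K, P, closed_form)         # 22.0 4 2 22.0
```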

                                                                                                  TCP Latency Modeling (4)Recall K = number of windows that cover object

                                                                                                  How do we calculate K

                                                                                                  ⎥⎥⎤

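$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$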

Calculation of Q, the number of idles for an infinite-size object, is similar.
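As a sanity check, the closed form K = ceil(log2(O/S + 1)) can be compared against a direct search for the smallest k whose doubling windows cover the object. This is only a sketch: the function names are hypothetical, and O and S are taken to be the object size and segment size in bits.

```python
import math


def windows_closed_form(object_bits: int, segment_bits: int) -> int:
    """K = ceil(log2(O/S + 1)), the closed form from the latency model."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


def windows_brute_force(object_bits: int, segment_bits: int) -> int:
    """Smallest k with S * (2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    k = 0
    covered = 0
    while covered < object_bits:
        covered += (2 ** k) * segment_bits  # the (k+1)-th window holds 2^k segments
        k += 1
    return k


# Illustrative values (assumptions): 100,000-bit object, 536-byte segments.
O, S = 100_000, 536 * 8
print(windows_closed_form(O, S), windows_brute_force(O, S))
```

Both functions should agree whenever O/S + 1 is not so close to a power of two that floating-point rounding matters.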


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
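The three response-time formulas can be compared numerically. The sketch below is not part of the original slides: the function name is hypothetical, the sum of idle times is treated as a single parameter defaulting to zero (a simplification, since it differs between the three schemes), and 5 Kbytes is taken as 40,000 bits.

```python
def http_response_times(rate_bps: float, obj_bits: float = 40_000,
                        images: int = 10, parallel: int = 5,
                        rtt_s: float = 0.1, idle_s: float = 0.0) -> dict:
    """Response time in seconds for each HTTP scheme.

    idle_s stands in for the "sum of idle times" term (assumed 0 here).
    """
    if images % parallel != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (images + 1) * obj_bits / rate_bps  # (M+1) O/R
    return {
        "non-persistent": transmit + (images + 1) * 2 * rtt_s + idle_s,
        "persistent": transmit + 3 * rtt_s + idle_s,
        "parallel non-persistent":
            transmit + (images // parallel + 1) * 2 * rtt_s + idle_s,
    }


for rate in (28_000, 100_000, 1_000_000, 10_000_000):
    print(rate, http_response_times(rate))
```

With these assumptions the transmission term (M+1)O/R dominates at 28 Kbps, while at 10 Mbps the RTT terms account for nearly all of the response time.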


HTTP Response time (in seconds): RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time is dominated by transmission time.
Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds): RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment & slow start delays.
Persistent connections now give an important improvement, particularly in networks with a high delay × bandwidth product.


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                  • Chapter 3 Transport Layer last revised 160305
                                                                                                  • Chapter 3 outline
                                                                                                  • Transport services and protocols
                                                                                                  • Transport vs network layer
                                                                                                  • Transport-layer protocols
                                                                                                  • Chapter 3 outline
                                                                                                  • Multiplexing/demultiplexing
                                                                                                  • Multiplexing/demultiplexing
                                                                                                  • How demultiplexing works
                                                                                                  • Connectionless demultiplexing
                                                                                                  • Connectionless demux (cont)
                                                                                                  • Connection-oriented demux
                                                                                                  • Connection-oriented demux (cont)
                                                                                                  • Connection-oriented demux Threaded Web Server
                                                                                                  • Chapter 3 outline
                                                                                                  • UDP User Datagram Protocol [RFC 768]
                                                                                                  • UDP more
                                                                                                  • UDP checksum
                                                                                                  • Chapter 3 outline
                                                                                                  • Principles of Reliable data transfer
                                                                                                  • Reliable data transfer getting started
                                                                                                  • Reliable data transfer getting started
                                                                                                  • Incremental Improvements
                                                                                                  • rdt1.0: reliable transfer over a reliable channel
                                                                                                  • rdt2.0: channel with bit errors
                                                                                                  • rdt2.0: FSM specification
                                                                                                  • rdt2.0: operation with no errors
                                                                                                  • rdt2.0: error scenario
                                                                                                  • rdt2.0 has a fatal flaw
                                                                                                  • rdt2.1: sender, handles garbled ACK/NAKs
                                                                                                  • rdt2.1: receiver, handles garbled ACK/NAKs
                                                                                                  • rdt2.1: discussion
                                                                                                  • rdt2.2: a NAK-free protocol
                                                                                                  • rdt2.2: sender, receiver fragments
                                                                                                  • rdt3.0: channels with errors and loss
                                                                                                  • rdt3.0 sender
                                                                                                  • rdt3.0 in action
                                                                                                  • rdt3.0 in action
                                                                                                  • Performance of rdt3.0
                                                                                                  • rdt3.0: stop-and-wait operation
                                                                                                  • Pipelined protocols
                                                                                                  • Pipelined protocols
                                                                                                  • Pipelining increased utilization
                                                                                                  • Go-Back-N
                                                                                                  • GBN Sender
                                                                                                  • GBN sender extended FSM
                                                                                                  • GBN receiver extended FSM
                                                                                                  • More on receiver
                                                                                                  • GBN in action
                                                                                                  • Selective Repeat
                                                                                                  • Selective repeat sender receiver windows
                                                                                                  • Selective repeat
                                                                                                  • Selective repeat in action
                                                                                                  • Selective repeat dilemma
                                                                                                  • Chapter 3 outline
                                                                                                  • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                  • More TCP Details
                                                                                                  • Even More TCP Details
                                                                                                  • TCP segment structure
                                                                                                  • TCP seq. #'s and ACKs
                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                  • Example RTT estimation
                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                  • Chapter 3 outline
                                                                                                  • TCP reliable data transfer
                                                                                                  • TCP sender events
                                                                                                  • TCP sender(simplified)
                                                                                                  • TCP retransmission scenarios
                                                                                                  • TCP retransmission scenarios (more)
                                                                                                  • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                  • More on Sender Policies
                                                                                                  • Fast Retransmit
                                                                                                  • Fast retransmit algorithm
                                                                                                  • TCP GBN or Selective Repeat
                                                                                                  • Chapter 3 outline
                                                                                                  • TCP Flow Control
                                                                                                  • TCP Flow Control
                                                                                                  • TCP segment structure
                                                                                                  • TCP Flow control how it works
                                                                                                  • Technical Issue
                                                                                                  • Chapter 3 outline
                                                                                                  • TCP Connection Management
                                                                                                  • TCP Connection Management (cont)
                                                                                                  • TCP Connection Management (cont)
                                                                                                  • TCP Connection Management (cont)
                                                                                                  • TCP Connection Management (cont)
                                                                                                  • A few special cases
                                                                                                  • Chapter 3 outline
                                                                                                  • Principles of Congestion Control
                                                                                                  • Causes/costs of congestion: scenario 1
                                                                                                  • Causes/costs of congestion: scenario 2
                                                                                                  • Causes/costs of congestion: scenario 3
                                                                                                  • Causes/costs of congestion: scenario 3
                                                                                                  • Approaches towards congestion control
                                                                                                  • Case study ATM ABR congestion control
                                                                                                  • Case study ATM ABR congestion control
                                                                                                  • Chapter 3 outline
                                                                                                  • TCP Congestion Control
                                                                                                  • TCP AIMD
                                                                                                  • TCP Slow Start
                                                                                                  • TCP Slow Start (more)
                                                                                                  • Summary TCP Congestion Control
                                                                                                  • The Big Picture
                                                                                                  • TCP sender congestion control
                                                                                                  • TCP throughput
                                                                                                  • TCP Futures
                                                                                                  • TCP Fairness
                                                                                                  • Why is TCP fair
                                                                                                  • Fairness (more)
                                                                                                  • TCP Latency Modeling
                                                                                                  • Fixed Congestion Window (W)
                                                                                                  • Fixed congestion window (1)
                                                                                                  • Fixed congestion window (2)
                                                                                                  • TCP Latency Modeling Slow Start (1)
                                                                                                  • TCP Latency Modeling Slow Start (2)
                                                                                                  • TCP Latency Modeling (3)
                                                                                                  • TCP Latency Modeling (4)
                                                                                                  • HTTP Modeling
                                                                                                  • Chapter 3 Summary


GBN in action

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission of only the required packets.
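To make the cost concrete, the following sketch estimates how many packets fit in the pipe and how many are resent after a single loss in the worst case, where the oldest packet of a full window is lost and all later ones arrive. The link numbers (1 Gbps, 30 ms RTT, 8000-bit packets) and function names are illustrative assumptions, not values from this slide.

```python
def packets_in_flight(rate_bps: int, rtt_ms: int, packet_bits: int) -> int:
    """Whole packets that fit in one bandwidth-delay product."""
    return (rate_bps * rtt_ms) // (1000 * packet_bits)


def resent_after_one_loss(window: int, protocol: str) -> int:
    """Packets retransmitted when the oldest packet of a full window is lost."""
    if protocol == "GBN":
        return window  # receiver discarded everything after the gap
    if protocol == "SR":
        return 1       # receiver buffered the rest; only the gap is resent
    raise ValueError(f"unknown protocol: {protocol}")


window = packets_in_flight(1_000_000_000, 30, 8000)
print("window to fill the pipe:", window)
print("GBN resends:", resent_after_one_loss(window, "GBN"))
print("SR resends:", resent_after_one_loss(window, "SR"))
```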


Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer

sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  Compare to GBN, which only had a timer for the base packet

sender window
  N consecutive seq #'s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range


Selective repeat: sender, receiver windows


Selective repeat

sender
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N]:
    mark pkt n as received
    if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a reACK)
  otherwise:
    ignore
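The sender and receiver event rules can be sketched as two small classes. This is a simplified illustration, not the slides' own code: class and method names are hypothetical, sequence numbers are unbounded integers (no wraparound), timers are left to the caller, and the sender accepts ACKs only for packets it has actually sent.

```python
class SRSender:
    def __init__(self, window: int):
        self.N = window
        self.base = 0          # smallest unACKed seq #
        self.next_seq = 0      # next unused seq #
        self.acked = set()     # ACKed packets at or above base

    def send(self):
        """Data from above: return the seq # to transmit, or None if window is full."""
        if self.next_seq < self.base + self.N:
            n = self.next_seq
            self.next_seq += 1
            return n
        return None

    def on_timeout(self, n: int) -> int:
        """Only pkt n is resent; the caller restarts its timer."""
        return n

    def on_ack(self, n: int) -> None:
        if self.base <= n < self.next_seq:
            self.acked.add(n)
            while self.base in self.acked:   # slide base past ACKed pkts
                self.acked.remove(self.base)
                self.base += 1


class SRReceiver:
    def __init__(self, window: int):
        self.N = window
        self.base = 0          # next seq # expected in order
        self.buffer = {}       # out-of-order pkts awaiting delivery
        self.delivered = []    # data handed to the upper layer, in order

    def receive(self, n: int, data):
        """Return the ACK number to send, or None to ignore the packet."""
        if self.base <= n <= self.base + self.N - 1:
            self.buffer.setdefault(n, data)
            while self.base in self.buffer:  # deliver any in-order run
                self.delivered.append(self.buffer.pop(self.base))
                self.base += 1
            return n
        if self.base - self.N <= n <= self.base - 1:
            return n                         # reACK: earlier ACK may be lost
        return None
```

A real implementation would also need one timer per unACKed packet and modular sequence-number arithmetic.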


                                                                                                    Selective repeat in action


Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
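One way to explore the question is to check whether the receiver's next window can contain a sequence number the sender might still retransmit from the previous window, which happens when every ACK for a full window is lost. The sketch below (function name hypothetical) suggests the usual answer: the window size must be at most half the sequence number space.

```python
def is_ambiguous(seq_space: int, window: int) -> bool:
    """True if old-window retransmissions can be mistaken for new data."""
    # Seq #s the sender may resend if every ACK for the first window is lost.
    old_window = {n % seq_space for n in range(window)}
    # Seq #s the receiver accepts as new after delivering the first window.
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


print(is_ambiguous(4, 3))  # the slide's example: seq #s 0..3, window size 3
print(is_ambiguous(4, 2))
```

For the slide's example (4 sequence numbers, window size 3) the windows overlap at 0 and 1, so a retransmitted pkt 0 is indistinguishable from a new one; with window size 2 they do not overlap.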


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

[Figure: application writes data through the socket door into the TCP send buffer; segments travel to the TCP receive buffer; application reads data through the socket door]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment
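The relationship between MSS, header, and segment size can be illustrated by splitting a byte string into MSS-sized payloads. This is a sketch under stated assumptions: the function name is hypothetical, the header is taken as 20 bytes (no options), and an MSS of 1460 bytes is only an example value.

```python
TCP_HEADER_BYTES = 20  # assumption: TCP header with no options


def split_into_payloads(data: bytes, mss: int) -> list:
    """Cut application data into chunks of at most `mss` bytes each."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [data[i:i + mss] for i in range(0, len(data), mss)]


payloads = split_into_payloads(b"x" * 3000, mss=1460)
payload_sizes = [len(p) for p in payloads]
segment_sizes = [len(p) + TCP_HEADER_BYTES for p in payloads]
print(payload_sizes)   # application data per segment
print(segment_sizes)   # application data + TCP header
```

Note that the MSS limits the application data only; each segment on the wire is larger by the header size.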

Three-way handshake
  Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
  Server responds with a second special TCP segment (again no payload).
  Client responds with a third special segment. This can contain payload.


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP only exists in the two end machines
  No buffers and variables are allocated to the connection in any of the network elements between the host and server


TCP segment structure

Header fields, in order (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Notes on the fields:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)


TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; up to implementer

Simple telnet scenario (time runs downward):
  User at Host A types 'C'
  Host A -> Host B: Seq=42, ACK=79, data = 'C'
  Host B ACKs receipt of 'C', echoes back 'C'
  Host B -> Host A: Seq=79, ACK=43, data = 'C'
  Host A ACKs receipt of echoed 'C'
  Host A -> Host B: Seq=43, ACK=80
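
The telnet exchange can be traced with a short sketch. The `Segment` and `Endpoint` names are hypothetical, introduced only for illustration; the starting values 42 and 79 are taken from the scenario, and the model tracks nothing except sequence and ACK numbers.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int           # byte-stream number of the first data byte
    ack: int           # seq # of the next byte expected from the other side
    data: bytes = b""


class Endpoint:
    """Hypothetical endpoint that tracks only seq/ACK state."""

    def __init__(self, next_seq, next_expected):
        self.next_seq = next_seq            # seq # of the next byte we send
        self.next_expected = next_expected  # seq # of the next byte we expect

    def send(self, data=b""):
        seg = Segment(self.next_seq, self.next_expected, data)
        self.next_seq += len(data)          # counting is by bytes, not segments
        return seg

    def receive(self, seg):
        if seg.seq == self.next_expected:   # accept in-order data only
            self.next_expected += len(seg.data)


host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

s1 = host_a.send(b"C")   # user types 'C'
host_b.receive(s1)
s2 = host_b.send(b"C")   # B ACKs 'C' and echoes it back
host_a.receive(s2)
s3 = host_a.send()       # A ACKs the echoed 'C'; no data

trace = [(s.seq, s.ack) for s in (s1, s2, s3)]
print(trace)
```

Each ACK value is one past the last byte received in order, which is why a one-byte segment with Seq=42 is answered with ACK=43.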


TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125
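
Unrolling the update shows why the influence of a past sample decreases exponentially. The subscripts are notation assumed here for illustration: n counts the samples folded in so far, and EstimatedRTT_0 is the starting estimate.

```latex
\begin{align*}
\text{EstimatedRTT}_n
  &= (1-\alpha)\,\text{EstimatedRTT}_{n-1} + \alpha\,\text{SampleRTT}_n \\
  &= \alpha \sum_{k=0}^{n-1} (1-\alpha)^k\,\text{SampleRTT}_{n-k}
     + (1-\alpha)^n\,\text{EstimatedRTT}_0
\end{align*}
```

A sample that is k measurements old carries weight α(1 - α)^k. With α = 0.125 the factor (1 - α)^5 is about 0.51, so a sample's weight roughly halves every five measurements.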


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350. Curves: SampleRTT and EstimatedRTT.]


TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|
  (typically, β = 0.25)

Then set timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
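
The three formulas can be combined into one small estimator. This is a minimal sketch, not a definitive implementation: the class name `RttEstimator` is hypothetical, and two choices are assumptions the slides do not state. The estimate is seeded with the first sample with DevRTT starting at zero, and DevRTT is updated before EstimatedRTT, so the deviation is measured against the previous estimate (the order RFC 6298 uses).

```python
class RttEstimator:
    """Sketch of the EstimatedRTT / DevRTT / TimeoutInterval updates."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed with the first SampleRTT, zero deviation.
        self.estimated_rtt = first_sample
        self.dev_rtt = 0.0

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation is taken against the previous estimate.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval

    @property
    def timeout_interval(self):
        # EstimatedRTT plus a safety margin of four deviations
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)   # milliseconds
timeout = estimator.update(120.0)
print(estimator.estimated_rtt, estimator.dev_rtt, timeout)
```

One sample of 120 ms against an estimate of 100 ms moves EstimatedRTT only to 102.5 ms, but raises the timeout to 122.5 ms because the deviation term is multiplied by four.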



TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer
Retransmissions are triggered by
  timeout events
  duplicate acks
Initially consider simplified TCP sender
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeoutInterval
timeout:
  retransmit segment that caused timeout
  restart timer
Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever)
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase)
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer

end of loop forever

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
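
The simplified sender can also be written as runnable Python. This is a sketch under stated assumptions, not a real TCP implementation: the class and method names are hypothetical, the timer is reduced to a boolean flag, passing a segment to IP is modelled by appending to a list, and ACK values are assumed to fall on segment boundaries.

```python
class SimplifiedTcpSender:
    """Sketch of the simplified sender: no duplicate-ACK handling,
    no flow control, no congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq # -> data not yet acknowledged
        self.timer_running = False  # stands in for a real timer
        self.passed_to_ip = []      # (seq #, length) of every segment sent

    def _pass_to_ip(self, seq, data):
        self.passed_to_ip.append((seq, len(data)))

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self._pass_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)  # smallest not-yet-acknowledged seq #
            self._pass_to_ip(seq, self.unacked[seq])
        self.timer_running = bool(self.unacked)

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Assumption: y lies on a segment boundary, so every
            # segment starting below y is fully acknowledged.
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


# One segment sent, its ACK lost, timeout, retransmission, ACK arrives
lost_ack = SimplifiedTcpSender(92)
lost_ack.on_app_data(b"x" * 8)
lost_ack.on_timeout()
lost_ack.on_ack(100)

# Two segments sent; only the second ACK (120) arrives
cumulative = SimplifiedTcpSender(92)
cumulative.on_app_data(b"x" * 8)
cumulative.on_app_data(b"y" * 20)
cumulative.on_ack(120)
print(lost_ack.passed_to_ip, cumulative.passed_to_ip)
```

In the second run the single ACK=120 advances SendBase past both segments, so nothing is retransmitted even though the ACK for the first segment never arrived.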


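TCP retransmission scenarios

Lost ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100 (lost, marked X)
  Host A: Seq=92 timeout
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host B -> Host A: ACK=100
  Host A: SendBase = 100

Premature timeout scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100, then ACK=120
  Host A: Seq=92 timeout fires before ACK=100 arrives
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host A: SendBase = 100 once ACK=100 arrives, then SendBase = 120 once ACK=120 arrives
  Host B -> Host A: ACK=120 (in response to the retransmitted segment)
  Host A: SendBase = 120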


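TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100 (lost, marked X)
  Host B -> Host A: ACK=120 (arrives before the timeout)
  Host A: SendBase = 120, so neither segment is retransmitted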


TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver -> TCP Receiver action

Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
  -> Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Arrival of in-order segment with expected seq #. One other segment has ACK pending
  -> Immediately send single cumulative ACK, ACKing both in-order segments

Arrival of out-of-order segment with higher-than-expected seq #. Gap detected
  -> Immediately send duplicate ACK, indicating seq # of next expected byte

Arrival of segment that partially or completely fills gap
  -> Immediately send ACK, provided that segment starts at lower end of gap
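
The four rows of the table can be expressed as a single decision function. This is a hedged sketch: the names `AckAction` and `ack_action` and the two boolean parameters are hypothetical. The table says nothing about a segment that arrives below the expected sequence number; treating it like a duplicate is an assumption made here.

```python
from enum import Enum


class AckAction(Enum):
    DELAYED_ACK = "wait up to 500 ms for next segment, then send ACK"
    CUMULATIVE_ACK = "immediately send single cumulative ACK"
    DUPLICATE_ACK = "immediately send duplicate ACK for next expected byte"
    IMMEDIATE_ACK = "immediately send ACK"


def ack_action(seq, expected_seq, ack_pending, gap_buffered):
    """Pick the receiver action for an arriving segment.

    ack_pending:  an earlier in-order segment is still waiting on a delayed ACK
    gap_buffered: out-of-order data above expected_seq is already buffered
    """
    if seq > expected_seq:
        return AckAction.DUPLICATE_ACK      # gap detected
    if seq < expected_seq:
        # Assumption: not covered by the table; re-ACK the expected byte.
        return AckAction.DUPLICATE_ACK
    if gap_buffered:
        return AckAction.IMMEDIATE_ACK      # starts at lower end of the gap
    if ack_pending:
        return AckAction.CUMULATIVE_ACK     # ACKs both in-order segments
    return AckAction.DELAYED_ACK


in_order_first = ack_action(100, 100, ack_pending=False, gap_buffered=False)
in_order_second = ack_action(100, 100, ack_pending=True, gap_buffered=False)
out_of_order = ack_action(120, 100, ack_pending=False, gap_buffered=False)
gap_filler = ack_action(100, 100, ack_pending=False, gap_buffered=True)
print(in_order_first.name, in_order_second.name, out_of_order.name, gap_filler.name)
```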


More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of Congestion Control
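
The doubling rule is easy to see in a few lines of code. This is a minimal sketch: the class `RetransmissionTimer` and its method names are hypothetical, and the computed interval is passed in as a plain number rather than derived from live RTT samples.

```python
class RetransmissionTimer:
    """Sketch of timeout doubling with reset on new data or ACK."""

    def __init__(self, computed_interval):
        # computed_interval stands for EstimatedRTT + 4 * DevRTT
        self.computed_interval = computed_interval
        self.current_interval = computed_interval

    def on_timeout(self):
        # After retransmitting, double the interval used for the next timer
        self.current_interval *= 2

    def on_new_data_or_ack(self):
        # Timer restarted for another reason: fall back to the computed value
        self.current_interval = self.computed_interval


timer = RetransmissionTimer(computed_interval=0.5)   # seconds
history = [timer.current_interval]
for _ in range(3):                                   # three consecutive timeouts
    timer.on_timeout()
    history.append(timer.current_interval)
timer.on_new_data_or_ack()
print(history, timer.current_interval)
```

Three consecutive timeouts stretch a 0.5 s interval to 4 s, which is why the slide calls this a limited form of congestion control: a sender that keeps timing out retransmits less and less often.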


Fast Retransmit

Time-out period often relatively long
  long delay before resending lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
  fast retransmit: resend segment before timer expires


Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase)
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  else    (a duplicate ACK for already ACKed segment)
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3)
      resend segment with sequence number y    (fast retransmit)
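
The ACK handler with fast retransmit can be sketched in Python as follows. The class and attribute names are hypothetical, timer handling is left out, and a retransmission is modelled by recording the sequence number. The sketch counts only ACKs equal to SendBase as duplicates and ignores stale ACKs below it; that narrowing is an assumption, since the pseudocode's else branch does not distinguish the two cases.

```python
class FastRetransmitSender:
    """Sketch of the ACK event with duplicate-ACK counting."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_ack_count = 0
        self.retransmitted = []   # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance and start counting afresh
            self.send_base = y
            self.dup_ack_count = 0
        elif y == self.send_base:
            # Duplicate ACK for already ACKed data
            self.dup_ack_count += 1
            if self.dup_ack_count == self.DUP_ACK_THRESHOLD:
                self.retransmitted.append(y)   # resend segment with seq # y
        # Assumption: ACKs below send_base are stale and ignored.


sender = FastRetransmitSender(send_base=100)
for _ in range(4):            # four duplicate ACKs for byte 100
    sender.on_ack(100)
resent_after_dups = list(sender.retransmitted)
sender.on_ack(120)            # the retransmission filled the gap
print(resent_after_dups, sender.send_base, sender.dup_ack_count)
```

The comparison is `==` rather than `>=`, so the fourth duplicate does not trigger a second retransmission of the same segment.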


TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN
Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  This looks a lot like Selective Repeat
TCP is a hybrid



TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer
flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure (repeated)

[Same header diagram as shown earlier. The field relevant to flow control is Receive window: # bytes rcvr willing to accept.]


                                                                                                    TCP Flow control how it works

                                                                                                    (Suppose TCP receiver discards out-of-order segments)spare room in buffer

                                                                                                    = RcvWindow= RcvBuffer-[LastByteRcvd -

                                                                                                    LastByteRead]

                                                                                                    Rcvr advertises spare room by including value of RcvWindow in segmentsSender limits unACKeddata to RcvWindow

                                                                                                    guarantees receive buffer doesnrsquot overflow
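The bookkeeping above can be sketched in a few lines of Java. This is only an illustration under simple assumptions: byte counters are plain `long` values with no wraparound, and the class and method names (`RcvWindowSketch`, `rcvWindow`, `sendable`) are made up for the example rather than taken from any TCP implementation.

```java
// Sketch only: names are illustrative, not from a real TCP stack.
final class RcvWindowSketch {

    /** Receiver side: spare room = RcvBuffer - (LastByteRcvd - LastByteRead). */
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Sender side: bytes that may still be sent while keeping unACKed data within RcvWindow. */
    static long sendable(long lastByteSent, long lastByteAcked, long rcvWindow) {
        long unacked = lastByteSent - lastByteAcked;
        return Math.max(0L, rcvWindow - unacked);
    }

    public static void main(String[] args) {
        long window = rcvWindow(4096, 3000, 1000); // 2000 bytes are buffered but unread
        System.out.println("RcvWindow = " + window);
        System.out.println("sendable  = " + sendable(5000, 4500, window));
    }
}
```

With a 4096-byte buffer holding 2000 unread bytes, the receiver would advertise 2096 bytes; a sender with 500 bytes already in flight could then send at most 1596 more.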

Technical Issue

• Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
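A toy simulation of this probing rule, assuming a deliberately simplified model: time advances in rounds, the buffer starts full, the application reads a given number of bytes in each round, and every one-byte probe is answered by an ACK carrying the current window. The class name and the round-based model are hypothetical.

```java
// Sketch only: a round-based toy model, not real TCP timer behaviour.
final class ZeroWindowProbeSketch {

    /**
     * Returns how many one-byte probes the sender transmits before an ACK
     * reports RcvWindow > 0, or -1 if the window never opens in the given rounds.
     */
    static int probesUntilWindowOpens(int rcvBuffer, int[] bytesReadPerRound) {
        int buffered = rcvBuffer; // buffer starts full, so RcvWindow = 0
        int probes = 0;
        for (int read : bytesReadPerRound) {
            buffered -= Math.min(read, buffered);  // application drains some data
            probes++;                              // sender transmits a one-byte probe
            int advertised = rcvBuffer - buffered; // window carried by the probe's ACK
            if (advertised > 0) {
                return probes;                     // sender learns it may resume
            }
        }
        return -1;
    }
}
```

Without the probes the sender would have no segment to trigger an ACK, so the reopened window would never be reported.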

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
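The drop-when-full behaviour can be pictured with a small counter-based model. It is a sketch under stated assumptions: the buffer is measured in whole packets, and the class `UdpReceiveBufferSketch` and its methods are hypothetical names, not part of the Java socket API.

```java
// Sketch only: counts packets instead of storing them.
final class UdpReceiveBufferSketch {
    private final int capacity; // buffer size in packets (hypothetical unit)
    private int queued = 0;
    private int lost = 0;

    UdpReceiveBufferSketch(int capacity) {
        this.capacity = capacity;
    }

    /** A packet arrives: append it if there is room, otherwise it is lost. */
    boolean arrive() {
        if (queued == capacity) {
            lost++;
            return false;
        }
        queued++;
        return true;
    }

    /** The application reads one packet, freeing one slot. */
    void read() {
        if (queued > 0) {
            queued--;
        }
    }

    int lost() { return lost; }

    int queued() { return queued; }
}
```

Nothing here tells the sender to slow down, which is the contrast with TCP's RcvWindow.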


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
• initialize TCP variables:
    • seq. #s
    • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq. #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and ack=server_isn+1
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• ack=server_isn+1 signals that the client is synchronized

[Figure: three-way handshake timeline]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
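The sequence and acknowledgement numbers exchanged in the three steps can be traced with a small sketch. The assumptions are loud ones: only four header fields are modelled, sequence-number wraparound is ignored, and `HandshakeSketch` and its `Segment` type are hypothetical names for illustration.

```java
// Sketch only: models just the header fields used during connection setup.
final class HandshakeSketch {

    static final class Segment {
        final boolean syn;      // SYN flag
        final boolean ackValid; // ACK flag
        final long seq;         // sequence number
        final long ack;         // acknowledgement number (meaningful only if ackValid)

        Segment(boolean syn, boolean ackValid, long seq, long ack) {
            this.syn = syn;
            this.ackValid = ackValid;
            this.seq = seq;
            this.ack = ack;
        }
    }

    /** Step 1: client -> server, SYN=1, seq=client_isn. */
    static Segment syn(long clientIsn) {
        return new Segment(true, false, clientIsn, 0L);
    }

    /** Step 2: server -> client, SYN=1, seq=server_isn, ack=client_isn+1. */
    static Segment synAck(Segment syn, long serverIsn) {
        return new Segment(true, true, serverIsn, syn.seq + 1);
    }

    /** Step 3: client -> server, SYN=0, seq=client_isn+1, ack=server_isn+1. */
    static Segment ack(Segment synAck) {
        return new Segment(false, true, synAck.ack, synAck.seq + 1);
    }
}
```

For example, with client_isn = 100 and server_isn = 5000, the SYNACK carries ack = 101 and the final ACK carries seq = 101, ack = 5001.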

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection-close timeline. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait; closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
• enters "timed wait", during which it will respond with an ACK to received FINs (which might arrive if the ACK gets lost)
• closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, can handle simultaneous FINs.

[Figure: connection-close timeline with states. Client: FIN sent (closing), ACK received, FIN received, ACK sent, timed wait, closed. Server: ACK sent, FIN sent (closing), ACK received, closed.]
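The client side of the four closing steps can be read as a small state machine. The sketch below is an assumption-laden illustration: the state names follow the conventional TCP state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT), which these slides do not spell out, and the event names are invented for the example. Simultaneous closes are not handled.

```java
// Sketch only: client-side close sequence, ignoring simultaneous FINs.
final class ClientCloseSketch {

    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }

    enum Event { APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRES }

    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED: // Step 1: application closes the socket, client sends FIN
                return e == Event.APP_CLOSE ? State.FIN_WAIT_1 : s;
            case FIN_WAIT_1:  // Step 2: server's ACK of the FIN arrives
                return e == Event.RCV_ACK ? State.FIN_WAIT_2 : s;
            case FIN_WAIT_2:  // Step 3: server's FIN arrives, client replies with ACK
                return e == Event.RCV_FIN ? State.TIME_WAIT : s;
            case TIME_WAIT:   // a repeated FIN is re-ACKed here; leave only when the wait expires
                return e == Event.TIMED_WAIT_EXPIRES ? State.CLOSED : s;
            default:
                return s;
        }
    }
}
```

Staying in TIME_WAIT on a repeated FIN mirrors the note above: if the client's ACK was lost, the server resends its FIN and the client must still be around to answer it.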

TCP Connection Management (cont.)

[Figures: example TCP server lifecycle; example TCP client lifecycle]

A few special cases

• We have not discussed what happens if both client and server decide to close down the connection at the same time.
• It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• manifestations:
    • lost packets (buffer overflow at routers)
    • long delays (queuing in router buffers)
• a top-10 problem!

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2

• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

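Notation: $\lambda_{in}$ is the rate of original data sent, $\lambda'_{in}$ the offered load (original data plus retransmissions), and $\lambda_{out}$ the rate delivered to the receiver.

• (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
• (a) "magic" transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
• (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"Costs" of congestion:
• (b) and (c): more work (retrans) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots of $\lambda_{out}$ versus offered load, panels (a), (b), (c)]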

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
    • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
    • NI bit: no increase in rate (mild congestion)
    • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
    • small exception: see next page

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
    • congested switch may lower ER value in cell
    • sender's send rate is thus the minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
    • signals congestion
    • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
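The effect of the ER field can be sketched as taking a minimum along the path. This is an illustration only: the rates are plain integers in an unspecified unit, every switch is assumed to lower ER to whatever it can support, and `ExplicitRateSketch` is a hypothetical name.

```java
// Sketch only: each switch may lower, but never raise, the ER value in the RM cell.
final class ExplicitRateSketch {

    /** Returns the ER value the sender sees after the RM cell has crossed every switch. */
    static int erAfterPath(int requestedRate, int[] supportableRates) {
        int er = requestedRate;
        for (int rate : supportableRates) {
            er = Math.min(er, rate); // a congested switch lowers ER to what it can support
        }
        return er;
    }
}
```

A request of 100 crossing switches that can support 80, 120 and 60 comes back as 60, the minimum supportable rate on the path.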


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion

[Figure: a window of Congwin segments]

• w segments, each with MSS bytes, sent in one RTT:
    throughput = (w × MSS) / RTT bytes/sec
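The throughput formula is easy to evaluate directly. In this sketch the numbers in the usage note are illustrative values chosen for the example, and `WindowThroughputSketch` is a hypothetical name.

```java
// Sketch only: evaluates throughput = w * MSS / RTT.
final class WindowThroughputSketch {

    /** Throughput in bytes per second for w segments of mssBytes each, sent per RTT. */
    static double bytesPerSecond(int w, int mssBytes, double rttSeconds) {
        return w * (double) mssBytes / rttSeconds;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerSecond(10, 500, 0.2) + " bytes/sec");
    }
}
```

With w = 10, MSS = 500 bytes and RTT = 0.2 sec, this gives about 25,000 bytes/sec; halving the window halves the rate.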

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using
    LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative after timeout events

TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

[Figure: congestion window versus time for a long-lived TCP connection, a sawtooth with the vertical axis marked at 8, 16 and 24 Kbytes]
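The sawtooth can be reproduced with a short trace, working in units of MSS and one step per RTT. The loss pattern is supplied by the caller and is purely hypothetical, as is the class name; the floor of 1 MSS is an assumption made so the window never reaches zero.

```java
// Sketch only: CongWin in MSS units, one array entry per RTT.
final class AimdSketch {

    /**
     * Returns CongWin at the start and after each RTT.
     * lossInRtt[i] == true means a loss event is detected during RTT i.
     */
    static double[] trace(double initialCongWin, boolean[] lossInRtt) {
        double[] trace = new double[lossInRtt.length + 1];
        double congWin = initialCongWin;
        trace[0] = congWin;
        for (int i = 0; i < lossInRtt.length; i++) {
            if (lossInRtt[i]) {
                congWin = Math.max(1.0, congWin / 2.0); // multiplicative decrease
            } else {
                congWin += 1.0;                         // additive increase: +1 MSS per RTT
            }
            trace[i + 1] = congWin;
        }
        return trace;
    }
}
```

Starting at 8 MSS with a loss in the third RTT, the trace reads 8, 9, 10, 5, 6: a slow climb followed by a sharp cut.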

TCP Slow Start

• When connection begins, CongWin = 1 MSS
    • Example: MSS = 500 bytes & RTT = 200 msec
    • initial rate = MSS/RTT = 4000 bits / 0.2 sec = 20 kbps
• available bandwidth may be >> MSS/RTT
    • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
    • double CongWin every RTT
    • done by incrementing CongWin by 1 MSS for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. Host A sends one segment in the first RTT, two segments in the second, four segments in the third.]
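Why a per-ACK increment doubles the window each RTT can be checked with a tiny loop. The sketch assumes no loss, one ACK per segment, and a whole window sent and ACKed within each RTT; the class name is hypothetical.

```java
// Sketch only: CongWin counted in segments (units of MSS), loss-free.
final class SlowStartSketch {

    /** CongWin after the given number of loss-free RTTs, starting from 1 MSS. */
    static int congWinAfter(int rtts) {
        int congWin = 1;
        for (int r = 0; r < rtts; r++) {
            int acks = congWin;      // one ACK comes back per segment sent this RTT
            for (int a = 0; a < acks; a++) {
                congWin += 1;        // +1 MSS for every ACK received
            }
        }
        return congWin;
    }
}
```

The result is 1, 2, 4, 8 segments after 0, 1, 2, 3 RTTs, matching the one, two, four segments in the timeline.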

So far:
• Slow-Start ramps up exponentially
• followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout occurring before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                    The Big Picture

                                                                                                    3 Transport Layer 112Comp 361 Spring 2005

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS. If (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2, CongWin = Threshold. Set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2, CongWin = 1 MSS. Set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
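The event/state/action table above can be read as a small state machine. The following is a minimal sketch of it, not a real TCP implementation: the class name `RenoSender`, its field names, the initial Threshold of 64 MSS, and the choice to reset the duplicate-ACK counter on every new ACK are all assumptions made for illustration. Windows are measured in units of one MSS.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: windows are counted in units of one MSS


@dataclass
class RenoSender:
    """Hypothetical sender state; the fields mirror CongWin and Threshold."""
    cong_win: float = 1.0 * MSS
    threshold: float = 64.0 * MSS  # assumed initial value, not from the slides
    state: str = "SS"              # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: the count restarts for the next segment
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self) -> None:
        """Duplicate ACK: only the counter changes until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_duplicate_ack()

    def on_triple_duplicate_ack(self) -> None:
        """Loss detected by triple duplicate ACK (multiplicative decrease)."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self) -> None:
        """Timeout: go back to slow start with a window of 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Calling `on_new_ack()` once per acknowledged segment reproduces the doubling per RTT in slow start, because a window of n segments produces n ACKs in one RTT.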


                                                                                                    TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2·RTT).
Average throughput: 0.75 W/RTT
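The 0.75 factor can be checked numerically. This is a minimal sketch under the assumption that, between losses, the window climbs by one segment per RTT from W/2 up to W (W even); the function names are hypothetical.

```python
def average_window(w_max: int) -> float:
    """Mean window over one sawtooth cycle, from w_max/2 up to w_max.

    Assumes w_max is even and the window grows by 1 segment per RTT.
    """
    start = w_max // 2
    samples = list(range(start, w_max + 1))
    return sum(samples) / len(samples)


def average_throughput(w_max: int, rtt: float) -> float:
    """Average throughput in segments per second for the same sawtooth."""
    return average_window(w_max) / rtt
```

For W = 100 the mean window is 75 segments, which is 0.75 W, so the average throughput is 0.75 W/RTT.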


                                                                                                    TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

This requires L = 2·10^{-10}. Wow!
New versions of TCP for high-speed needed.
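The two numbers in this example follow from simple arithmetic. The sketch below recomputes them; the function names are hypothetical, and the loss rate is obtained by inverting the throughput formula 1.22·MSS/(RTT·√L) for L.

```python
def required_window_segments(rate_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Segments that must be in flight to fill the pipe: rate * RTT / segment size."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(rate_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Loss rate L that yields rate_bps when throughput = 1.22 * MSS / (RTT * sqrt(L))."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


W = required_window_segments(10e9, 0.1, 1500)  # about 83,333 segments
L = required_loss_rate(10e9, 0.1, 1500)        # about 2.1e-10
```

In other words, at most about one segment in five billion may be lost to sustain 10 Gbps with this throughput formula.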


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
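The convergence toward the equal-share line can be reproduced with a few lines of simulation. This is a sketch under simplifying assumptions that are not in the slides: both sessions have the same RTT, both add one unit per round, and both see the loss in the same round, as soon as their combined rate exceeds the capacity. The function name is hypothetical.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int):
    """Simulate two AIMD sessions sharing one bottleneck link.

    Each round both rates grow by 1 (additive increase). When the sum
    exceeds the capacity, both are halved (multiplicative decrease).
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2
        else:
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: 10 versus 70 on a link of capacity 100.
x1, x2 = aimd_two_flows(10.0, 70.0, 100.0, 1000)
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the gap shrinks toward zero and both sessions end up oscillating around the same rate.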


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want rate throttled by congestion control.
Instead use UDP: pump audio/video at constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2
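The parallel-connection example is a one-line calculation if every TCP connection on the link is assumed to get an equal share of R. The helper name below is hypothetical.

```python
from fractions import Fraction


def app_share(existing: int, new: int) -> Fraction:
    """Fraction of the link rate R obtained by an app that opens `new`
    connections next to `existing` other connections, assuming each
    connection gets an equal share."""
    return Fraction(new, existing + new)


one = app_share(9, 1)      # 1/10 of R
eleven = app_share(9, 11)  # 11/20 of R, roughly R/2
```

With 11 connections the app holds 11 of the 20 connections on the link, hence slightly more than half of R.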


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
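Both cases can be folded into one function by clamping the per-window stall at zero. This is a minimal sketch: the function name is hypothetical, and it assumes K = ⌈O/(WS)⌉, the number of windows that cover the object, which this slide does not spell out.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency in seconds to fetch an O-bit object with a fixed window of W segments.

    O and S are in bits, R is in bits per second, RTT is in seconds.
    """
    # Assumption: K = ceil(O / (W * S)) windows cover the object.
    K = math.ceil(O / (W * S))
    # Stall after each window; zero in case 1, where the first ACK
    # returns before the whole window has been sent.
    stall = max(0.0, S / R + RTT - W * S / R)
    return 2 * RTT + O / R + (K - 1) * stall


# Case 2: WS/R = 0.04 s < RTT + S/R = 0.11 s, so the server stalls.
slow = fixed_window_latency(100_000, 10_000, 1e6, 0.1, 4)
# Case 1: WS/R = 0.20 s > 0.11 s, so the server never stalls.
fast = fixed_window_latency(100_000, 10_000, 1e6, 0.1, 20)
```

With these numbers the second call gives 2RTT + O/R = 0.3 s, and the first adds two stalls of 0.07 s each.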


                                                                                                    Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                                                    Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                    TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


                                                                                                    TCP Latency Modeling Slow Start (2)

[Figure: client/server timing diagram (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered; one RTT is marked.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.


                                                                                                    TCP Latency Modeling (3)

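\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
             &= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
             &= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
\]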



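TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, number of idles for infinite-size object, is similar.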
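The slow-start latency model can be evaluated directly from these definitions. The sketch below is illustrative only: the function name is hypothetical, K is found by searching for the smallest k with (2^k - 1)·S ≥ O rather than through the logarithm, and Q is found by counting the windows k for which the idle time S/R + RTT - 2^(k-1)·S/R is positive.

```python
def slow_start_latency(O: float, S: float, R: float, RTT: float):
    """Return (K, Q, P, latency) for the slow-start latency model.

    O and S are in bits, R is in bits per second, RTT is in seconds.
    """
    # K: smallest k such that windows of 1, 2, ..., 2^(k-1) segments cover O.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would idle
    # if the object were of infinite size.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# O/S = 15 segments with S/R = 1 s and RTT = 2 s (values chosen for illustration).
K, Q, P, latency = slow_start_latency(15_000, 1_000, 1_000, 2.0)
```

With these assumed values the result is K = 4, Q = 2, P = 2 and a latency of 22 s: 4 s for the two RTTs, 15 s of transmission, and idle times of 2 s and 1 s after the first and second windows.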


HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
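The three response-time formulas differ only in their RTT term, which a short comparison makes visible. The sketch below deliberately leaves out the "sum of idle times" term, so it gives a lower bound for each scheme; the function name and dictionary keys are hypothetical, and 1 Kbyte is taken as 1000 bytes.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int) -> dict:
    """Response times in seconds, excluding the slow-start idle times.

    O is the object size in bits, R the link rate in bits per second.
    """
    if M % X != 0:
        raise ValueError("the model supposes that M/X is an integer")
    transmit = (M + 1) * O / R
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# RTT = 100 msec, O = 5 Kbytes = 40,000 bits, M = 10, X = 5, on a 1 Mbps link.
times = http_response_times(40_000, 1e6, 0.1, 10, 5)
```

For these inputs the transmission time is 0.44 s in all three cases, and the RTT terms add 2.2 s, 0.3 s and 0.6 s respectively.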


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


                                                                                                    HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

Principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

Instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

                                                                                                    • Chapter 3 Transport Layer last revised 160305
                                                                                                    • Chapter 3 outline
                                                                                                    • Transport services and protocols
                                                                                                    • Transport vs network layer
                                                                                                    • Transport-layer protocols
                                                                                                    • Chapter 3 outline
                                                                                                    • Multiplexingdemultiplexing
                                                                                                    • Multiplexingdemultiplexing
                                                                                                    • How demultiplexing works
                                                                                                    • Connectionless demultiplexing
                                                                                                    • Connectionless demux (cont)
                                                                                                    • Connection-oriented demux
                                                                                                    • Connection-oriented demux (cont)
                                                                                                    • Connection-oriented demux Threaded Web Server
                                                                                                    • Chapter 3 outline
                                                                                                    • UDP User Datagram Protocol [RFC 768]
                                                                                                    • UDP more
                                                                                                    • UDP checksum
                                                                                                    • Chapter 3 outline
                                                                                                    • Principles of Reliable data transfer
                                                                                                    • Reliable data transfer getting started
                                                                                                    • Reliable data transfer getting started
                                                                                                    • Incremental Improvements
                                                                                                     • Rdt1.0 reliable transfer over a reliable channel
                                                                                                     • Rdt2.0 channel with bit errors
                                                                                                     • rdt2.0 FSM specification
                                                                                                     • rdt2.0 operation with no errors
                                                                                                     • rdt2.0 error scenario
                                                                                                     • rdt2.0 has a fatal flaw
                                                                                                     • rdt2.1 sender handles garbled ACK/NAKs
                                                                                                     • rdt2.1 receiver handles garbled ACK/NAKs
                                                                                                     • rdt2.1 discussion
                                                                                                     • rdt2.2 a NAK-free protocol
                                                                                                     • rdt2.2 sender, receiver fragments
                                                                                                     • rdt3.0 channels with errors and loss
                                                                                                     • rdt3.0 sender
                                                                                                     • rdt3.0 in action
                                                                                                     • rdt3.0 in action
                                                                                                     • Performance of rdt3.0
                                                                                                     • rdt3.0 stop-and-wait operation
                                                                                                    • Pipelined protocols
                                                                                                    • Pipelined protocols
                                                                                                    • Pipelining increased utilization
                                                                                                    • Go-Back-N
                                                                                                    • GBN Sender
                                                                                                    • GBN sender extended FSM
                                                                                                    • GBN receiver extended FSM
                                                                                                    • More on receiver
                                                                                                     • GBN in action
                                                                                                    • Selective Repeat
                                                                                                    • Selective repeat sender receiver windows
                                                                                                    • Selective repeat
                                                                                                    • Selective repeat in action
                                                                                                    • Selective repeat dilemma
                                                                                                    • Chapter 3 outline
                                                                                                    • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                    • More TCP Details
                                                                                                    • Even More TCP Details
                                                                                                    • TCP segment structure
                                                                                                     • TCP seq. #'s and ACKs
                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                    • Example RTT estimation
                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                    • Chapter 3 outline
                                                                                                    • TCP reliable data transfer
                                                                                                    • TCP sender events
                                                                                                    • TCP sender(simplified)
                                                                                                    • TCP retransmission scenarios
                                                                                                    • TCP retransmission scenarios (more)
                                                                                                    • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                    • More on Sender Policies
                                                                                                    • Fast Retransmit
                                                                                                    • Fast retransmit algorithm
                                                                                                    • TCP GBN or Selective Repeat
                                                                                                    • Chapter 3 outline
                                                                                                    • TCP Flow Control
                                                                                                    • TCP Flow Control
                                                                                                    • TCP segment structure
                                                                                                    • TCP Flow control how it works
                                                                                                    • Technical Issue
                                                                                                    • Chapter 3 outline
                                                                                                    • TCP Connection Management
                                                                                                    • TCP Connection Management (cont)
                                                                                                    • TCP Connection Management (cont)
                                                                                                    • TCP Connection Management (cont)
                                                                                                    • TCP Connection Management (cont)
                                                                                                    • A few special cases
                                                                                                    • Chapter 3 outline
                                                                                                    • Principles of Congestion Control
                                                                                                     • Causes/costs of congestion scenario 1
                                                                                                     • Causes/costs of congestion scenario 2
                                                                                                     • Causes/costs of congestion scenario 3
                                                                                                     • Causes/costs of congestion scenario 3
                                                                                                    • Approaches towards congestion control
                                                                                                    • Case study ATM ABR congestion control
                                                                                                    • Case study ATM ABR congestion control
                                                                                                    • Chapter 3 outline
                                                                                                    • TCP Congestion Control
                                                                                                    • TCP AIMD
                                                                                                    • TCP Slow Start
                                                                                                    • TCP Slow Start (more)
                                                                                                    • Summary TCP Congestion Control
                                                                                                    • The Big Picture
                                                                                                    • TCP sender congestion control
                                                                                                    • TCP throughput
                                                                                                    • TCP Futures
                                                                                                    • TCP Fairness
                                                                                                    • Why is TCP fair
                                                                                                    • Fairness (more)
                                                                                                    • TCP Latency Modeling
                                                                                                    • Fixed Congestion Window (W)
                                                                                                    • Fixed congestion window (1)
                                                                                                    • Fixed congestion window (2)
                                                                                                    • TCP Latency Modeling Slow Start (1)
                                                                                                    • TCP Latency Modeling Slow Start (2)
                                                                                                    • TCP Latency Modeling (3)
                                                                                                    • TCP Latency Modeling (4)
                                                                                                    • HTTP Modeling
                                                                                                    • Chapter 3 Summary

GBN is easy to code but might have performance problems.

In particular, if many packets are in the pipeline at one time (large bandwidth-delay product), then one error can force retransmission of huge amounts of data.

The Selective Repeat protocol allows the receiver to buffer data and forces retransmission of only the required packets.
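To make the performance concern concrete, the following minimal sketch estimates how many packets fit in the pipeline and how many would be resent after a single loss under each protocol. The link figures (1 Gbps, 30 ms RTT, 1000-byte packets) and both function names are illustrative assumptions, and the model simply assumes the lost packet is the oldest one in a full window.

```python
def packets_in_flight(rate_bps, rtt_ms, packet_bytes):
    """Approximate number of packets needed to fill the pipe
    (bandwidth-delay product divided by packet size)."""
    bdp_bits = rate_bps * rtt_ms // 1000      # bits in flight during one RTT
    return bdp_bits // (packet_bytes * 8)


def resent_after_one_loss(window, protocol):
    """Packets resent when the oldest packet of a full window is lost."""
    if protocol == "GBN":
        return window   # GBN resends the lost packet and everything after it
    if protocol == "SR":
        return 1        # SR resends only the packet whose ACK never arrived
    raise ValueError("protocol must be 'GBN' or 'SR'")


# Hypothetical link: 1 Gbps, 30 ms round-trip time, 1000-byte packets
window = packets_in_flight(1_000_000_000, 30, 1000)
gbn_cost = resent_after_one_loss(window, "GBN")
sr_cost = resent_after_one_loss(window, "SR")
```

Under these assumed numbers a single loss costs GBN a full window of retransmissions, while Selective Repeat resends one packet.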


Selective Repeat

receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
  compare to GBN, which only had a timer for the base packet
sender window
  N consecutive seq #s
  again limits seq #s of sent, unACKed pkts
  Important: window size < seq # range

Selective repeat: sender, receiver windows

[Figure: selective repeat sender and receiver windows]

Selective repeat

sender
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase, sendbase+N-1]:
    mark pkt n as received
    if n is smallest unACKed pkt, advance window base to next unACKed seq #

receiver
  pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N, rcvbase-1]:
    ACK(n) (note: this is a re-ACK)
  otherwise:
    ignore
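The receiver events above can be sketched in a few lines of Python. This is a minimal illustration, not a full protocol: the class name `SRReceiver` and its fields are hypothetical, sequence numbers are assumed to be non-negative integers that never wrap around, and packets are assumed to arrive uncorrupted.

```python
class SRReceiver:
    """Toy selective-repeat receiver (hypothetical names, no seq # wraparound)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcv_base = 0      # smallest seq # not yet received
        self.buffer = {}       # out-of-order pkts: seq # -> data
        self.delivered = []    # data handed to the upper layer, in order

    def on_packet(self, seq, data):
        """Handle one arriving pkt; return the seq # to ACK, or None to ignore."""
        if self.rcv_base <= seq <= self.rcv_base + self.N - 1:
            # Inside the window: buffer it, then deliver any in-order run.
            self.buffer[seq] = data
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1   # advance to next not-yet-received pkt
            return seq
        if self.rcv_base - self.N <= seq <= self.rcv_base - 1:
            # Already delivered: re-ACK so the sender can move its window.
            return seq
        return None                  # otherwise: ignore


rcv = SRReceiver(window_size=4)
ack_out_of_order = rcv.on_packet(1, "b")   # buffered, nothing delivered yet
ack_in_order = rcv.on_packet(0, "a")       # delivers "a", then buffered "b"
ack_duplicate = rcv.on_packet(0, "a")      # below the window: re-ACK only
ack_far_ahead = rcv.on_packet(9, "x")      # beyond the window: ignored
```

Keeping the out-of-order buffer as a dictionary keyed by sequence number makes the "deliver buffered, in-order pkts" step a simple loop on `rcv_base`.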

Selective repeat in action

[Figure: selective repeat in action]

Selective repeat: dilemma

Example:
  seq #s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
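One way to explore the question is to check, for a given sequence-number space and window size, whether a retransmitted old packet could land inside the receiver's new window. The sketch below assumes the worst case from the example: the sender transmits a full window starting at seq # 0, the receiver gets all of it and advances, and every ACK is lost so the sender retransmits the old window. The function name `is_ambiguous` is a hypothetical helper.

```python
def is_ambiguous(seq_space, window):
    """True if a retransmitted old pkt can be mistaken for a new one."""
    # Seq #s the sender may retransmit (its old, unACKed window).
    old_window = {n % seq_space for n in range(window)}
    # Seq #s the receiver now accepts as new data (its advanced window).
    new_window = {n % seq_space for n in range(window, 2 * window)}
    return bool(old_window & new_window)


slide_example = is_ambiguous(seq_space=4, window=3)   # seq #s 0..3, window 3
smaller_window = is_ambiguous(seq_space=4, window=2)  # seq #s 0..3, window 2
```

In this model the overlap disappears once the window is at most half the sequence-number space, which is the usual answer given for selective repeat.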


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
  Server responds with second special TCP segment (again no payload).
  Client responds with third special segment. This can contain payload.
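The MSS rule above ("Application Data + TCP Header = TCP Segment") can be illustrated with a small sketch that cuts an application byte stream into payloads of at most MSS bytes. The function name, the MSS of 1460 bytes, and the 20-byte header (a header without options) are assumptions chosen for illustration.

```python
TCP_HEADER_BYTES = 20  # assumed: TCP header without options


def segment_payloads(data, mss):
    """Split application data into payloads of at most `mss` bytes each."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [data[i:i + mss] for i in range(0, len(data), mss)]


payloads = segment_payloads(b"x" * 3000, mss=1460)
payload_sizes = [len(p) for p in payloads]
# Each segment on the wire carries its payload plus the TCP header.
segment_sizes = [len(p) + TCP_HEADER_BYTES for p in payloads]
```

Note that the MSS limits only the application data; the header is counted on top of it.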


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables
  (iii) a socket connection to the process

TCP exists only in the two end machines.
No buffers or variables are allocated to the connection in any of the network elements between the client and server.


TCP segment structure

[Figure: TCP segment layout, 32 bits wide, rows from top to bottom]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | urg data pointer
  options (variable length)
  application data (variable length)

Field notes:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
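As a rough illustration of the layout above, the sketch below packs and unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. The helper names `build_header` and `parse_header` are hypothetical, options are omitted (header length fixed at 5 words), and the checksum is left at 0 rather than computed.

```python
import struct

# Fixed header: ports (16 bits each), seq and ACK numbers (32 bits each),
# header length + flags (16 bits), window, checksum, urgent pointer (16 bits each).
TCP_HEADER = struct.Struct("!HHIIHHHH")   # "!" = network (big-endian) byte order

FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
         "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def build_header(src_port, dst_port, seq, ack, flags, window):
    """Pack a 20-byte header; checksum and urgent pointer are left at 0."""
    head_len_words = 5                      # 5 x 32-bit words, no options
    flag_bits = 0
    for name in flags:
        flag_bits |= FLAGS[name]
    len_and_flags = (head_len_words << 12) | flag_bits
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           len_and_flags, window, 0, 0)


def parse_header(raw):
    """Unpack the first 20 bytes of a segment into a dictionary."""
    src, dst, seq, ack, len_and_flags, window, checksum, urg = \
        TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "head_len_bytes": (len_and_flags >> 12) * 4,
        "flags": {name for name, bit in FLAGS.items() if len_and_flags & bit},
        "window": window, "checksum": checksum, "urg_ptr": urg,
    }


header = build_header(1234, 80, seq=42, ack=79, flags=["ACK", "PSH"], window=4096)
fields = parse_header(header)
```

A real stack would also compute the Internet checksum over a pseudo-header, the TCP header, and the data; that step is outside this sketch.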


TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn't say; up to implementer

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data='C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data='C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80             (host ACKs receipt of echoed 'C')
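The numbers in the telnet scenario follow from two rules: a reply's seq # is the byte the peer said it expects next, and its ACK # is the peer's seq # plus the number of data bytes received. The sketch below replays the exchange under those rules; the `reply` helper and the dictionary representation of a segment are hypothetical.

```python
def reply(received, data):
    """Build the segment sent in response to `received`."""
    return {
        "seq": received["ack"],                          # byte the peer expects next
        "ack": received["seq"] + len(received["data"]),  # next byte we expect
        "data": data,
    }


# Host A: user types 'C'
a_to_b = {"seq": 42, "ack": 79, "data": b"C"}
# Host B: ACKs receipt of 'C' and echoes it back
b_to_a = reply(a_to_b, b"C")
# Host A: ACKs receipt of the echoed 'C', sends no data
a_ack = reply(b_to_a, b"")
```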


TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT


TCP Round Trip Time and Timeout

  EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
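A short sketch can show both the update rule and why the influence of older samples fades. Unrolling the recurrence gives the sample taken i updates ago a weight of α(1 - α)^i. The function names and the RTT values (in milliseconds) are illustrative assumptions.

```python
ALPHA = 0.125  # typical value from the slide


def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=ALPHA):
    """One step of the exponential weighted moving average."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def sample_weights(count, alpha=ALPHA):
    """Weight of the sample taken i updates ago, for i = 0 .. count-1."""
    return [alpha * (1 - alpha) ** i for i in range(count)]


# Hypothetical values: current estimate 100 ms, new sample 180 ms
new_estimate = update_estimated_rtt(100.0, 180.0)
weights = sample_weights(3)   # newest sample first
```

With α = 0.125 a single outlier moves the estimate by only one eighth of its distance from the current value.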


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), roughly 100 to 350; series: SampleRTT and EstimatedRTT]


TCP Round Trip Time and Timeout

Setting the timeout: EstimatedRTT plus a "safety margin"
  large variation in EstimatedRTT -> larger safety margin

First, estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

  (typically, β = 0.25)

Then set the timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
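
The two formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name RttEstimator is hypothetical, the EstimatedRTT weight alpha = 0.125 is the commonly quoted value and does not appear on this slide, and the initial deviation (half of the first sample) and the update order (deviation first) follow RFC 6298 rather than anything stated here.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval (all in ms)."""

    def __init__(self, first_sample_ms, alpha=0.125, beta=0.25):
        self.alpha = alpha  # assumed weight for EstimatedRTT (not on this slide)
        self.beta = beta    # weight for DevRTT, typically 0.25
        self.estimated_rtt = first_sample_ms
        # Assumption: start the deviation at half the first sample (RFC 6298 style).
        self.dev_rtt = first_sample_ms / 2

    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt

    def update(self, sample_rtt_ms):
        # Update the deviation first, measured against the previous EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt_ms - self.estimated_rtt))
        # Then fold the new sample into the running average.
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt_ms)
        return self.timeout_interval()
```

With a first sample of 100 ms, a following sample of 120 ms gives DevRTT = 42.5 ms, EstimatedRTT = 102.5 ms and TimeoutInterval = 272.5 ms.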

TCP reliable data transfer

TCP creates an rdt service on top of IP's unreliable service
Pipelined segments
Cumulative ACKs
TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate ACKs

Initially consider a simplified TCP sender:
  ignore duplicate ACKs
  ignore flow control, congestion control

TCP sender events

Data rcvd from app:
  Create segment with seq #
  seq # is the byte-stream number of the first data byte in the segment
  Start timer if not already running (think of the timer as for the oldest unACKed segment)
  Expiration interval: TimeOutInterval

Timeout:
  Retransmit the segment that caused the timeout
  Restart timer

ACK rcvd:
  If it acknowledges previously unACKed segments:
    update what is known to be ACKed
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }

} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
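
The three events of the simplified sender can be tried out with a small Python model. It is a sketch, not a TCP implementation: the class and method names are hypothetical, the timer is just a flag rather than a real clock, "passing a segment to IP" is modelled by appending to a list, and sequence-number wraparound is ignored.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified sender: one timer, cumulative ACKs."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.timer_running = False
        self.unacked = {}  # seq # of first byte -> payload, not yet ACKed
        self.sent = []     # log of (seq #, payload) handed to "IP"

    def on_app_data(self, data):
        # event: data received from application above
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        # event: timer timeout
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)  # not-yet-ACKed segment with smallest seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True  # restart the timer

    def on_ack(self, y):
        # event: ACK received, with ACK field value of y
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte lies below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Keep the timer only while segments are still outstanding.
            self.timer_running = bool(self.unacked)
```

Feeding it 8 bytes at Seq=92 and 20 bytes at Seq=100 and then a single ACK=120 moves SendBase straight to 120, which is the cumulative-ACK behaviour described on the slides.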

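TCP retransmission scenarios

[Figure: two Host A / Host B timelines]

Lost ACK scenario: Host A sends Seq=92 (8 bytes of data). Host B replies with ACK=100, which is lost (X). The timer for Seq=92 expires, Host A retransmits Seq=92 (8 bytes of data), and Host B replies with ACK=100 again. SendBase = 100.

Premature timeout: Host A sends Seq=92 (8 bytes of data) and then Seq=100 (20 bytes of data). Host B replies with ACK=100 and ACK=120. The timer for Seq=92 expires before the ACKs arrive, so Host A retransmits Seq=92 (8 bytes of data), and Host B answers with ACK=120. SendBase goes from 100 to 120 and stays at 120.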

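TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline]

Cumulative ACK scenario: Host A sends Seq=92 (8 bytes of data) and then Seq=100 (20 bytes of data). Host B's ACK=100 is lost (X), but ACK=120 reaches Host A before the timeout, so nothing is retransmitted. SendBase = 120.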

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment; if no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
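
The four event/action pairs above amount to a small decision procedure, sketched here in Python. The function name, its parameters and the AckAction enum are hypothetical. The branch for a segment below the expected seq # is not covered by the slide; re-ACKing the next expected byte is an assumption made only to keep the function total.

```python
from enum import Enum


class AckAction(Enum):
    DELAYED_ACK = "wait up to 500 ms, then send ACK"
    CUMULATIVE_ACK = "immediately send single cumulative ACK"
    DUPLICATE_ACK = "immediately send duplicate ACK for next expected byte"
    IMMEDIATE_ACK = "immediately send ACK"


def receiver_ack_action(seg_seq, expected_seq, ack_pending, gap_exists):
    """Pick the receiver's ACK policy for one arriving segment.

    seg_seq      -- seq # of the first byte of the arriving segment
    expected_seq -- seq # of the next in-order byte the receiver expects
    ack_pending  -- True if an earlier in-order segment is still un-ACKed
    gap_exists   -- True if out-of-order data is buffered above expected_seq
    """
    if seg_seq > expected_seq:
        # Out-of-order segment: a gap is detected (or still open).
        return AckAction.DUPLICATE_ACK
    if seg_seq < expected_seq:
        # Assumption, not on the slide: old data, so repeat the current ACK.
        return AckAction.DUPLICATE_ACK
    if gap_exists:
        # Segment starts at the lower end of the gap.
        return AckAction.IMMEDIATE_ACK
    if ack_pending:
        return AckAction.CUMULATIVE_ACK
    return AckAction.DELAYED_ACK
```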

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
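
A minimal Python sketch of the doubling rule, assuming the interval is otherwise computed as EstimatedRTT + 4 * DevRTT. The class and method names are hypothetical, and any upper bound that a real implementation places on the interval is left out.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.base_interval()

    def base_interval(self):
        # Interval derived from the RTT estimates, with no backoff applied.
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        # After the retransmission, double the interval.
        self.interval *= 2
        return self.interval

    def on_new_data_or_ack(self):
        # Timer restarted for new data or a received ACK: drop the backoff.
        self.interval = self.base_interval()
        return self.interval
```

With EstimatedRTT = 100 ms and DevRTT = 25 ms, the interval starts at 200 ms, becomes 400 ms and then 800 ms after two consecutive timeouts, and returns to 200 ms once an ACK arrives.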

Fast Retransmit

Time-out period is often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {    /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      resend segment with sequence number y    /* fast retransmit */
    }
  }
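
The ACK handler above can be exercised with a short Python sketch. The names are hypothetical, the retransmission is modelled by recording the sequence number in a list, and the timer handling from the first branch is omitted to keep the focus on the duplicate-ACK count.

```python
from collections import Counter


class FastRetransmitSender:
    """Counts duplicate ACKs and triggers a fast retransmit on the third."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()  # ACK value y -> duplicate ACKs seen for y
        self.retransmitted = []    # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data is ACKed: advance SendBase and forget old counts.
            self.send_base = y
            self.dup_acks.clear()
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == self.DUP_ACK_THRESHOLD:
                # Fast retransmit: resend the segment with sequence number y.
                self.retransmitted.append(y)
```

Because the comparison is an equality, a fourth or fifth duplicate ACK for the same y does not trigger another retransmission.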

TCP: GBN or Selective Repeat?

                                                                                                      Basic TCP looks a lot like GBN

                                                                                                      Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                      This looks a lot like Selective Repeat

                                                                                                      TCP is a hybrid

TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

Flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

Receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer
Speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, rows from top to bottom]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flag bits U A P R S F, Receive window
  checksum, Urg data pointer
  Options (variable length)
  application data (variable length)

Field notes:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

Spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
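
As a worked example of the window arithmetic, here is a small Python sketch. The function names are hypothetical, and the sender-side variables last_byte_sent and last_byte_acked are assumptions introduced here to express "unACKed data"; they do not appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, n_bytes, window):
    """True if sending n_bytes more keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_bytes <= window
```

For a 4096-byte buffer with LastByteRcvd = 3000 and LastByteRead = 1000, the advertised RcvWindow is 2096 bytes. A sender with 1000 unACKed bytes may then send another 1000 bytes, but not another 1200.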

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.

                                                                                                      Note on UDP

                                                                                                      UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.

TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
Initialize TCP variables:
  seq #s
  buffers, flow control info (e.g. RcvWindow)
Client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
Server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

                                                                                                      3 Transport Layer 87Comp 361 Spring 2005

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that it is synchronized

[Figure: three-way handshake between client and server]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
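
The sequence and acknowledgement numbers exchanged in the handshake can be traced with a short Python sketch. The Segment type and the function name are hypothetical, only the fields shown on the slide are modelled, and 32-bit wraparound of sequence numbers is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    syn: int                   # SYN flag bit
    seq: int                   # sequence number
    ack: Optional[int] = None  # acknowledgement number, if any


def three_way_handshake(client_isn, server_isn):
    """Return the three control segments in the order they are sent."""
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server replies with SYNACK: its own ISN, ACKing client_isn + 1.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client ACKs with SYN=0 and server_isn + 1.
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, ack]
```

For client_isn = 1000 and server_isn = 5000, the SYNACK carries ack = 1001 and the final ACK carries seq = 1001, ack = 5001.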

                                                                                                      3 Transport Layer 88Comp 361 Spring 2005

TCP Connection Management (cont)

Closing a connection:

Client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection teardown timeline. Client sends FIN; server replies with ACK, then sends its own FIN; client replies with ACK and enters timed wait before the connection is closed.]

                                                                                                      3 Transport Layer 89Comp 361 Spring 2005

                                                                                                      TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.

Enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost)

Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.
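The four steps can be walked through as a small trace. This is only a sketch: `tcp_close_trace` is a hypothetical helper, and the state labels are simplified names taken from the slide wording rather than the full TCP state machine.

```python
# Simplified trace of the four-step close. State names are illustrative
# labels from the slide wording, not the full TCP state machine, and
# tcp_close_trace is a hypothetical helper (not a socket API).

def tcp_close_trace():
    """Return the segments exchanged and the final (client, server) states."""
    client, server = "established", "established"
    segments = []

    # Step 1: client closes its socket and sends FIN.
    segments.append(("client", "FIN"))
    client = "closing"

    # Step 2: server ACKs the FIN, closes, and sends its own FIN.
    segments.append(("server", "ACK"))
    segments.append(("server", "FIN"))
    server = "closing"

    # Step 3: client ACKs the server's FIN and enters timed wait, where it
    # would re-ACK any FIN that the server retransmits.
    segments.append(("client", "ACK"))
    client = "timed wait"

    # Step 4: server receives the ACK and is closed.
    server = "closed"

    # The client closes only after the timed wait expires.
    client = "closed"
    return segments, (client, server)
```

The trace makes the asymmetry visible: the side that closes first is the one that has to sit in timed wait.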

[Figure: client/server timeline of the FIN, ACK, FIN, ACK exchange. Both sides pass through closing; the client then enters timed wait; both end closed.]

                                                                                                      TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                                                      A few special cases

                                                                                                      Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                      It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

                                                                                                      Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

                                                                                                      Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem

Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

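(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ is the offered load including retransmissions, and $\lambda_{out}$ is the delivered rate.

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, labelled (a), (b) and (c), one per case above]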

Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next slide

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
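The ER and EFCI rules can be sketched in a few lines. The `RMCell` class, the function names and the rate units below are assumptions made for illustration; they are not part of the ATM specification's API.

```python
from dataclasses import dataclass

@dataclass
class RMCell:
    """Hypothetical model of a resource-management cell."""
    er: float          # explicit rate field (units are arbitrary here)
    ni: bool = False   # no-increase bit: mild congestion
    ci: bool = False   # congestion-indication bit: severe congestion

def through_switches(cell, supportable_rates):
    """Each switch on the path may lower ER to the rate it can support."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell

def at_destination(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell
```

Because every switch applies `min`, the ER value that returns to the sender is the minimum supportable rate along the path, regardless of the order of the switches.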

                                                                                                      Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control
end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window CongWin spanning a run of segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec
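As a quick numeric check of the throughput formula, here is a minimal sketch. The function name and the sample values are made up for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent each RTT."""
    return w_segments * mss_bytes / rtt_seconds

# Hypothetical values: 10 segments of 1000 bytes every 0.5 s.
rate = window_throughput(10, 1000, 0.5)   # 20000.0 bytes/sec
```

Doubling the window or halving the RTT doubles the rate, which is why the rest of the section treats CongWin as the control knob.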

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
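The sender-side limit LastByteSent - LastByteAcked ≤ CongWin can be written as a one-line check. This is a sketch only: `can_send` is a hypothetical helper, and the receive window is ignored, following the slide's assumption that RcvBuffer never overflows.

```python
def can_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """True if sending nbytes more keeps unacked data within CongWin.

    Ignores the receive window, per the assumption that RcvBuffer is large.
    """
    return (last_byte_sent + nbytes) - last_byte_acked <= cong_win
```

For example, with 2000 bytes already unacknowledged and CongWin of 4000 bytes, a further 1000 bytes may be sent but a further 2500 may not.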

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
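The sawtooth can be reproduced with a short simulation. This is a sketch under simplifying assumptions: time advances in whole RTT rounds, loss is injected at chosen rounds through the hypothetical `loss_rounds` parameter, and the window is floored at 1 MSS.

```python
def aimd(cong_win, mss, loss_rounds, rounds):
    """Return CongWin (bytes) at the start of each RTT round."""
    history = []
    for r in range(rounds):
        history.append(cong_win)
        if r in loss_rounds:
            # Multiplicative decrease: halve, but never below 1 MSS.
            cong_win = max(mss, cong_win // 2)
        else:
            # Additive increase: 1 MSS per RTT.
            cong_win += mss
    return history
```

With `aimd(8000, 1000, {3}, 6)` the window climbs 8000, 9000, 10000, 11000, drops to 5500 after the loss in round 3, then resumes climbing to 6500.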

                                                                                                      TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                                                      TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A and Host B timeline. Host A sends one segment, then two segments, then four segments in successive RTTs.]
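A minimal sketch of why a per-ACK increment doubles the window each RTT. It assumes one ACK per segment and no loss; `slow_start` is a hypothetical helper name.

```python
def slow_start(mss, rounds):
    """Return CongWin (bytes) at the start of each RTT, with no loss."""
    cong_win = mss
    per_rtt = []
    for _ in range(rounds):
        per_rtt.append(cong_win)
        acks = cong_win // mss      # assumed: one ACK per segment sent
        for _ in range(acks):
            cong_win += mss         # increment CongWin for every ACK
    return per_rtt

# Initial rate for MSS = 500 bytes and RTT = 200 msec, in bits per second.
initial_rate_bps = 500 * 8 / 0.2    # about 20 kbps
```

`slow_start(500, 4)` gives `[500, 1000, 2000, 4000]`, that is, one, two, four, then eight segments per RTT.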

So far:
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
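The Tahoe/Reno difference fits in one small function. This is a sketch: the `variant` and `event` strings are made-up labels for illustration, and windows are in bytes.

```python
def on_loss(variant, event, cong_win, mss):
    """Return (threshold, cong_win) after a loss event.

    variant: "reno" or "tahoe"; event: "3dupack" or "timeout"
    (both are illustrative labels, not protocol fields).
    """
    threshold = cong_win / 2
    if variant == "reno" and event == "3dupack":
        # Fast recovery: skip slow start, continue from the halved window.
        return threshold, threshold
    # Timeout in either variant, or any loss in Tahoe: back to 1 MSS.
    return threshold, mss
```

Starting from CongWin = 8000 bytes with MSS = 1000 bytes, Reno resumes at 4000 bytes after 3 dup ACKs, whereas Tahoe restarts at 1000 bytes for the same event.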

                                                                                                      Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS. (TCP Tahoe does this for 3 dup ACKs as well.)

                                                                                                      The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
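The table translates almost directly into an event-driven class. The sketch below is not a real TCP implementation: `RenoSender` and its method names are hypothetical, and two details are assumptions the table does not state (the duplicate-ACK counter resets on a new ACK, and the triple-duplicate rule fires when the counter reaches 3).

```python
class RenoSender:
    """Toy model of the sender table; windows are in bytes."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss          # connection starts in slow start
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        self.dup_acks = 0            # assumption: counter resets here
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Congestion avoidance: about 1 MSS per RTT in total.
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        self.dup_acks += 1           # CongWin and Threshold unchanged
        if self.dup_acks == 3:       # assumption: third duplicate triggers
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.mss, self.threshold)
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

Driving it with a few new ACKs, then three duplicates, then a timeout walks through every row of the table in order.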

                                                                                                      TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2 RTT)
  Average throughput: 0.75 W/RTT
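The 0.75 factor follows from one extra assumption: between losses the window grows linearly (additive increase) from W/2 back up to W, so its time average is the midpoint of the two extremes.

```latex
\[
\overline{\text{throughput}}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
\]
```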

                                                                                                      TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

$L = 2 \cdot 10^{-10}$ Wow
New versions of TCP for high-speed needed
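Both numbers on this slide can be checked with a few lines of arithmetic. The helper names below are made up; the second function simply inverts the throughput formula 1.22 · MSS / (RTT · sqrt(L)) for L.

```python
def required_window(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (mss_bytes * 8)

def required_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Loss rate L at which 1.22 * MSS / (RTT * sqrt(L)) equals rate_bps."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2

w = required_window(10e9, 0.1, 1500)       # about 83,333 segments
loss = required_loss_rate(10e9, 0.1, 1500) # about 2.1e-10
```

A loss rate near 2 in 10 billion segments is far below what real paths deliver, which is the slide's argument for high-speed TCP variants.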

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis running to R, with the "equal bandwidth share" line. The path alternates "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
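The convergence argument can be tested numerically. The sketch below makes idealized assumptions not stated on the slide: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units.

```python
def two_flow_aimd(x1, x2, capacity, rounds, step=1.0):
    """Simulate two AIMD flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves their difference.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow equally (slope 1),
            # leaving their difference unchanged.
            x1, x2 = x1 + step, x2 + step
    return x1, x2
```

Since only the halving step changes the gap between the flows, and it always shrinks it, very unequal starting rates such as 10 and 80 on a link of capacity 100 end up nearly equal after a few hundred rounds.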

Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections:
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets R/2
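The parallel-connection example is plain arithmetic once each connection is assumed to get an equal share of R. `app_share` is a hypothetical helper that returns the new app's fraction of the link.

```python
from fractions import Fraction

def app_share(existing_conns, new_conns):
    """Fraction of link rate R obtained by an app opening new_conns
    connections, assuming every connection gets an equal share."""
    return Fraction(new_conns, existing_conns + new_conns)

one = app_share(9, 1)      # Fraction(1, 10), i.e. R/10
eleven = app_share(9, 11)  # Fraction(11, 20), a little over R/2
```

The exact share with 11 connections is 11R/20, which the slide rounds to R/2.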

TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
where K is the number of windows that cover the object, K = O/(WS) rounded up


Fixed congestion window (1)

First case: WS/R > RTT + S/R
ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R
server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
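The two fixed-window cases can be combined into one small calculation. This is only a sketch under the slide's assumptions (one link, no loss); the function name and the sample numbers are made up, and K is taken to be O/(WS) rounded up.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds to fetch an O-bit object with a fixed window of W segments.

    O and S are in bits, R in bits/second, RTT in seconds.
    """
    K = math.ceil(O / (W * S))        # windows that cover the object (assumed definition)
    stall = S / R + RTT - W * S / R   # idle time after each window, if positive
    latency = 2 * RTT + O / R         # connection setup and request, plus transmission
    if stall > 0:                     # second case: server waits for ACKs
        latency += (K - 1) * stall
    return latency

# Illustrative numbers: S/R = 1 s, RTT = 2 s, object of 8 segments
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))  # first case: 12.0
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))  # second case: 15.0
```

With W = 4 the window is large enough that the server never stalls; with W = 2 it stalls after each of the first K-1 = 3 windows.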


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

• where Q is the number of times the server idles if the object were of infinite size
• and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. Client initiates TCP connection (one RTT), then requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; complete transmission, object delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
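The example values can be reproduced with a short calculation. This is a sketch, not a definitive model: the function name and the concrete S, R and RTT are made up (chosen so that O/S = 15 and Q = 2), and Q is found here by simply counting the windows after which the idle time S/R + RTT - 2^(k-1) S/R is still positive.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for an O-bit object under loss-free slow start."""
    # K: smallest k with S + 2S + ... + 2^(k-1) S >= O
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows followed by a positive idle time (infinite object)
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# O/S = 15 segments, S/R = 1 s, RTT = 2 s (illustrative values)
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2))  # (4, 2, 2, 22.0)
```

Of the 22 seconds, 4 come from the two RTTs, 15 from transmitting the object, and 3 from the two idle periods.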


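TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server timeline. Client initiates TCP connection (one RTT), then requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; complete transmission, object delivered. Axes: time at client, time at server.]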
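The sum in this derivation can be checked numerically by walking through the windows one at a time and adding up the idle periods. The following is a sketch with made-up function names and sample values, assuming loss-free slow start with windows of 1, 2, 4, ... segments.

```python
def idle_times(O, S, R, RTT):
    """Idle time after every window except the last one."""
    idle = []
    sent, k = 0, 1
    while True:
        window_bits = 2 ** (k - 1) * S
        sent += window_bits
        if sent >= O:
            break  # last window: nothing left to wait for
        # [S/R + RTT - 2^(k-1) S/R]^+ : zero once the window outlasts the ACK delay
        idle.append(max(0.0, S / R + RTT - window_bits / R))
        k += 1
    return idle

def latency_from_sum(O, S, R, RTT):
    """delay = O/R + 2 RTT + sum of idle times."""
    return 2 * RTT + O / R + sum(idle_times(O, S, R, RTT))

print(idle_times(15000, 1000, 1000, 2))        # [2.0, 1.0, 0.0]
print(latency_from_sum(15000, 1000, 1000, 2))  # 22.0
```

Only the first P = 2 windows contribute a non-zero idle time here, and the total agrees with the closed form 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R = 4 + 15 + 6 - 3 = 22.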


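TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar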
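As a quick sanity check on this derivation, the definition of K (smallest k whose windows cover the object) can be compared against the closed form. The function names are made up for illustration, and the closed form uses floating-point `math.log2`, so it is only a sketch for moderate object sizes.

```python
import math

def windows_by_search(O, S):
    """Smallest k with 2^0 S + 2^1 S + ... + 2^(k-1) S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k

def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))

for segments in (1, 15, 16, 40):
    O, S = segments * 1000, 1000
    print(segments, windows_by_search(O, S), windows_closed_form(O, S))
```

For O/S = 15 both give K = 4; one extra segment (O/S = 16) pushes the object into a fifth window.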


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
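The three response-time formulas can be compared side by side with a small sketch. The function name is made up, the slow-start idle times are deliberately left out (so the results are lower bounds, not full response times), and the sample values assume 1 Kbyte = 1000 bytes.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, EXCLUDING the 'sum of idle times' term.

    O in bits, R in bits/second, RTT in seconds; M images, X parallel connections.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push the base page and M images onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }

# RTT = 100 msec, O = 5 Kbytes (taken as 40000 bits), M = 10, X = 5, R = 1 Mbps
for name, t in http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5).items():
    print(f"{name}: {t:.2f} s")
```

With these numbers the transmission term is the same 0.44 s for all three schemes, so the differences come entirely from how many round trips each scheme spends on connection setup and requests.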


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3 Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                      • TCP sender congestion control
                                                                                                      • TCP throughput
                                                                                                      • TCP Futures
                                                                                                      • TCP Fairness
                                                                                                      • Why is TCP fair
                                                                                                      • Fairness (more)
                                                                                                      • TCP Latency Modeling
                                                                                                      • Fixed Congestion Window (W)
                                                                                                      • Fixed congestion window (1)
                                                                                                      • Fixed congestion window (2)
                                                                                                      • TCP Latency Modeling Slow Start (1)
                                                                                                      • TCP Latency Modeling Slow Start (2)
                                                                                                      • TCP Latency Modeling (3)
                                                                                                      • TCP Latency Modeling (4)
                                                                                                      • HTTP Modeling
                                                                                                      • Chapter 3 Summary

Selective Repeat

• receiver individually acknowledges all correctly received pkts
  • buffers pkts, as needed, for eventual in-order delivery to upper layer
• sender only resends pkts for which ACK not received
  • sender timer for each unACKed pkt (compare to GBN, which only had a timer for the base packet)
• sender window
  • N consecutive seq #'s
  • again limits seq #'s of sent, unACKed pkts
  • Important: window size < seq # range

Selective repeat: sender, receiver windows

Selective repeat

sender
• data from above:
  • if next available seq # in window, send pkt
• timeout(n):
  • resend pkt n, restart timer
• ACK(n) in [sendbase, sendbase+N]:
  • mark pkt n as received
  • if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
• pkt n in [rcvbase, rcvbase+N-1]:
  • send ACK(n)
  • out-of-order: buffer
  • in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
• pkt n in [rcvbase-N, rcvbase-1]:
  • ACK(n) (note: this is a reACK)
• otherwise:
  • ignore
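The receiver rules above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the course: the class name, the `delivered` list standing in for the upper layer, and the use of unbounded sequence numbers (no wraparound) are all assumptions made to keep the example short.

```python
class SelectiveRepeatReceiver:
    """Sketch of the Selective Repeat receiver events (hypothetical names)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcv_base = 0      # smallest seq # not yet received
        self.buffer = {}       # seq # -> payload for buffered pkts
        self.delivered = []    # stands in for delivery to the upper layer

    def on_packet(self, n, payload):
        """Return the ACK number to send, or None if the pkt is ignored."""
        if self.rcv_base <= n <= self.rcv_base + self.N - 1:
            # pkt inside the window: buffer it, then deliver any in-order run
            self.buffer[n] = payload
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1   # advance window to next missing pkt
            return n
        if self.rcv_base - self.N <= n <= self.rcv_base - 1:
            # already delivered: the earlier ACK may have been lost, so re-ACK
            return n
        return None                  # otherwise: ignore


rx = SelectiveRepeatReceiver(window_size=4)
# pkt 2 arrives before pkt 1, pkt 1 is then duplicated, pkt 9 is out of range
acks = [rx.on_packet(n, f"pkt{n}") for n in (0, 2, 1, 1, 9)]
```

In this run pkt 2 is buffered until pkt 1 arrives, the duplicate pkt 1 is re-ACKed without being delivered twice, and pkt 9 is ignored.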


                                                                                                        Selective repeat in action

Selective repeat: dilemma

Example:
• seq #'s: 0, 1, 2, 3
• window size = 3

• receiver sees no difference in two scenarios!
• incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
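One way to explore the question is to check when a retransmitted old window can be mistaken for the next window. The sketch below rests on an assumption not stated on the slide: the worst case is that every pkt in the window arrived but every ACK was lost, so the receiver has advanced by a full window while the sender still retransmits the old one. The function name is hypothetical.

```python
def windows_overlap(seq_space, window):
    """True if old and new windows share a seq # (ambiguity possible)."""
    # seq #'s the sender may still retransmit (all ACKs lost)
    old = {i % seq_space for i in range(0, window)}
    # seq #'s the receiver now accepts as new data
    new = {i % seq_space for i in range(window, 2 * window)}
    return bool(old & new)


slide_example = windows_overlap(seq_space=4, window=3)   # the slide's case
smaller_window = windows_overlap(seq_space=4, window=2)
```

With 4 sequence numbers, a window of 3 overlaps and a window of 2 does not. Under this worst-case assumption, the windows overlap exactly when the window size exceeds half the sequence number space.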


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

• point-to-point:
  • one sender, one receiver
• reliable, in-order byte stream:
  • no "message boundaries"
• pipelined:
  • TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented:
  • handshaking (exchange of control msgs) init's sender, receiver state before data exchange
• flow controlled:
  • sender will not overwhelm receiver

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the application reads data through the socket door]

More TCP Details

Maximum Segment Size (MSS)
• depends upon implementation (can often be set)
• the max amount of application-layer data in a segment
  • Application Data + TCP Header = TCP Segment

Three-way handshake
• client sends a special TCP segment to the server requesting a connection
  • no payload (application data) in this segment
• server responds with a second special TCP segment (again, no payload)
• client responds with a third special segment
  • this one can contain payload

Even More TCP Details

• A TCP connection between client and server creates, in both client and server:
  (i) buffers,
  (ii) variables, and
  (iii) a socket connection to the process
• TCP exists only in the two end machines
  • no buffers or variables are allocated to the connection in any of the network elements between client and server

TCP segment structure

Segment layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | urg data pointer
  options (variable length)
  application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
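The layout above maps directly onto a fixed 20-byte header followed by options and data. The following is a minimal sketch of reading those fields with Python's standard `struct` module; the function name, the dictionary keys, and the sample port numbers are assumptions chosen for illustration, and the checksum is left unverified.

```python
import struct

FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields and application data."""
    (src, dst, seq, ack, off_byte, flag_byte,
     window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (off_byte >> 4) * 4      # head len is counted in 32-bit words
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],     # everything after header and options
    }

# Hand-built segment: no options (head len = 5 words), ACK and PSH set,
# one byte of application data.
raw = struct.pack("!HHIIBBHHH", 12345, 80, 42, 79, 5 << 4, 0x18, 4096, 0, 0) + b"C"
fields = parse_tcp_header(raw)
```

Because head len is counted in 32-bit words, a header without options reports 5, which is 20 bytes.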

TCP seq. #'s and ACKs

Seq. #'s:
• byte stream "number" of first byte in segment's data

ACKs:
• seq # of next byte expected from other side
• cumulative ACK

Q: how does the receiver handle out-of-order segments?
• A: TCP spec doesn't say; up to implementer

Simple telnet scenario (in time order):
1. User types 'C'. Host A → Host B: Seq=42, ACK=79, data = 'C'
2. Host B ACKs receipt of 'C' and echoes back 'C'. Host B → Host A: Seq=79, ACK=43, data = 'C'
3. Host A ACKs receipt of the echoed 'C'. Host A → Host B: Seq=43, ACK=80
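The numbers in the telnet scenario follow from two counters per host: the seq # of the next byte to send and the seq # of the next byte expected. A small sketch, assuming in-order delivery and no loss (the `Endpoint` class and its method names are hypothetical), reproduces the three segments:

```python
class Endpoint:
    """Tracks the two counters that produce the Seq and ACK fields."""

    def __init__(self, next_seq, next_expected):
        self.next_seq = next_seq            # seq # of the next byte we send
        self.next_expected = next_expected  # next byte expected from the peer

    def send(self, data=b""):
        segment = (self.next_seq, self.next_expected, data)  # (Seq, ACK, data)
        self.next_seq += len(data)          # seq #'s count bytes, not segments
        return segment

    def receive(self, segment):
        seq, _ack, data = segment
        if seq == self.next_expected:       # in-order data only in this sketch
            self.next_expected += len(data)


host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

seg1 = host_a.send(b"C")    # user types 'C'
host_b.receive(seg1)
seg2 = host_b.send(b"C")    # B ACKs 'C' and echoes it back
host_a.receive(seg2)
seg3 = host_a.send()        # A ACKs the echo, no data
```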

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
• longer than RTT
  • but RTT varies
• too short: premature timeout
  • unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions
• SampleRTT will vary, want estimated RTT "smoother"
  • average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

• exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125
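Unrolling the recurrence makes the exponential decay explicit. The subscripted notation is introduced here for illustration only: E_n is EstimatedRTT after the n-th sample, S_n is the n-th SampleRTT, and E_0 is the initial estimate.

```latex
\begin{aligned}
E_n &= (1-\alpha)\,E_{n-1} + \alpha\,S_n \\
    &= \alpha\,S_n + \alpha(1-\alpha)\,S_{n-1} + (1-\alpha)^2\,E_{n-2} \\
    &= \alpha \sum_{k=0}^{n-1} (1-\alpha)^k\,S_{n-k} + (1-\alpha)^n\,E_0
\end{aligned}
```

With α = 0.125, the sample taken k measurements ago carries weight 0.125 × 0.875^k, so each older sample counts 12.5% less than the one after it.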

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), ticks from 100 to 350; curves: SampleRTT and EstimatedRTT]

TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
  • large variation in EstimatedRTT -> larger safety margin
• first, estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

  (typically, β = 0.25)

• then set the timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
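The three formulas can be combined into a small estimator. This is a sketch under stated assumptions: the class and method names are hypothetical, the initial values are supplied by the caller, and DevRTT is updated before EstimatedRTT (using the previous estimate), an ordering the slides do not specify.

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout, following the slide formulas."""

    def __init__(self, estimated_rtt, dev_rtt, alpha=0.125, beta=0.25):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.alpha = alpha
        self.beta = beta

    def update(self, sample_rtt):
        """Fold in one SampleRTT (not from a retransmission); return timeout."""
        # Assumption: deviation is measured against the previous estimate.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(estimated_rtt=100.0, dev_rtt=10.0)   # milliseconds
timeout = est.update(120.0)
```

Starting from EstimatedRTT = 100 ms and DevRTT = 10 ms, a 120 ms sample gives DevRTT = 12.5 ms, EstimatedRTT = 102.5 ms, and TimeoutInterval = 152.5 ms.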


TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• pipelined segments
• cumulative acks
• TCP uses a single retransmission timer
• retransmissions are triggered by:
  • timeout events
  • duplicate acks
• initially consider a simplified TCP sender:
  • ignore duplicate acks
  • ignore flow control, congestion control

TCP sender events

data rcvd from app:
• create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

ACK rcvd:
• if it acknowledges previously unacked segments:
  • update what is known to be acked
  • start timer if there are outstanding segments

TCP sender (simplified)

  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum

  loop (forever) {
    switch(event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
  } /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte

Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
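The pseudocode can be turned into a runnable sketch by replacing the event loop with one method per event. Several things here are assumptions for illustration: the class and attribute names, the `sent` list that stands in for passing segments to IP, a boolean in place of a real timer, and the omission of sequence number wraparound.

```python
class SimplifiedTcpSender:
    """Event handlers for the simplified sender (no dup ACKs, no flow control)."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # -> data, in sending order
        self.timer_running = False   # stands in for a real retransmission timer
        self.sent = []               # log of (seq #, data) passed to "IP"

    def on_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        seq = next(iter(self.unacked))           # smallest unacked seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True                # restart timer

    def on_ack(self, y):
        if y > self.send_base:                   # ACK covers new data
            self.send_base = y
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]            # cumulatively acknowledged
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_data(b"a" * 8)     # Seq=92, 8 bytes data
sender.on_data(b"b" * 20)    # Seq=100, 20 bytes data
sender.on_timeout()          # retransmits only the oldest segment, Seq=92
sender.on_ack(120)           # cumulative ACK covers both segments
```

Only the segment with the smallest sequence number is retransmitted on timeout, and a single cumulative ACK of 120 clears both outstanding segments.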


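TCP retransmission scenarios

[Figure: two timing diagrams between Host A and Host B, summarized below]

Lost ACK scenario:
1. A sends Seq=92, 8 bytes data
2. B replies ACK=100; the ACK is lost (X)
3. A's timer times out; A resends Seq=92, 8 bytes data
4. B replies ACK=100 again; SendBase = 100

Premature timeout:
1. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
2. B replies ACK=100, then ACK=120
3. the Seq=92 timer times out before the ACKs arrive; A resends Seq=92, 8 bytes data
4. ACK=100 arrives: SendBase = 100; ACK=120 arrives: SendBase = 120
5. B replies ACK=120 to the retransmission; SendBase = 120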

TCP retransmission scenarios (more)

[Figure: cumulative ACK scenario. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is resent; SendBase = 120.]

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK; wait up to 500 ms for next segment; if no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
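The four receiver cases can be read as one decision function. The following is a minimal sketch, not an actual TCP implementation: the class, enum, and parameter names are made up for illustration, and the handling of a segment below the expected sequence number is an assumption, since that case is not listed in the table.

```java
/** Sketch of the receiver's ACK decision; the enum and method names are illustrative. */
final class AckPolicy {
    enum Action { DELAYED_ACK, CUMULATIVE_ACK_NOW, DUPLICATE_ACK_NOW, ACK_NOW }

    /**
     * @param segSeq      sequence number of the arriving segment
     * @param expectedSeq next in-order byte the receiver is waiting for
     * @param ackPending  true if an earlier in-order segment still waits for its delayed ACK
     * @param gapExists   true if out-of-order data is buffered above expectedSeq
     */
    static Action onSegment(long segSeq, long expectedSeq, boolean ackPending, boolean gapExists) {
        if (segSeq > expectedSeq) {
            return Action.DUPLICATE_ACK_NOW;   // out-of-order arrival: gap detected
        }
        if (segSeq < expectedSeq) {
            return Action.DUPLICATE_ACK_NOW;   // assumption: old data is re-ACKed (not in the table)
        }
        if (gapExists) {
            return Action.ACK_NOW;             // segment starts at the lower end of the gap
        }
        if (ackPending) {
            return Action.CUMULATIVE_ACK_NOW;  // one ACK covers both in-order segments
        }
        return Action.DELAYED_ACK;             // wait up to 500 ms for the next segment
    }
}
```

For example, `AckPolicy.onSegment(120, 100, false, false)` models a segment arriving above the expected byte 100 and yields the duplicate-ACK action.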

More on Sender Policies

Doubling the Timeout Interval
Used by most TCP implementations
If a timeout occurs, then after the retransmission the Timeout Interval is doubled
Intervals grow exponentially with each consecutive timeout
When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
A limited form of congestion control
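A small sketch of the doubling rule may help. It assumes the reset value is EstimatedRTT + 4 * DevRTT (the usual formula from the RTT estimation discussion); the class name and the upper bound on the interval are illustrative assumptions, not part of the slides.

```java
/** Sketch of timeout-interval doubling; names and the cap are illustrative assumptions. */
final class TimeoutBackoff {
    private final long maxIntervalMs;   // assumed upper bound, not from the slides
    private long intervalMs;

    TimeoutBackoff(long initialIntervalMs, long maxIntervalMs) {
        this.intervalMs = initialIntervalMs;
        this.maxIntervalMs = maxIntervalMs;
    }

    /** Timer expired: the segment is retransmitted and the interval doubles. */
    long onTimeout() {
        intervalMs = Math.min(intervalMs * 2, maxIntervalMs);
        return intervalMs;
    }

    /** New data from above or an ACK arrived: fall back to the RTT-based value. */
    long onRestart(double estimatedRttMs, double devRttMs) {
        intervalMs = Math.round(estimatedRttMs + 4 * devRttMs);
        return intervalMs;
    }
}
```

Starting from 1000 ms, three consecutive timeouts give 2000, 4000, and 8000 ms; a later ACK brings the interval back to the RTT-based estimate.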

Fast Retransmit

Time-out period often relatively long
long delay before resending lost packet

Detect lost segments via duplicate ACKs
Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
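The ACK handler can also be written as a small runnable sketch. It models only SendBase and the duplicate-ACK counter; timers and the actual resend are left out. The class name, the -1 return convention, and the choice to ignore ACKs below SendBase are assumptions made for illustration.

```java
/** Sketch of the sender-side ACK handler; only the counters are modelled. */
final class FastRetransmitSender {
    private long sendBase;      // oldest unacknowledged byte
    private int dupAckCount;    // duplicate ACKs seen for sendBase

    FastRetransmitSender(long initialSendBase) {
        this.sendBase = initialSendBase;
    }

    /**
     * Processes an ACK with field value y.
     * Returns the sequence number to resend, or -1 if nothing is resent.
     */
    long onAck(long y) {
        if (y > sendBase) {
            sendBase = y;       // new data acknowledged
            dupAckCount = 0;
            return -1;          // a real sender would restart its timer here if data is outstanding
        }
        if (y < sendBase) {
            return -1;          // assumption: stale ACKs are ignored
        }
        dupAckCount++;          // duplicate ACK for already ACKed data
        return dupAckCount == 3 ? y : -1;   // third duplicate triggers fast retransmit
    }

    long sendBase() {
        return sendBase;
    }
}
```

With SendBase at 92, an ACK of 100 followed by three more ACKs of 100 makes the last call return 100, the segment to resend before the timer expires.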

TCP: GBN or Selective Repeat?

                                                                                                        Basic TCP looks a lot like GBN

                                                                                                        Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                        This looks a lot like Selective Repeat

                                                                                                        TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network
(But both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]
source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | Receive window
checksum | Urg data pointer
Options (variable length)
application data (variable length)

Annotations:
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection establishment (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence and acknowledgement numbers: counting by bytes of data (not segments)
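To make the layout concrete, here is a sketch that reads the fixed 20-byte part of the header from a byte array. The class and field names are illustrative; the field order, the 4-bit header length counted in 32-bit words, and the flag bit positions follow the standard TCP header. Options and checksum verification are not handled.

```java
/** Sketch: reads the fixed 20-byte TCP header fields from a byte array. */
final class TcpHeader {
    final int sourcePort, destPort, headerLengthBytes, flags, receiveWindow, checksum, urgentPointer;
    final long sequenceNumber, ackNumber;

    TcpHeader(byte[] segment) {
        java.nio.ByteBuffer buf = java.nio.ByteBuffer.wrap(segment); // big-endian by default
        sourcePort = buf.getShort() & 0xFFFF;
        destPort = buf.getShort() & 0xFFFF;
        sequenceNumber = buf.getInt() & 0xFFFFFFFFL;    // unsigned 32-bit byte counter
        ackNumber = buf.getInt() & 0xFFFFFFFFL;
        int lenAndFlags = buf.getShort() & 0xFFFF;
        headerLengthBytes = (lenAndFlags >>> 12) * 4;   // head len counts 32-bit words
        flags = lenAndFlags & 0x3F;                     // U A P R S F
        receiveWindow = buf.getShort() & 0xFFFF;        // # bytes rcvr willing to accept
        checksum = buf.getShort() & 0xFFFF;
        urgentPointer = buf.getShort() & 0xFFFF;
    }

    boolean fin() { return (flags & 0x01) != 0; }
    boolean syn() { return (flags & 0x02) != 0; }
    boolean rst() { return (flags & 0x04) != 0; }
    boolean psh() { return (flags & 0x08) != 0; }
    boolean ack() { return (flags & 0x10) != 0; }
    boolean urg() { return (flags & 0x20) != 0; }
}
```

Passing the first 20 bytes of a captured SYN segment would give `syn()` true, `ack()` false, and `headerLengthBytes` equal to 20 when no options are present.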

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
guarantees receive buffer doesn't overflow
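The window formula can be checked with a few numbers. The sketch below tracks only the two byte counters and the buffer size; the class and method names are illustrative, and real buffering of the data itself is omitted.

```java
/** Sketch of the receiver-side bookkeeping behind RcvWindow; no data is actually stored. */
final class ReceiveBuffer {
    private final int rcvBuffer;   // total buffer size in bytes
    private long lastByteRcvd;     // last byte placed in the buffer
    private long lastByteRead;     // last byte read by the application

    ReceiveBuffer(int rcvBuffer) {
        this.rcvBuffer = rcvBuffer;
    }

    /** RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead] */
    int rcvWindow() {
        return (int) (rcvBuffer - (lastByteRcvd - lastByteRead));
    }

    /** Accepts at most rcvWindow() bytes; returns how many were stored. */
    int receive(int bytes) {
        int accepted = Math.min(bytes, rcvWindow());
        lastByteRcvd += accepted;
        return accepted;
    }

    /** The application drains up to the requested number of buffered bytes. */
    int read(int bytes) {
        int drained = (int) Math.min(bytes, lastByteRcvd - lastByteRead);
        lastByteRead += drained;
        return drained;
    }
}
```

With a 4096-byte buffer, receiving 3000 bytes leaves a window of 1096; once the application reads data, the advertised window grows again by the amount read.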

Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
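One way to picture the rule is a function that decides how many bytes the sender may transmit right now. This is a simplified sketch: the class and parameter names are hypothetical, and sending the one-byte probe only when nothing is in flight is an assumption made to keep the example small.

```java
/** Sketch of a window-limited sender with a one-byte probe for a zero window. */
final class WindowLimitedSender {
    /** Bytes the sender may put on the wire now, given the last advertised window. */
    static int bytesAllowed(int rcvWindow, int unackedBytes, int pendingBytes) {
        if (pendingBytes == 0) {
            return 0;                             // nothing to send
        }
        if (rcvWindow == 0) {
            return unackedBytes == 0 ? 1 : 0;     // one data byte, just to provoke an ACK
        }
        // Normal case: keep unACKed data within RcvWindow.
        return Math.max(0, Math.min(pendingBytes, rcvWindow - unackedBytes));
    }
}
```

The ACK that answers the probe carries the current RcvWindow, so the sender learns when the buffer has drained.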


                                                                                                        Note on UDP

                                                                                                        UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
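The contrast with TCP shows up in a toy model of the socket buffer: there is no feedback to the sender, so an arriving datagram either fits or is dropped. The class name and the byte-based capacity are illustrative assumptions; real operating systems account for buffer space in their own ways.

```java
/** Sketch of a fixed-size UDP receive buffer that drops datagrams when full. */
final class UdpSocketBuffer {
    private final java.util.ArrayDeque<byte[]> queue = new java.util.ArrayDeque<>();
    private final int capacityBytes;
    private int usedBytes;
    private int dropped;

    UdpSocketBuffer(int capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Appends the datagram if it fits; otherwise it is silently lost. */
    boolean deliver(byte[] datagram) {
        if (usedBytes + datagram.length > capacityBytes) {
            dropped++;
            return false;
        }
        queue.addLast(datagram);
        usedBytes += datagram.length;
        return true;
    }

    /** The application reads the oldest datagram, or null if none is queued. */
    byte[] read() {
        byte[] d = queue.pollFirst();
        if (d != null) {
            usedBytes -= d.length;
        }
        return d;
    }

    int dropped() {
        return dropped;
    }
}
```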


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  allocates buffers
  can include application data

SYN=0 signals that the connection is established
server_isn+1 signals that the client is synchronized

[Figure: three-way handshake between client and server]
client to server: Connection request (SYN=1, seq=client_isn)
server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
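The sequence and acknowledgement numbers of the three segments follow mechanically from client_isn and server_isn. The sketch below builds them in order; the class and method names are illustrative, and wrap-around of 32-bit sequence numbers is ignored for simplicity.

```java
/** Sketch of the three handshake segments; only flags and numbers are modelled. */
final class Handshake {
    /** One control segment: SYN flag, ACK flag, sequence number, acknowledgement number. */
    static final class Segment {
        final boolean syn, ack;
        final long seq, ackNum;

        Segment(boolean syn, boolean ack, long seq, long ackNum) {
            this.syn = syn;
            this.ack = ack;
            this.seq = seq;
            this.ackNum = ackNum;
        }
    }

    /** Step 1: SYN=1, seq=client_isn. */
    static Segment connectionRequest(long clientIsn) {
        return new Segment(true, false, clientIsn, 0);
    }

    /** Step 2: SYN=1, seq=server_isn, ack=client_isn+1. */
    static Segment connectionGranted(Segment syn, long serverIsn) {
        return new Segment(true, true, serverIsn, syn.seq + 1);
    }

    /** Step 3: SYN=0, seq=client_isn+1, ack=server_isn+1. */
    static Segment finalAck(Segment synAck) {
        return new Segment(false, true, synAck.ackNum, synAck.seq + 1);
    }
}
```

With client_isn = 1000 and server_isn = 5000, the final ACK carries seq = 1001 and ack = 5001.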

TCP Connection Management (cont.)

Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close timeline. Client sends FIN (close); server replies ACK, then sends FIN (close); client replies ACK and enters timed wait; closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: connection close timeline. Client FIN, server ACK, server FIN, client ACK; both sides closing; client in timed wait, then closed; server closed.]
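The client side of the four closing steps can be sketched as a small state machine. The state names FIN_WAIT_1, FIN_WAIT_2, and TIME_WAIT are the standard TCP state names rather than labels taken from the slide text, and the event names are made up for this example. Simultaneous FINs are not handled.

```java
/** Sketch of the client's states while closing a connection. */
final class ClientTeardown {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }
    enum Event { CLOSE, RECV_ACK, RECV_FIN, TIMED_WAIT_EXPIRED }

    /** Returns the next state; events that do not apply leave the state unchanged. */
    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED:   // step 1: application closes, client sends FIN
                return e == Event.CLOSE ? State.FIN_WAIT_1 : s;
            case FIN_WAIT_1:    // step 2: server ACKs the FIN
                return e == Event.RECV_ACK ? State.FIN_WAIT_2 : s;
            case FIN_WAIT_2:    // step 3: server's FIN arrives, client replies with ACK
                return e == Event.RECV_FIN ? State.TIME_WAIT : s;
            case TIME_WAIT:     // repeated FINs are re-ACKed; state ends when the wait expires
                return e == Event.TIMED_WAIT_EXPIRED ? State.CLOSED : s;
            default:
                return s;
        }
    }
}
```

A FIN that arrives during TIME_WAIT leaves the client in TIME_WAIT, which matches the point that the client keeps answering FINs in case its ACK was lost.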

TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                        A few special cases

                                                                                                        Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                        It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

[Figure annotations: large delays when congested; maximum achievable throughput]

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ the rate of original plus retransmitted data, and $\lambda_{out}$ the delivered rate.

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted.

                                                                                                        Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

ABR: available bit rate:
- "elastic service"
- if sender's path "underloaded": sender should use available bandwidth
- if sender's path congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

Two-byte ER (explicit rate) field in RM cell:
- congested switch may lower ER value in cell
- sender's send rate is thus the minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
- signals congestion
- if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
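The ER and EFCI rules can be illustrated with a small sketch. The `RMCell` record, the function names and the rate values are hypothetical and chosen only for illustration; they do not follow the ATM cell format.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate carried in the cell (illustrative unit: Mbps)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def traverse(cell, switch_rates):
    """Each switch may lower ER to the rate it can support."""
    for rate in switch_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


cell = traverse(RMCell(er=100.0), switch_rates=[80.0, 35.0, 60.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)  # 35.0 True
```

The sender ends up with the minimum supportable rate on the path (35.0 here) because every switch can only lower the ER value, never raise it.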

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

- end-end control (no network assistance)
- transmission rate limited by congestion window size, CongWin, over segments
- CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window of size CongWin]

- w segments, each with MSS bytes, sent in one RTT:

  throughput = (w × MSS) / RTT bytes/sec
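As a quick numeric check of the throughput formula, here is a minimal sketch. The function name and the sample values (10 segments, 1460-byte MSS, 0.25 s RTT) are assumptions chosen for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent in one RTT."""
    return w_segments * mss_bytes / rtt_seconds


rate = window_throughput(w_segments=10, mss_bytes=1460, rtt_seconds=0.25)
print(rate)  # 58400.0 bytes/sec
```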

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 MSS and then grows exponentially
- conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: long-lived TCP connection; congestion window (8, 16, 24 Kbytes) versus time]
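The sawtooth shape of AIMD can be reproduced with a short simulation. This is a sketch under simplifying assumptions: the window is counted in whole MSS units, the rounds in which a loss occurs are supplied by hand, and the function name `aimd` is hypothetical.

```python
def aimd(rounds, loss_rounds, start=8):
    """Return CongWin (in MSS) at the start of each RTT.

    Additive increase: +1 MSS per RTT without loss.
    Multiplicative decrease: halve CongWin after a loss event.
    """
    congwin = start
    trace = []
    for rtt in range(rounds):
        trace.append(congwin)
        if rtt in loss_rounds:
            congwin = max(1, congwin // 2)  # never below 1 MSS
        else:
            congwin += 1
    return trace


print(aimd(rounds=10, loss_rounds={4, 8}))
# [8, 9, 10, 11, 12, 6, 7, 8, 9, 4]
```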

                                                                                                        TCP Slow Start

When connection begins, CongWin = 1 MSS
- Example: MSS = 500 bytes and RTT = 200 msec
- initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
- desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
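The 20 kbps figure follows from sending one 500-byte segment per 200 msec RTT. The sketch below checks that arithmetic and, assuming the rate doubles every RTT with no loss, counts how many RTTs are needed to reach a target rate. The function names and the 1000 kbps target are illustrative assumptions.

```python
def initial_rate_kbps(mss_bytes, rtt_ms):
    """One MSS per RTT; bits per millisecond is the same as kbps."""
    return mss_bytes * 8 / rtt_ms


def rtts_to_reach(target_kbps, mss_bytes, rtt_ms):
    """Number of doublings before the rate first reaches target_kbps."""
    rate = initial_rate_kbps(mss_bytes, rtt_ms)
    rtts = 0
    while rate < target_kbps:
        rate *= 2
        rtts += 1
    return rtts


print(initial_rate_kbps(500, 200))    # 20.0
print(rtts_to_reach(1000, 500, 200))  # 6
```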

                                                                                                        TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
- double CongWin every RTT
- done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A and Host B timelines; Host A sends one segment, then two segments, then four segments in successive RTTs]
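Why a per-ACK increment yields doubling per RTT can be seen in a small sketch. It assumes one ACK per segment, no loss and a window counted in MSS units; the function name `slow_start` is hypothetical.

```python
def slow_start(rtts):
    """Segments sent in each RTT when CongWin grows by 1 MSS per ACK."""
    congwin = 1  # in MSS
    sent = []
    for _ in range(rtts):
        sent.append(congwin)
        for _ack in range(sent[-1]):  # one ACK arrives per segment sent
            congwin += 1
    return sent


print(slow_start(4))  # [1, 2, 4, 8]
```

Each of the w segments sent in a round produces one ACK, so the window grows from w to 2w within one RTT.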

So far:
- Slow-Start ramps up exponentially
- followed by AIMD sawtooth pattern

Reality (TCP Reno):
- introduce new variable, threshold
- threshold initially very large
- Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
- two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treats both types of loss events the same and always goes to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
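The difference between the Tahoe and Reno reactions can be written as one function. This is a minimal sketch: the function name, the string labels for events and variants, and the use of whole MSS units are assumptions made for illustration.

```python
def on_loss(congwin, event, variant="reno"):
    """Return (new_congwin, new_threshold) in MSS after a loss event."""
    if event not in ("timeout", "3dupack"):
        raise ValueError(f"unknown loss event: {event}")
    threshold = max(congwin // 2, 1)
    if event == "timeout" or variant == "tahoe":
        return 1, threshold          # restart from slow start
    return threshold, threshold      # Reno fast recovery: skip slow start


print(on_loss(16, "3dupack", "reno"))   # (8, 8)
print(on_loss(16, "timeout", "reno"))   # (1, 8)
print(on_loss(16, "3dupack", "tahoe"))  # (1, 8)
```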

Summary: TCP Congestion Control

- When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
- When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                        The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
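The event/state table maps naturally onto a small state machine. The class below is a toy sketch, not a real TCP implementation: the class and method names are hypothetical, sizes are in bytes, and resetting the duplicate-ACK counter when new data is acked is an assumption the table does not spell out.

```python
class RenoSender:
    """Toy congestion-control state following the event/state table."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.congwin = mss
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: counter restarts on new data
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += self.mss * self.mss / self.congwin

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, self.mss)  # not below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.congwin)   # CA 5000
for _ in range(3):
    s.on_dup_ack()
print(s.state, s.congwin)   # CA 2500.0
s.on_timeout()
print(s.state, s.congwin, s.threshold)   # SS 1000 1250.0
```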

                                                                                                        TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
- Ignore slow start
- Let W be the window size when loss occurs
- When window is W, throughput is W/RTT
- Just after loss, window drops to W/2, throughput to W/(2·RTT)
- Average throughput: 0.75 W/RTT
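The 0.75 factor is the mean of a window that climbs linearly from W/2 to W. The sketch below checks this numerically, assuming an even W counted in segments and one sample per RTT; the function name is hypothetical.

```python
def average_window(w):
    """Mean window over one sawtooth cycle: W/2, W/2 + 1, ..., W segments."""
    samples = range(w // 2, w + 1)
    return sum(samples) / len(samples)


w = 100
print(average_window(w) / w)  # 0.75
```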

                                                                                                        TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

\[
\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}
\]

- Requires L = 2 × 10^-10. Wow!
- New versions of TCP for high-speed needed
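Both numbers in this example can be recomputed from the stated inputs. The sketch below inverts the throughput formula, throughput = 1.22 · MSS / (RTT · √L), to solve for L; the function names are assumptions for illustration.

```python
def window_for_rate(rate_bps, mss_bytes, rtt_s):
    """In-flight segments needed to sustain rate_bps."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def loss_rate_for(rate_bps, mss_bytes, rtt_s):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


print(round(window_for_rate(10e9, 1500, 0.1)))  # 83333
print(loss_rate_for(10e9, 1500, 0.1))           # roughly 2.1e-10
```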

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
- additive increase gives slope of 1, as throughput increases
- multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis running to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
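The convergence toward the equal-share line can be simulated directly. This sketch assumes two synchronized flows that both see a loss whenever their combined rate exceeds the capacity; the function name, starting rates and capacity are hypothetical.

```python
def share_link(x1, x2, capacity, rounds):
    """Two AIMD flows: +1 each per RTT, both halve when the link is overloaded."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1   # additive increase
    return x1, x2


x1, x2 = share_link(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(abs(x1 - x2))  # close to zero: the flows end up with nearly equal rates
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in half, so the gap shrinks geometrically.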

Fairness (more)

Fairness and UDP:
- multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
- current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections:
- nothing prevents app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
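The parallel-connection example is plain per-connection sharing. A minimal sketch, assuming every connection on the link receives an equal share of R (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(existing, new_connections):
    """Fraction of R obtained by an app that opens new_connections."""
    return Fraction(new_connections, existing + new_connections)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```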

TCP Latency Modeling

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

   where K is the number of windows that cover the object
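The two cases can be folded into one function. In this sketch K is computed as O/(W·S) rounded up, which is an assumption about how "windows that cover the object" is counted; the function name and sample values are also illustrative.

```python
from math import ceil


def fixed_window_latency(O, S, R, W, rtt):
    """Latency in seconds for O bits, MSS of S bits, rate R bps, window W."""
    K = ceil(O / (W * S))              # windows that cover the object
    stall = S / R + rtt - W * S / R    # idle time per window if the ACK is late
    if stall <= 0:
        return 2 * rtt + O / R         # case 1: server never waits
    return 2 * rtt + O / R + (K - 1) * stall   # case 2


# 100 kbit object, 4 kbit segments, 1 Mbps link, 100 ms RTT
print(fixed_window_latency(O=100_000, S=4_000, R=1_000_000, W=30, rtt=0.1))
print(fixed_window_latency(O=100_000, S=4_000, R=1_000_000, W=4, rtt=0.1))
```

With W = 30 the window takes longer to send than the first ACK takes to return, so the latency is just 2RTT + O/R (about 0.3 s); with W = 4 the server stalls after each of the first K - 1 = 6 windows and the latency rises to about 0.83 s.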

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[
\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\]

where P is the number of times TCP idles at server:

\[
P = \min\{Q, K - 1\}
\]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server; initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles: P = min{K-1, Q} times
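The example values K = 4, Q = 2 and P = 2 can be reproduced by counting windows directly. In this sketch K and Q are found by iteration from their verbal definitions rather than from a closed form, which is an assumption about how they are meant to be computed; the function name and the link parameters (S = 1000 bits, R = 4000 bps, RTT = 0.5 s) are made up so that O/S = 15.

```python
def slow_start_latency(O, S, R, rtt):
    """Return (K, Q, P, latency) for an object of O bits under slow start."""
    # K: windows of 1, 2, 4, ... segments needed to cover the object
    K, covered = 0, 0
    while covered < O:
        covered += (2 ** K) * S
        K += 1
    # Q: windows after which the server would idle for an infinite object,
    # i.e. while S/R + RTT - 2^(k-1) * S/R is still non-negative
    Q = 0
    while S / R + rtt - (2 ** Q) * S / R >= 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


print(slow_start_latency(O=15_000, S=1_000, R=4_000, rtt=0.5))
# (4, 2, 2, 5.5)
```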

                                                                                                        TCP Latency Modeling (3)

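$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - \left(2^{P} - 1\right)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timeline. The client initiates the TCP connection and requests the object; the server sends the first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and the object is delivered. Axes: time at client, time at server.]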
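The delay expression on this slide can be checked numerically. The following is a minimal sketch, assuming hypothetical values (S = 1000 bits, R = 1000 bps, RTT = 2 s, O = 15 segments, P = 2) chosen only so that the arithmetic is easy to follow. The function names are illustrative and do not come from the slides.

```python
def idle_time(k, S, R, RTT):
    """Idle time after the k-th window: [S/R + RTT - 2^(k-1) * S/R]^+."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def delay_by_sum(O, S, R, RTT, P):
    """O/R + 2 RTT + the sum of the first P idle times."""
    return O / R + 2 * RTT + sum(idle_time(k, S, R, RTT) for k in range(1, P + 1))


def delay_closed_form(O, S, R, RTT, P):
    """O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical parameters: S/R = 1 s per segment, O/S = 15 segments.
O, S, R, RTT, P = 15_000, 1_000, 1_000, 2.0, 2
print(delay_by_sum(O, S, R, RTT, P))       # 22.0
print(delay_closed_form(O, S, R, RTT, P))  # 22.0
```

With these numbers the two idle periods are 2 s and 1 s, so both forms give 15 + 4 + 3 = 22 s.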


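TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.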
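As a rough illustration of K, Q and P, the sketch below computes K both by the defining minimum and by the closed form, and counts Q directly from the idle-time expression. Counting a window as an idle when the idle time is greater than or equal to zero is an assumed convention (a zero-length idle does not change the delay). The values S = 1000 bits, R = 1000 bps and RTT = 2 s are hypothetical, and the function names are illustrative.

```python
import math


def windows_to_cover(O, S):
    """K: smallest k with (2^k - 1) * S >= O, using integer arithmetic."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)); float rounding may matter for very large O/S."""
    return math.ceil(math.log2(O / S + 1))


def idles_infinite_object(S, R, RTT):
    """Q: number of windows k with S/R + RTT - 2^(k-1) * S/R >= 0."""
    k = 1
    while S / R + RTT - 2 ** (k - 1) * S / R >= 0:
        k += 1
    return k - 1


# Hypothetical parameters giving O/S = 15 segments, as in the slide example.
O, S, R, RTT = 15_000, 1_000, 1_000, 2.0
K = windows_to_cover(O, S)
Q = idles_infinite_object(S, R, RTT)
P = min(K - 1, Q)
print(K, Q, P)  # 4 2 2
```

These values reproduce the example with O/S = 15 segments: K = 4 windows, Q = 2, and P = min{K-1, Q} = 2.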


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
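The three response-time formulas can be compared side by side. This is a minimal sketch that leaves the slow-start idle times as an optional argument defaulting to zero, which is a simplifying assumption. The example uses RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5; R = 1 Mbps and 1 Kbyte = 1000 bytes are assumptions made for the example. Function names are illustrative.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M+1) 2 RTT + sum of idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + 3 RTT + sum of idle times."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M/X + 1) 2 RTT + sum of idle times; M/X must be an integer."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


# Assumed example values: O = 5 Kbytes = 40,000 bits, R = 1 Mbps, RTT = 0.1 s.
M, X, O, R, RTT = 10, 5, 5 * 1000 * 8, 1_000_000, 0.1
for name, t in [
    ("non-persistent", non_persistent(M, O, R, RTT)),
    ("persistent", persistent(M, O, R, RTT)),
    ("parallel non-persistent", parallel_non_persistent(M, X, O, R, RTT)),
]:
    print(f"{name}: {t:.2f} s")
```

With idle times ignored, this gives about 2.64 s, 0.74 s and 1.04 s respectively, so the numbers need not match a plot that includes slow-start idle time.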


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays.

Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary


Selective repeat: sender, receiver windows


Selective repeat

sender

data from above:
• if next available seq # in window, send pkt

timeout(n):
• resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N]:
• mark pkt n as received
• if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
• ACK(n) (note: this is a reACK)

otherwise:
• ignore
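The receiver rules on this slide can be sketched as a small class. This is an illustrative sketch only: it assumes unbounded sequence numbers (no wraparound), and the class and method names are hypothetical rather than taken from the slides.

```python
class SRReceiver:
    """Selective-repeat receiver with window size N (sequence numbers do not wrap)."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcv_base = 0     # smallest sequence number not yet delivered
        self.buffer = {}      # packets received but not yet delivered in order
        self.delivered = []   # data passed up to the application, in order

    def on_packet(self, n, data):
        """Handle pkt n; return the ACK number to send, or None to ignore."""
        if self.rcv_base <= n <= self.rcv_base + self.N - 1:
            # In window: buffer it, then deliver any in-order run from rcv_base.
            self.buffer[n] = data
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1  # advance window to next not-yet-received pkt
            return n
        if self.rcv_base - self.N <= n <= self.rcv_base - 1:
            # Already delivered: re-ACK so the sender can advance its window.
            return n
        return None  # otherwise: ignore


r = SRReceiver(window_size=4)
print(r.on_packet(1, "b"), r.delivered)  # 1 []
print(r.on_packet(0, "a"), r.delivered)  # 0 ['a', 'b']
print(r.on_packet(0, "a"), r.delivered)  # 0 ['a', 'b']
```

The third call shows the re-ACK case: pkt 0 is below rcvbase, so it is acknowledged again but not delivered a second time.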


                                                                                                          Selective repeat in action


Selective repeat: dilemma

Example: seq #'s 0, 1, 2, 3; window size = 3

receiver sees no difference in the two scenarios
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
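One way to explore the question is to check the worst case directly: the receiver has accepted a full window and moved on, but every ACK was lost, so the sender retransmits the old window. The sketch below (the function name `sr_ambiguous` is made up for this example, and it assumes the window is no larger than the sequence space) tests whether a retransmitted old number can land in the receiver's new window.

```python
def sr_ambiguous(seq_space, window):
    """True if a retransmitted old packet can fall inside the receiver's new window."""
    old = {n % seq_space for n in range(window)}              # sender may resend these
    new = {n % seq_space for n in range(window, 2 * window)}  # receiver now expects these
    return bool(old & new)


print(sr_ambiguous(4, 3))  # the slide's example: 4 seq #'s, window size 3
print(sr_ambiguous(4, 2))  # same sequence space, smaller window
```

With these definitions the two windows stay disjoint exactly when the window size is at most half the sequence-number space, which is the usual answer to the question.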


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

full duplex data: bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented: handshaking (exchange of control msgs) initializes sender and receiver state before data exchange

flow controlled: sender will not overwhelm receiver

point-to-point: one sender, one receiver

reliable, in-order byte stream: no "message boundaries"

pipelined: TCP congestion and flow control set window size

send & receive buffers

[Figure: application writes data -> socket door -> TCP send buffer -> segment -> TCP receive buffer -> socket door -> application reads data]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application data + TCP header = TCP segment

Three-way handshake
  Client sends special TCP segment to server requesting connection. No payload (application data) in this segment.
  Server responds with second special TCP segment (again, no payload).
  Client responds with third special segment. This can contain payload.


                                                                                                          Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP exists only in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the host and server


                                                                                                          TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]

source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | receive window
checksum | urg data pointer
options (variable length)
application data (variable length)

Field notes:
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence and acknowledgement numbers: counting by bytes of data (not segments)


TCP seq. #'s and ACKs

Seq. #'s:
  byte-stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; up to implementer

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
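The numbers in the telnet scenario follow from one rule: the ACK field is the peer's sequence number plus the number of data bytes received. A small sketch, assuming a dictionary as a stand-in for a segment and hypothetical helper names `make_segment` and `reply_to`:

```python
def make_segment(seq, ack, data=b""):
    """A toy segment: only the fields the telnet example needs."""
    return {"seq": seq, "ack": ack, "data": data}


def reply_to(segment, my_next_seq, data=b""):
    """Build a reply whose ACK is the peer's seq # plus the bytes just received."""
    return make_segment(my_next_seq, segment["seq"] + len(segment["data"]), data)


a1 = make_segment(42, 79, b"C")      # A -> B: user types 'C'
b1 = reply_to(a1, a1["ack"], b"C")   # B -> A: ACKs 'C' and echoes it back
a2 = reply_to(b1, b1["ack"])         # A -> B: ACKs the echoed 'C', no data

for name, seg in (("A", a1), ("B", b1), ("A", a2)):
    print(name, seg["seq"], seg["ack"], seg["data"])
```

In this exchange each side's next sequence number happens to equal the ACK value it last received, which is why the helper can reuse the peer's `ack` field.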


                                                                                                          TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss


                                                                                                          TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) × EstimatedRTT + α × SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
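The moving average is easy to try out. The sketch below applies the formula with α = 0.125 to a few made-up samples (the sample values and the function name `update_estimated_rtt` are assumptions for illustration only).

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """One EWMA step: keep (1 - alpha) of the old estimate, blend in the new sample."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


estimate = 100.0                              # ms, starting estimate (made up)
for sample in [100.0, 180.0, 100.0, 100.0]:   # ms, made-up SampleRTT values
    estimate = update_estimated_rtt(estimate, sample)
    print(round(estimate, 2))
```

A single 180 ms spike moves the estimate by only 10 ms, and its influence then shrinks by a factor of (1 - α) with every later sample.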


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), about 100 to 350; curves: SampleRTT and EstimatedRTT]


                                                                                                          TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) × DevRTT + β × |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 × DevRTT
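The two updates and the timeout formula fit together as in the sketch below. Two details are not stated on the slide and are assumptions here, borrowed from RFC 6298: DevRTT starts at half the first sample, and DevRTT is updated with the old EstimatedRTT before EstimatedRTT absorbs the new sample. The class name `RttEstimator` is made up.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2      # assumed initial value (as in RFC 6298)

    def update(self, sample_rtt):
        # Deviation first, using the estimate from before this sample (assumption).
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # ms, made-up value
est.update(140.0)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval())
```

The factor of 4 makes the margin grow quickly when samples are jittery and shrink toward EstimatedRTT when they are steady.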


                                                                                                          TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }

} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is acked
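The simplified sender can be turned into a small runnable model. This is a sketch under stated assumptions: the timer is reduced to a boolean flag, "pass segment to IP" is replaced by appending to a list, and the names `SimplifiedTcpSender`, `unacked`, and `sent` are invented for the example.

```python
class SimplifiedTcpSender:
    """Event handlers for the simplified sender: no dup-ACK, flow, or congestion logic."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # -> data for segments not yet acknowledged
        self.timer_running = False   # stand-in for the single retransmission timer
        self.sent = []               # (seq #, data) log standing in for "pass to IP"

    def on_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            return
        seq = min(self.unacked)      # not-yet-acknowledged segment with smallest seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True    # restart the timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment whose bytes all lie below y.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_data(b"x" * 8)     # Seq=92, 8 bytes
sender.on_data(b"y" * 20)    # Seq=100, 20 bytes
sender.on_ack(120)           # one cumulative ACK covers both segments
print(sender.send_base, sender.timer_running)
```

An ACK with y at or below SendBase changes nothing here, which is exactly the gap that duplicate-ACK handling fills later.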


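TCP retransmission scenarios

[Figure: two timelines between Host A (sender) and Host B (receiver)]

Lost ACK scenario:
  A sends Seq=92, 8 bytes data
  B sends ACK=100; the ACK is lost (X)
  Seq=92 timeout at A; A resends Seq=92, 8 bytes data
  B sends ACK=100 again
  SendBase = 100

Premature timeout scenario:
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B sends ACK=100, then ACK=120
  Seq=92 timeout at A before the ACKs arrive; A resends Seq=92, 8 bytes data
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  B answers the retransmission with ACK=120; SendBase = 120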


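TCP retransmission scenarios (more)

[Figure: timeline between Host A (sender) and Host B (receiver)]

Cumulative ACK scenario:
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B sends ACK=100; the ACK is lost (X)
  B sends ACK=120, which arrives before the timeout
  SendBase = 120; nothing is retransmitted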


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap


                                                                                                          More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
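The doubling rule can be sketched as a small state holder. The class name `BackoffTimer` and the example RTT values are assumptions; the reset value is EstimatedRTT + 4 × DevRTT, the timeout formula used earlier in these notes.

```python
class BackoffTimer:
    """Timeout interval that doubles on each timeout and resets on new data or an ACK."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.base_interval()

    def base_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        """Called after the retransmission that follows a timeout."""
        self.interval *= 2
        return self.interval

    def on_restart(self):
        """Called when the timer restarts because of new data or a received ACK."""
        self.interval = self.base_interval()
        return self.interval


timer = BackoffTimer(estimated_rtt=100.0, dev_rtt=10.0)   # ms, made-up values
print(timer.interval, timer.on_timeout(), timer.on_timeout(), timer.on_restart())
```

Backing off this way slows retransmissions when repeated timeouts suggest congestion, which is why the slide calls it a limited form of congestion control.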


                                                                                                          Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires


                                                                                                          Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
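The ACK handler above might look as follows in Python. This is a sketch, not a real TCP stack: the class name, the `unacked_outstanding` argument, and the `retransmitted` list are hypothetical. It also makes one simplification: it keeps a single duplicate counter that is reset whenever SendBase advances.

```python
class FastRetransmitSender:
    """Hypothetical sender-side ACK handler with fast retransmit."""

    def __init__(self, send_base=0):
        self.send_base = send_base
        self.dup_acks = 0            # duplicate ACKs seen for send_base
        self.timer_running = False
        self.retransmitted = []      # sequence numbers resent early

    def on_ack(self, y, unacked_outstanding):
        if y > self.send_base:
            # New data acknowledged: advance the window.
            self.send_base = y
            self.dup_acks = 0
            # Restart the timer only if segments are still unacknowledged.
            self.timer_running = unacked_outstanding
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                # Fast retransmit: resend before the timer expires.
                self.retransmitted.append(y)
```

Only the third duplicate triggers the resend; a fourth or fifth duplicate for the same `y` does not resend again.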


TCP: GBN or Selective Repeat?

                                                                                                          Basic TCP looks a lot like GBN

                                                                                                          Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                          This looks a lot like Selective Repeat

                                                                                                          TCP is a hybrid


                                                                                                          TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data.
If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate.
Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)


TCP Flow Control

Receive side of a TCP connection has a receive buffer.
The app process may be slow at reading from the buffer.

Flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast.

Speed-matching service: matching the send rate to the receiving app's drain rate.


TCP segment structure

Header rows, each 32 bits wide, from top to bottom:
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pointer
    Options (variable length)
    application data (variable length)

Notes on the fields:
    sequence and acknowledgement numbers: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
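To make the field layout concrete, here is a sketch that decodes the 20 fixed header bytes with Python's standard `struct` module. The function name and the dictionary keys are illustrative. The bit positions of the six flags follow the standard TCP header layout, which the slide itself does not spell out.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / flags" word
# (standard TCP layout; FIN is the least significant bit).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(segment):
    """Decode the fixed 20-byte TCP header; options are not interpreted."""
    (src_port, dest_port, seq, ack,
     offset_flags, rcv_window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    # Header length is carried in the top 4 bits, in units of 32-bit words.
    head_len = (offset_flags >> 12) * 4
    flags = {name for name, bit in FLAG_BITS.items() if offset_flags & bit}
    return {
        "src_port": src_port, "dest_port": dest_port,
        "seq": seq, "ack": ack,
        "head_len": head_len, "flags": flags,
        "rcv_window": rcv_window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[head_len:],   # application data follows the header
    }
```

The `!` in the format string selects network (big-endian) byte order, which is how the fields travel on the wire.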


TCP Flow control: how it works

(Suppose the TCP receiver discards out-of-order segments.)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including the value of RcvWindow in segments.
Sender limits unACKed data to RcvWindow: this guarantees that the receive buffer doesn't overflow.
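Both sides of this rule fit in two small functions. The sketch below uses hypothetical function names; LastByteSent and LastByteAcked are the sender-side counters that appear later in the congestion control slides.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room left in the receive buffer, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, nbytes, window):
    """Sender side: would sending nbytes more keep unACKed data within window?"""
    unacked_after_send = (last_byte_sent + nbytes) - last_byte_acked
    return unacked_after_send <= window
```

For example, a 4096-byte buffer holding 2000 unread bytes advertises a window of 2096 bytes, so a sender with 1000 bytes already in flight may send at most 1096 more.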


                                                                                                          Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
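A toy simulation shows why the one-byte probes break the deadlock. Everything in it is an assumption made for illustration: time advances in discrete ticks, the buffer holds 4 bytes and starts full, and the application drains it at a chosen tick.

```python
def ticks_until_window_opens(probe_enabled, app_reads_at_tick=3, max_ticks=10):
    """Return the tick at which the sender learns RcvWindow > 0, or None."""
    rcv_buffer = 4     # buffer size in bytes (toy value)
    buffered = 4       # buffer starts full and fully ACKed, so RcvWindow = 0
    for tick in range(1, max_ticks + 1):
        if tick == app_reads_at_tick:
            buffered = 0   # the application finally reads everything
        if probe_enabled:
            # The sender's one-byte probe forces the receiver to reply with
            # an ACK that carries its current RcvWindow.
            advertised = rcv_buffer - buffered
            if advertised > 0:
                return tick
        # Without probes the receiver has nothing to ACK, so it stays silent
        # and the sender's view of RcvWindow never changes.
    return None
```

With probing, the sender learns about the free space on the same tick the application drains the buffer. Without it, the function returns `None`, which is the deadlock described above.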


                                                                                                          Note on UDP

                                                                                                          UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
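The contrast with TCP can be sketched as a bounded queue that silently drops on overflow. The class is hypothetical, and it counts capacity in packets rather than bytes to keep the example short.

```python
from collections import deque

class UdpReceiveBuffer:
    """Toy model of a UDP socket buffer: no feedback to the sender."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumption: measured in packets
        self.queue = deque()
        self.dropped = 0

    def on_packet(self, packet):
        if len(self.queue) >= self.capacity:
            # Buffer full: the packet is lost and the sender is never told.
            self.dropped += 1
            return False
        self.queue.append(packet)
        return True

    def read(self):
        # The application drains packets in arrival order.
        return self.queue.popleft() if self.queue else None
```

Nothing here tells the sender to slow down, which is exactly what the RcvWindow field provides in TCP.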


                                                                                                          TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
Initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    replies with client_isn+1 in ACK field to signal synchronization
    specifies server_isn
    no application data


                                                                                                          TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
    allocates buffers
    can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

Segments exchanged:
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
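The sequence and acknowledgement numbers in the three segments can be traced with a short sketch. The function name and the dictionary representation of a segment are illustrative only; real segments carry many more fields.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as simple dictionaries."""
    # Step 1: client -> server, connection request.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server -> client, acknowledges the SYN with client_isn + 1.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client -> server, acknowledges the SYNACK with server_isn + 1.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm that they are synchronized.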


                                                                                                          TCP Connection Management (cont)

Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

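[Figure: connection close timeline. The client sends FIN; the server replies with ACK and then sends its own FIN; the client replies with ACK and enters timed wait before it is closed.]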


                                                                                                          TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
    enters "timed wait", during which it will respond with an ACK to received FINs (which might arrive if its ACK gets lost)
    closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, can handle simultaneous FINs.

[Figure: the same close timeline with both sides marked closing and then closed; the client passes through timed wait before it is closed.]
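The client side of the four closing steps can be written as a small transition table. The state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) come from the TCP specification rather than from these slides, and the event strings are made up for this sketch.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "FIN"),
    ("FIN_WAIT_1", "recv_ack"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv_fin"): ("TIME_WAIT", "ACK"),
    # During timed wait, a repeated FIN means our ACK was lost: ACK again.
    ("TIME_WAIT", "recv_fin"): ("TIME_WAIT", "ACK"),
    ("TIME_WAIT", "timer_expired"): ("CLOSED", None),
}

def run_client_close(events, state="ESTABLISHED"):
    """Feed events through the table; return the final state and segments sent."""
    sent = []
    for event in events:
        state, segment = CLIENT_CLOSE[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent
```

Feeding the events `app_close`, `recv_ack`, `recv_fin`, `timer_expired` walks the client from ESTABLISHED to CLOSED while sending one FIN and one ACK.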


                                                                                                          TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                          A few special cases

                                                                                                          Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                          It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                                                          Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for the network to handle"
    different from flow control!
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


Here λin is the rate of original data sent, λ'in is the offered load (original data plus retransmissions), and λout is the rate delivered to the receiver.

(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in the buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
    (b) and (c): more work (retrans) for given "goodput"
    (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, labelled (a), (b) and (c), one per case]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in (original data plus retransmissions) increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                          Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
    "elastic service"
    if sender's path "underloaded": sender should use available bandwidth
    if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate is thus the minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
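The effect of the ER field can be illustrated with a short loop. The function name is hypothetical and rates are plain numbers in arbitrary units. Each switch on the path may lower the ER value to what it can support, so the value that returns to the sender is the minimum along the path.

```python
def explicit_rate_after_path(initial_er, switch_supportable_rates):
    """Return the ER value in an RM cell after it has crossed every switch."""
    er = initial_er
    for supportable in switch_supportable_rates:
        # A switch may lower ER but never raises it.
        er = min(er, supportable)
    return er
```

With a requested rate of 100 and switches that can support 80, 50 and 90, the sender is told to send at 50.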


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: a congestion window CongWin spanning w segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
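As a quick check of the throughput formula, the sketch below evaluates it for one set of made-up numbers. The function name and the sample values (10 segments of 500 bytes per 200 msec RTT) are chosen only for illustration.

```python
def window_throughput(w, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w * mss_bytes / rtt_seconds

# 10 segments x 500 bytes every 0.2 s is roughly 25,000 bytes/sec.
rate = window_throughput(10, 500, 0.2)
```

Doubling the window doubles the rate, which is why adjusting CongWin is enough to control how fast the sender transmits.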


To simplify the presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: the sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does the sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after a loss event

Three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 MSS and then grows exponentially
    conservative after timeout events

TCP AIMD

- multiplicative decrease: cut CongWin in half after loss event
- additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
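The sawtooth can be reproduced with a few lines of Python. This is only a sketch: the function name `aimd_trace` and the `loss_at` parameter (the window size at which we pretend a loss occurs) are illustrative assumptions, not part of the slides.

```python
def aimd_trace(mss, start, loss_at, rounds):
    """Return CongWin (bytes) at the start of each RTT under pure AIMD.

    mss     -- segment size in bytes
    start   -- initial CongWin in bytes
    loss_at -- window size (bytes) at which a loss is assumed to occur
    rounds  -- number of RTTs to simulate
    """
    congwin = start
    trace = []
    for _ in range(rounds):
        trace.append(congwin)
        if congwin >= loss_at:
            # multiplicative decrease: cut CongWin in half after a loss event
            congwin = max(mss, congwin // 2)
        else:
            # additive increase: +1 MSS per RTT in the absence of loss
            congwin += mss
    return trace


trace = aimd_trace(mss=1000, start=8000, loss_at=16000, rounds=20)
print(trace)
```

With these inputs the window climbs from 8000 to 16000 bytes in 1000-byte steps, drops back to 8000, and repeats.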

TCP Slow Start

- When connection begins, CongWin = 1 MSS
  - Example: MSS = 500 bytes and RTT = 200 msec
  - initial rate = 20 kbps
- available bandwidth may be >> MSS/RTT
  - desirable to quickly ramp up to respectable rate
- When connection begins, increase rate exponentially fast until first loss event
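The 20 kbps figure is MSS/RTT expressed in bits per second. A small sketch of the arithmetic follows; the function name `window_rate_bps` is illustrative, not from the slides.

```python
def window_rate_bps(w, mss_bytes, rtt_s):
    """Sending rate in bits/sec when w segments of mss_bytes go out per RTT."""
    return w * mss_bytes * 8 / rtt_s


# One 500-byte segment every 200 msec: roughly 20 kbps
print(window_rate_bps(1, 500, 0.2) / 1000, "kbps")
```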

TCP Slow Start (more)

- When connection begins, increase rate exponentially until first loss event:
  - double CongWin every RTT
  - done by incrementing CongWin for every ACK received
- Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
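Why does a per-ACK increment double the window? A minimal sketch, assuming one ACK comes back for every segment sent in an RTT and that there is no loss or threshold (the function name is hypothetical):

```python
def slow_start_windows(rtts):
    """CongWin (in segments) at the start of each RTT, no loss, no threshold."""
    congwin = 1
    windows = []
    for _ in range(rtts):
        windows.append(congwin)
        acks = congwin          # assumption: one ACK per segment sent this RTT
        for _ in range(acks):
            congwin += 1        # increment CongWin by 1 MSS for every ACK
    return windows


print(slow_start_windows(4))
```

Each RTT the sender receives as many ACKs as it sent segments, so the window grows by its own size: one, two, four, eight segments.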

So Far
- Slow-Start ramps up exponentially
- Followed by AIMD sawtooth pattern

Reality (TCP Reno)
- Introduce new variable threshold
- threshold initially very large
- Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
- Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with Congwin = 1 after a loss event.

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

- When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
- When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                          The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
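The event/action table maps naturally onto a small state machine. The sketch below models only the window bookkeeping. The class name `RenoSender`, the default initial threshold, and resetting the duplicate-ACK counter on a new ACK are assumptions made for illustration, not details given in the table.

```python
class RenoSender:
    """Toy model of the sender actions in the table (window bookkeeping only)."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.congwin = mss          # connection begins with CongWin = 1 MSS
        self.threshold = threshold  # assumption: stand-in for "very large"
        self.state = "SS"
        self.dup_acks = 0

    def _halve_threshold(self):
        # Threshold = CongWin/2, but never below 1 MSS
        self.threshold = max(self.congwin / 2, self.mss)

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0           # assumption: counter restarts on new data
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # additive increase: about 1 MSS per RTT in total
            self.congwin += self.mss * self.mss / self.congwin

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self._halve_threshold()
            self.congwin = self.threshold
            self.state = "CA"

    def on_timeout(self):
        """Timeout: fall back to slow start."""
        self._halve_threshold()
        self.congwin = self.mss
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.congwin)
```

Starting from 1000 bytes with a 4000-byte threshold, four new ACKs take the window to 5000 bytes and move the sender into congestion avoidance.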

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

- Ignore slow start
- Let W be the window size when loss occurs
- When window is W, throughput is W/RTT
- Just after loss, window drops to W/2, throughput to W/(2 RTT)
- Average throughput: 0.75 W/RTT
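The 0.75 factor follows if the window is assumed to climb linearly from W/2 back to W between losses (the idealized sawtooth), so the average is the midpoint of the two extremes. A sketch of the arithmetic:

```latex
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
```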

TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT\,\sqrt{L}}$$

- L = 2·10^-10. Wow!
- New versions of TCP for high-speed needed
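Both numbers in this example can be checked directly. The sketch below inverts the loss-rate formula for L; the function names are illustrative, and MSS is converted to bits so that it matches a rate given in bits per second.

```python
def required_window_segments(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


W = required_window_segments(10e9, 0.100, 1500)
L = required_loss_rate(10e9, 0.100, 1500)
print(int(W), L)
```

The window comes out at about 83,333 segments and the tolerable loss rate at roughly 2·10^-10, i.e. about one lost segment in five billion.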

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
- Additive increase gives slope of 1, as throughput increases
- multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
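The convergence toward the equal-share line can be seen numerically. This sketch assumes both sessions have the same RTT and detect loss at the same moment (whenever their combined rate exceeds the link capacity); the function name and these simplifications are illustrative assumptions.

```python
def two_flow_aimd(x1, x2, capacity, rounds, step=1.0):
    """Evolve two flows' throughputs under synchronized AIMD."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # loss: both flows decrease window by a factor of 2
            x1, x2 = x1 / 2, x2 / 2
        else:
            # congestion avoidance: both increase by the same amount (slope 1)
            x1, x2 = x1 + step, x2 + step
    return x1, x2


x1, x2 = two_flow_aimd(80.0, 10.0, capacity=100.0, rounds=1000)
print(x1, x2, abs(x1 - x2))
```

Additive increase leaves the gap between the two flows unchanged, while each halving cuts it in half, so an initial gap of 70 shrinks toward zero over repeated loss events.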

Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- nothing prevents app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2
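The two shares in the example follow from assuming every connection on the link gets an equal slice of R. A quick sketch (the helper name `app_share` is hypothetical):

```python
from fractions import Fraction


def app_share(existing, new):
    """Fraction of R obtained by an app that opens `new` connections
    alongside `existing` others, assuming an equal per-connection share."""
    return Fraction(new, existing + new)


print(app_share(9, 1))    # one connection among ten
print(app_share(9, 11))   # eleven connections among twenty
```

With 11 of the 20 connections the new app gets 11/20 of R, slightly more than half.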

TCP Latency Modeling

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent

   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
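The two cases can be folded into one function. This is a sketch under stated assumptions: K is taken to be the number of windows that cover the object, computed here as ceil(O / (W·S)), which this slide does not spell out; the function name is illustrative.

```python
import math


def fixed_window_latency(O, S, R, rtt, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window of W segments.

    O, S in bits; R in bits/sec; rtt in seconds.
    """
    send_window = W * S / R       # time to transmit one full window
    ack_return = rtt + S / R      # time until the first ACK comes back
    base = 2 * rtt + O / R        # connection setup + request + transmission
    if send_window >= ack_return:
        return base               # case 1: server never stalls
    K = math.ceil(O / (W * S))    # assumption: windows needed to cover the object
    stall = ack_return - send_window
    return base + (K - 1) * stall  # case 2: one stall after each window but the last


print(fixed_window_latency(O=100_000, S=1_000, R=1_000_000, rtt=0.1, W=4))
print(fixed_window_latency(O=100_000, S=1_000, R=1_000_000, rtt=0.1, W=200))
```

In the first call the small window forces the server to stall after each of the first 24 windows; in the second the window is large enough that only the 2RTT + O/R terms remain.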

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timeline with time at client and time at server; client initiates TCP connection and requests object (one RTT each); server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; complete transmission, object delivered]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection estab and request
- O/R to transmit object
- time server idles due to slow start

Server idles: P = min{K-1, Q} times
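The example's K, Q and P can be recomputed in a few lines. This is a sketch, not the slides' own procedure: Q is found by counting windows whose idle gap S/R + RTT - 2^(k-1)·S/R is still non-negative, and the values S = R = 1 and RTT = 2 are assumptions chosen only so that Q comes out as 2; the function name is illustrative.

```python
import math


def slow_start_latency(O, S, R, rtt):
    """Return (K, Q, P, latency) for slow start with no threshold and no loss."""
    # K: number of windows (1, 2, 4, ... segments) that cover the object
    K = math.ceil(math.log2(O / S + 1))
    # Q: number of idles for an infinite object, i.e. windows k = 1, 2, ...
    # for which the gap S/R + RTT - 2^(k-1) * S/R is non-negative
    Q = 0
    while S / R + rtt - (2 ** Q) * S / R >= 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# O/S = 15 segments; S/R = 1 and RTT = 2 are assumed values for illustration
print(slow_start_latency(O=15, S=1, R=1, rtt=2))
```

For these inputs the function reproduces K = 4, Q = 2 and P = 2 from the example.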

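TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: slow-start timeline at client and server, from initiating the TCP connection and requesting the object through the first (S/R), second (2S/R), third (4S/R) and fourth (8S/R) windows to complete transmission and object delivered]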

                                                                                                          TCP Latency Modeling (4)Recall K = number of windows that cover object

                                                                                                          How do we calculate K

                                                                                                          ⎥⎥⎤

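\[
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
\]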

The calculation of Q, the number of idle periods for an infinite-size object, is similar.
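As a quick check of the closed form for K, the sketch below computes K directly from its definition (the smallest k whose first k slow-start windows cover the object) and compares it with the ceiling-of-logarithm expression. The function names and the sample values of O and S are illustrative assumptions, not part of the slides.

```python
import math


def windows_to_cover(O, S):
    """Smallest k with S*(2**0 + ... + 2**(k-1)) >= O, found by direct search."""
    if O <= 0 or S <= 0:
        raise ValueError("O and S must be positive")
    k = 1
    # The first k slow-start windows carry S * (2**k - 1) bits in total.
    while S * (2 ** k - 1) < O:
        k += 1
    return k


def windows_closed_form(O, S):
    """Closed form from the derivation: ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))
```

For example, with S = 1000 bits, an object of O = 15000 bits needs 4 windows (1 + 2 + 4 + 8 = 15 segments), while O = 16000 bits needs a fifth.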

HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
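The three response-time formulas can be compared numerically with a short sketch. The function name is hypothetical, and the slow-start idle time is left as a caller-supplied parameter (defaulting to 0 and applied equally to all three schemes, which is a simplification), so the numbers are lower bounds rather than the full model.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Return (non_persistent, persistent, parallel) response times in seconds.

    O: object size in bits, R: link rate in bits/sec, RTT: round-trip time in
    seconds, M: number of images, X: number of parallel connections.
    idle: sum of idle times, supplied by the caller (assumed 0 by default).
    """
    if M % X != 0:
        raise ValueError("the formula assumes M/X is an integer")
    transmit = (M + 1) * O / R                        # time to push all M+1 objects
    non_persistent = transmit + (M + 1) * 2 * RTT + idle
    persistent = transmit + 3 * RTT + idle
    parallel = transmit + (M // X + 1) * 2 * RTT + idle
    return non_persistent, persistent, parallel
```

With O = 40000 bits (5 Kbytes, taking 1 Kbyte as 1000 bytes), R = 28000 bps, RTT = 0.1 s, M = 10 and X = 5, and ignoring idle times, this gives roughly 17.9 s, 16.0 s and 16.3 s respectively: transmission time dominates at low bandwidth.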

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time is dominated by transmission time.
Persistent connections only give a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment & slow start delays.
Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.

Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

Selective repeat

sender

data from above:
  if next available seq # in window, send pkt

timeout(n):
  resend pkt n, restart timer

ACK(n) in [sendbase, sendbase+N-1]:
  mark pkt n as received
  if n is the smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt

pkt n in [rcvbase-N, rcvbase-1]:
  ACK(n) (note: this is a re-ACK)

otherwise:
  ignore
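The receiver rules above can be sketched in a few lines. This is a minimal illustration only: the class and method names are hypothetical, and it assumes unbounded sequence numbers (no wraparound), which sidesteps the sequence-space question raised on a later slide.

```python
class SelectiveRepeatReceiver:
    """Selective-repeat receiver, assuming sequence numbers never wrap."""

    def __init__(self, window_size):
        self.N = window_size
        self.rcv_base = 0   # lowest sequence number not yet delivered
        self.buffer = {}    # out-of-order packets, keyed by sequence number

    def on_packet(self, n, data):
        """Return (ack, delivered): ack is n or None; delivered is a list of payloads."""
        if self.rcv_base <= n <= self.rcv_base + self.N - 1:
            # Inside the window: ACK it, buffer it, deliver any in-order run.
            self.buffer.setdefault(n, data)
            delivered = []
            while self.rcv_base in self.buffer:
                delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1   # advance to next not-yet-received pkt
            return n, delivered
        if self.rcv_base - self.N <= n <= self.rcv_base - 1:
            # Already delivered: re-ACK so the sender can move its window.
            return n, []
        return None, []   # otherwise: ignore
```

For instance, with a window of 4, packet 1 arriving before packet 0 is ACKed and buffered; when packet 0 arrives, both are delivered in order and the window base moves to 2.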

                                                                                                            Selective repeat in action

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

receiver sees no difference in the two scenarios!
incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
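One way to explore the question is to check, for a given sequence-number space and window size, whether the receiver's new window can contain a number the sender might still retransmit. The sketch below models the worst case in which the first full window was received but every ACK was lost; the function name is a hypothetical choice, and the model is a simplification of the two scenarios in the figure.

```python
def is_ambiguous(seq_space, window):
    """True if a retransmitted old packet could be mistaken for a new one.

    Worst case: receiver got packets 0..window-1 and advanced its window,
    but all ACKs were lost, so the sender may resend any of 0..window-1.
    """
    old_window = {i % seq_space for i in range(window)}             # sender may resend these
    new_window = {(window + i) % seq_space for i in range(window)}  # receiver now expects these
    return bool(old_window & new_window)
```

With 4 sequence numbers, a window of 3 is ambiguous while a window of 2 is not. Under this model, ambiguity appears exactly when the window exceeds half the sequence-number space.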

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) initializes sender and receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its socket door]
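The "no message boundaries" property can be seen with a small loopback experiment: two separate writes on the sending side arrive as one undifferentiated byte stream. This is a sketch using Python's standard socket module over 127.0.0.1; the function name is hypothetical, and it assumes loopback networking is available.

```python
import socket


def send_two_writes():
    """Send b"hello" and b"world" as two writes over loopback TCP; return all bytes read."""
    server = socket.create_server(("127.0.0.1", 0))   # port 0: OS picks a free port
    client = socket.create_connection(server.getsockname())
    conn, _ = server.accept()
    try:
        client.sendall(b"hello")
        client.sendall(b"world")
        client.close()              # closing signals end of stream to the reader
        received = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:           # empty read means the peer closed
                break
            received += chunk
        return received
    finally:
        conn.close()
        server.close()
```

The reader gets the ten bytes `helloworld`; how many `recv` calls it takes to collect them is not tied to how many writes the sender made, so applications that need message framing must add it themselves.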

More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The maximum amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way Handshake
  Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
  Server responds with a second special TCP segment (again no payload).
  Client responds with a third special segment. This can contain payload.
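To make the MSS definition concrete, the sketch below splits application data into payloads of at most MSS bytes; each payload plus a TCP header forms one segment. The function name, the 1460-byte MSS in the example, and the 20-byte header (a header without options) are illustrative assumptions.

```python
TCP_HEADER_BYTES = 20  # assumed: TCP header with no options


def segment_payloads(data: bytes, mss: int):
    """Split application data into payloads of at most mss bytes each."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [data[i:i + mss] for i in range(0, len(data), mss)]


def segment_sizes(data: bytes, mss: int):
    """Size of each resulting TCP segment: payload plus header."""
    return [len(p) + TCP_HEADER_BYTES for p in segment_payloads(data, mss)]
```

For 3000 bytes of application data and an MSS of 1460 bytes, this yields payloads of 1460, 1460 and 80 bytes, that is, segments of 1480, 1480 and 100 bytes.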

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP state exists only in the two end machines.
No buffers or variables are allocated to the connection in any of the network elements between the client and the server.

TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flags U A P R S F, Receive window
  checksum, Urg data pointer
  Options (variable length)
  application data (variable length)

Field notes:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)


TCP seq. #'s and ACKs

Seq. #'s:
  byte-stream "number" of the first byte in the segment's data
ACKs:
  seq # of the next byte expected from the other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: the TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
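
The numbers in the telnet scenario can be reproduced with a small bookkeeping sketch. It is only an illustration: the Segment and Endpoint classes are hypothetical names, data is assumed to arrive in order, and nothing here touches a real network.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int           # byte-stream number of the first data byte
    ack: int           # seq # of the next byte expected from the other side
    data: bytes = b""


class Endpoint:
    """Hypothetical endpoint that tracks only seq/ACK numbers."""

    def __init__(self, next_seq, next_expected):
        self.next_seq = next_seq            # seq # for the next byte we send
        self.next_expected = next_expected  # next byte we expect to receive

    def send(self, data=b""):
        seg = Segment(self.next_seq, self.next_expected, data)
        self.next_seq += len(data)          # seq #'s count bytes, not segments
        return seg

    def receive(self, seg):
        # Assumption: only in-order data is accepted in this sketch.
        if seg.seq == self.next_expected:
            self.next_expected += len(seg.data)


host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

s1 = host_a.send(b"C")   # user types 'C'
host_b.receive(s1)
s2 = host_b.send(b"C")   # B ACKs 'C' and echoes it back
host_a.receive(s2)
s3 = host_a.send()       # A ACKs the echoed 'C', no data
host_b.receive(s3)

trace = [(s.seq, s.ack, s.data) for s in (s1, s2, s3)]
```

Each ACK value is simply the receiver's next expected byte at the moment the segment is built, which is why the pure ACK in the last step carries Seq=43 without consuming a sequence number.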


                                                                                                            TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss


                                                                                                            TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) × EstimatedRTT + α × SampleRTT

Exponential weighted moving average
  influence of a past sample decreases exponentially fast
  typical value: α = 0.125
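
A minimal sketch of the moving average, assuming RTTs in milliseconds and made-up sample values. The helper sample_weight is a hypothetical name; it shows the weight α(1 - α)^k that a sample taken k updates ago carries in the current estimate.

```python
ALPHA = 0.125  # typical value from the slide


def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=ALPHA):
    """One EWMA step: EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def sample_weight(age, alpha=ALPHA):
    """Weight of the sample taken `age` updates ago in the current estimate."""
    return alpha * (1 - alpha) ** age


# Illustrative values only: start at 100 ms, then feed three samples.
estimate = 100.0
for sample in [120.0, 80.0, 200.0]:
    estimate = update_estimated_rtt(estimate, sample)

weights = [sample_weight(age) for age in range(3)]  # newest sample first
```

The weights shrink by a factor of (1 - α) per step, so one outlier such as the 200 ms sample moves the estimate only part of the way toward it.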


Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr
[Figure: SampleRTT and EstimatedRTT plotted as RTT (milliseconds) against time (seconds)]


                                                                                                            TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) × DevRTT + β × |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 × DevRTT
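
The two update rules and the timeout formula can be combined into one small estimator. This is a sketch under stated assumptions, not a description of any particular TCP stack: the class name RttEstimator is hypothetical, the initial values (EstimatedRTT = first sample, DevRTT = half of it) follow the convention in RFC 6298 rather than the slides, and DevRTT is updated against the estimate from before the current sample.

```python
class RttEstimator:
    """Tracks EstimatedRTT, DevRTT and the resulting TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed both values from the first measurement.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        # Assumption: deviation is measured against the previous estimate.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (
            (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        )

    @property
    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)  # milliseconds, illustrative value
est.update(120.0)
```

Samples measured on retransmitted segments should not be passed to update, in line with the "ignore retransmissions" rule for SampleRTT.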


                                                                                                            TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider a simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever)
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase)
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer

end of loop forever

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
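
The three events of the simplified sender can be sketched as an event-driven class. This is an illustration under assumptions, not the course's reference code: the class and attribute names are hypothetical, the timer is reduced to a boolean flag, "pass segment to IP" is modelled by appending to a list, and the timer is assumed to stop once nothing is outstanding.

```python
class SimplifiedTcpSender:
    """Simplified sender: no duplicate-ACK handling, no flow or congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # of first byte -> segment data
        self.timer_running = False   # stand-in for a real retransmission timer
        self.passed_to_ip = []       # (seq #, data) handed to the network layer

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.passed_to_ip.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        # Retransmit the not-yet-acknowledged segment with the smallest seq #.
        seq = min(self.unacked)
        self.passed_to_ip.append((seq, self.unacked[seq]))
        self.timer_running = True    # restart the timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: every segment ending at or before y is now acked.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Assumption: the timer runs only while segments are outstanding.
            self.timer_running = bool(self.unacked)


# Two segments sent (8 and 20 bytes); ACK=100 is lost, ACK=120 arrives.
sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"a" * 8)
sender.on_app_data(b"b" * 20)
sender.on_ack(120)
```

Because ACKs are cumulative, the single ACK=120 clears both segments, so a lost ACK=100 causes no retransmission as long as ACK=120 arrives before the timeout.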


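TCP retransmission scenarios

Lost ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Host A: Seq=92 timeout
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host B -> Host A: ACK=100
  Host A: SendBase = 100

Premature timeout scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100, then ACK=120
  Host A: Seq=92 timeout (before ACK=100 arrives)
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host A: SendBase = 100 (ACK=100 arrives), then SendBase = 120 (ACK=120 arrives)
  Host B -> Host A: ACK=120 (in response to the retransmitted segment)
  Host A: SendBase = 120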


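TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Host B -> Host A: ACK=120 (arrives before the timeout)
  Host A: SendBase = 120, so nothing is retransmitted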


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
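
The four rows above can be read as a decision function. The sketch below is one hypothetical encoding: the names AckAction and receiver_action are invented for illustration, the receiver is assumed to buffer out-of-order data (so a gap can exist), and segments that start below the expected seq # are outside the table and are not modelled.

```python
from enum import Enum


class AckAction(Enum):
    DELAYED_ACK = "wait up to 500 ms for next segment, then send ACK"
    CUMULATIVE_ACK_NOW = "immediately send single cumulative ACK"
    DUPLICATE_ACK_NOW = "immediately send duplicate ACK for next expected byte"
    ACK_NOW = "immediately send ACK"


def receiver_action(seg_seq, expected_seq, ack_pending, gap_buffered):
    """Pick the receiver action for an arriving segment.

    seg_seq      -- seq # of the first byte of the arriving segment
    expected_seq -- seq # of the next in-order byte the receiver expects
    ack_pending  -- True if an earlier in-order segment still awaits its ACK
    gap_buffered -- True if out-of-order data beyond a gap is already buffered
    """
    if seg_seq > expected_seq:
        # Out-of-order arrival: a gap is detected (or still open).
        return AckAction.DUPLICATE_ACK_NOW
    if seg_seq < expected_seq:
        raise ValueError("segment below expected seq # is not covered by the table")
    if gap_buffered:
        # Segment starts at the lower end of the gap.
        return AckAction.ACK_NOW
    if ack_pending:
        return AckAction.CUMULATIVE_ACK_NOW
    return AckAction.DELAYED_ACK
```

For example, receiver_action(100, 92, False, False) yields the duplicate-ACK action, which is the signal that fast retransmit relies on at the sender.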


                                                                                                            More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
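
A short sketch of the doubling rule, assuming milliseconds and illustrative RTT values. BackoffTimer and its method names are hypothetical; the reset uses the TimeoutInterval formula from the earlier slide.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self._computed_interval()

    def _computed_interval(self):
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        # After the retransmission, the interval is doubled.
        self.interval *= 2

    def on_new_data_or_ack(self):
        # Timer restarted for another reason: go back to the computed value.
        self.interval = self._computed_interval()


timer = BackoffTimer(estimated_rtt=100.0, dev_rtt=10.0)
intervals = [timer.interval]
for _ in range(3):                 # three consecutive timeouts
    timer.on_timeout()
    intervals.append(timer.interval)
timer.on_new_data_or_ack()         # an ACK arrives, interval is reset
```

Backing off this way spaces out retransmissions while the network is likely congested, which is why the slide calls it a limited form of congestion control.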


                                                                                                            Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires


                                                                                                            Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase)
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  else   (a duplicate ACK for already ACKed segment)
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3)
      resend segment with sequence number y   (fast retransmit)
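
The ACK handler with duplicate counting can be sketched as follows. The class FastRetransmitSender is a hypothetical name, timers are left out, and the resend is modelled by recording the sequence number in a list rather than transmitting anything.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with duplicate-ACK counting (timers omitted)."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()   # ACK value y -> duplicate ACKs seen for y
        self.retransmitted = []     # seq #'s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data is acknowledged.
            self.send_base = y
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3:
                # Fast retransmit: resend the segment with sequence number y.
                self.retransmitted.append(y)


sender = FastRetransmitSender(send_base=92)
sender.on_ack(100)       # first ACK for 100 advances SendBase
for _ in range(4):       # four duplicate ACKs for 100 follow
    sender.on_ack(100)
```

The comparison is an equality test, so the segment is resent once on the third duplicate and not again on the fourth.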


TCP: GBN or Selective Repeat?

                                                                                                            Basic TCP looks a lot like GBN

                                                                                                            Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                            This looks a lot like Selective Repeat

                                                                                                            TCP is a hybrid


                                                                                                            TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose is to handle congestion in the network
  (But both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

                                                                                                            3 Transport Layer 81Comp 361 Spring 2005

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, rows from top to bottom]
source port #, dest port #
sequence number
acknowledgement number
head len, not used, flag bits U A P R S F, Receive window
checksum, Urg data pointer
Options (variable length)
application data (variable length)

Field notes:
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence and acknowledgement numbers: counting by bytes of data (not segments)
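The layout above can be decoded in a few lines. This is a minimal sketch, assuming the standard 20-byte fixed header with fields in the order listed; the function name `parse_tcp_header` and the hand-built `demo` segment are illustrative, not taken from the slides.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are skipped)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4   # head len is counted in 32-bit words
    flag_bits = offset_flags & 0x3F         # low six bits hold U A P R S F
    names = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # bit 0 up to bit 5
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": {n for i, n in enumerate(names) if flag_bits & (1 << i)},
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],
    }

# Hypothetical SYNACK segment built by hand for the demonstration:
# head len = 5 words, flags = SYN | ACK (0x12), no options, no data.
demo = struct.pack("!HHIIHHHH", 80, 5000, 1000, 43, (5 << 12) | 0x12, 4096, 0, 0)
print(parse_tcp_header(demo))
```

The checksum is left at zero in the demo segment; verifying it would also need the IP pseudo-header, which is outside this sketch.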


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)
spare room in buffer
= RcvWindow
= RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
guarantees receive buffer doesn't overflow
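The two rules above can be sketched as follows. The function names and the sample byte counts are assumptions chosen for illustration; only the formulas come from the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised by the receiver."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, window):
    """Bytes the sender may still put in flight without exceeding RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, window - unacked)

# Receiver: 4096-byte buffer, 3000 bytes received, application has read 1000.
w = rcv_window(4096, 3000, 1000)
# Sender: 1000 bytes already in flight.
print(w, sender_may_send(5000, 4000, w))
```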


Technical Issue

Suppose RcvWindow=0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow>0
Receiver never sends RcvWindow>0, since it has no new ACKs to send to the sender
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow>0 will be transmitted back to the sender.
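A toy model of the deadlock and of the one-byte probe, under strong simplifying assumptions: time advances in discrete ticks, the application drains the whole buffer at one tick, and every probe is answered by an ACK carrying the current window. The function name and parameters are hypothetical.

```python
def tick_when_sender_resumes(buffer_size, app_reads_at, probing, max_ticks=10):
    """Return the tick at which the sender learns RcvWindow > 0, or None."""
    buffered = buffer_size              # buffer starts full, so RcvWindow = 0
    for tick in range(max_ticks):
        if tick == app_reads_at:
            buffered = 0                # application empties the buffer
        if not probing:
            continue                    # nothing arrives, so receiver sends no ACK
        window = buffer_size - buffered # carried back in the ACK of the probe
        if window > 0:
            return tick
    return None

print(tick_when_sender_resumes(1000, 3, probing=True))
print(tick_when_sender_resumes(1000, 3, probing=False))
```

Without probing the sender never hears about the freed space, which is the deadlock described above.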


Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket("hostname", "port number");
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake
Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals that client is synchronized

[Figure: three-way handshake]
client to server: Connection request (SYN=1, seq=client_isn)
server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
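The numbering in the three segments can be checked with a short sketch. The `Segment` class and `three_way_handshake` function are illustrative names, and the ISN values are arbitrary; real stacks choose ISNs differently.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None   # None means the ACK field is not in use

def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments in the order they are sent."""
    syn = Segment(syn=1, seq=client_isn)                          # Step 1
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)      # Step 2
    ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)  # Step 3
    return [syn, synack, ack]

for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```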


TCP Connection Management (cont)

Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: teardown timeline. Client sends FIN (close); server replies with ACK, then sends FIN (close); client replies with ACK and enters timed wait; closed.]


TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: teardown timeline. Client sends FIN (closing); server replies with ACK, then sends FIN (closing); client replies with ACK and enters timed wait; server closed on receiving the ACK; client closed after timed wait.]
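Steps 1 to 4 can be traced with a small sketch. The state labels are the ones used in the figure ("closing", "timed wait", "closed"), not the full TCP state names, and the function name `teardown` is an assumption. Lost segments and the timed-wait timer are not modelled.

```python
def teardown():
    """Trace a client-initiated close as (sender, segment, client, server)."""
    client, server = "established", "established"
    trace = []
    # Step 1: client closes its socket and sends FIN.
    client = "closing"
    trace.append(("client", "FIN", client, server))
    # Step 2: server ACKs the FIN, closes its side, and sends its own FIN.
    trace.append(("server", "ACK", client, server))
    server = "closing"
    trace.append(("server", "FIN", client, server))
    # Step 3: client ACKs the server's FIN and enters timed wait.
    client = "timed wait"
    trace.append(("client", "ACK", client, server))
    # Step 4: server receives the ACK; client closes after timed wait expires.
    server = "closed"
    client = "closed"
    return trace, client, server

trace, client, server = teardown()
for step in trace:
    print(step)
```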


TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                            A few special cases

                                                                                                            Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                            It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load, original data plus retransmissions.

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
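A rough sketch of how the ER field and the CI bit end up in the returned RM cell. The function name, the rate units, and the idea of giving each switch a single "supportable rate" are simplifying assumptions, not part of the ATM specification.

```python
def return_rm_cell(initial_er, switch_rates, preceding_efci):
    """Return (ER, CI) as seen by the source when the RM cell comes back."""
    er = initial_er
    for supportable in switch_rates:
        er = min(er, supportable)        # a switch may lower ER, never raise it
    ci = 1 if preceding_efci == 1 else 0 # destination sets CI if EFCI was 1
    return er, ci

# Sender asks for 100 (arbitrary units); three switches on the path.
print(return_rm_cell(100, [80, 50, 90], preceding_efci=0))
```

The returned ER is the minimum over the path, which matches "minimum supportable rate on path" above.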


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: window of segments spanning CongWin]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
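A worked instance of the throughput formula. The values w = 10, MSS = 500 bytes and RTT = 200 msec are chosen only for illustration, and the function name is an assumption.

```python
def tcp_throughput(w, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w * mss_bytes / rtt_seconds

# 10 segments of 500 bytes every 0.2 s
print(tcp_throughput(10, 500, 0.2), "bytes/sec")
```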


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using
LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

[Figure: congestion window (axis ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
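The sawtooth can be reproduced with a few lines. The loss model here is an assumption made for the sketch: a loss event is declared whenever the window reaches a fixed level, which is not how losses arise in a real network. Window sizes are in Kbytes with MSS taken as 1 Kbyte.

```python
def aimd(start, loss_at, rtts, mss=1):
    """Return the congestion window at the start of each RTT."""
    window, trace = start, []
    for _ in range(rtts):
        trace.append(window)
        if window >= loss_at:
            window = window / 2      # multiplicative decrease after loss event
        else:
            window = window + mss    # additive increase: 1 MSS per RTT
    return trace

print(aimd(start=22, loss_at=24, rtts=5))
```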


TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. A sends one segment in the first RTT, two segments in the second, four segments in the third.]
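The doubling follows directly from the per-ACK increment, as this sketch shows. It assumes one ACK per segment and no losses; the function name is illustrative. The last lines redo the initial-rate arithmetic for MSS = 500 bytes and RTT = 200 msec.

```python
def slow_start(rtts, mss=1):
    """Return CongWin (in MSS units) at the start of each RTT."""
    cong_win, per_rtt = mss, []
    for _ in range(rtts):
        per_rtt.append(cong_win)
        acks = cong_win // mss       # one ACK per segment sent in this RTT
        for _ in range(acks):
            cong_win += mss          # increment CongWin by 1 MSS per ACK
    return per_rtt

print(slow_start(4))

# Initial rate with CongWin = 1 MSS: 500 bytes * 8 bits / 0.2 s, in bits/sec
initial_rate_bps = 500 * 8 / 0.2
print(initial_rate_bps)
```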


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
    Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, the sender is in the slow-start phase and the window grows exponentially.

When CongWin is above Threshold, the sender is in the congestion-avoidance phase and the window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS. (TCP Tahoe does this for 3 dup ACKs as well.)


                                                                                                            The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
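The event/action table above can be read as a small state machine. The following is a minimal Python sketch of it, not a real TCP implementation: the class name `RenoSender`, the choice to measure CongWin in units of one MSS, and the `tahoe` flag (which makes a triple duplicate ACK behave like a timeout, as the slides describe for TCP Tahoe) are all assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: CongWin and Threshold are measured in units of one MSS


@dataclass
class RenoSender:
    """Hypothetical sender that follows the slide's event/action table."""
    cong_win: float = MSS
    threshold: float = float("inf")  # "initially very large"
    state: str = "SS"                # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0
    tahoe: bool = False              # True: treat 3 dup ACKs like a timeout

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS                         # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)  # about +1 MSS every RTT

    def on_dup_ack(self) -> None:
        """Duplicate ACK: only count it; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self._on_triple_dup_ack()

    def _on_triple_dup_ack(self) -> None:
        if self.tahoe:
            self.on_timeout()
            return
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)  # never drops below 1 MSS
        self.state = "CA"                         # fast recovery: skip slow start
        self.dup_acks = 0

    def on_timeout(self) -> None:
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Feeding the object a sequence of `on_new_ack()`, `on_dup_ack()`, and `on_timeout()` calls traces the sawtooth: exponential growth up to Threshold, linear growth afterwards, a halving on three duplicate ACKs, and a reset to 1 MSS on timeout.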


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after loss, the window drops to W/2 and throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
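The 0.75 factor follows from one extra assumption that the slide leaves implicit: between two loss events the window grows linearly from W/2 back to W, so the time-average throughput is the midpoint of the two extremes.

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
\]
```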


                                                                                                            TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires a window size of W = 83,333 in-flight segments.
Throughput in terms of the loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP are needed for high-speed links.
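The numbers in this example can be checked with a few lines of Python. This is a sketch under the assumption that the slide uses the standard relation throughput = 1.22 · MSS / (RTT · sqrt(L)); the function names are made up for illustration.

```python
def window_for_throughput(throughput_bps: float, rtt_s: float, mss_bits: float) -> float:
    """Segments that must be in flight: W = throughput * RTT / MSS."""
    return throughput_bps * rtt_s / mss_bits


def loss_rate_for_throughput(throughput_bps: float, rtt_s: float, mss_bits: float) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


mss_bits = 1500 * 8  # 1500-byte segments
W = window_for_throughput(10e9, 0.100, mss_bits)
L = loss_rate_for_throughput(10e9, 0.100, mss_bits)
print(f"W = {W:.0f} segments, L = {L:.2e}")
```

The result is a window of roughly 83,333 segments and a tolerable loss rate on the order of 2·10^-10, i.e. about one lost segment in five billion.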


TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?
Two competing sessions:

Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal-share line.]
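The convergence argument can be tried numerically. The sketch below is a deliberately simplified model, not a TCP simulation: it assumes both sessions see the same RTT, both increase by one unit per round, and both detect loss in the same round whenever their combined rate exceeds the link capacity. The function name and the starting rates are arbitrary choices.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int) -> tuple[float, float]:
    """Run synchronized AIMD for two flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves both rates (and their gap).
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase keeps the gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


x1, x2 = aimd_two_flows(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
gap = abs(x1 - x2)
print(f"flow 1 = {x1:.2f}, flow 2 = {x2:.2f}, gap = {gap:.4f}")
```

Because additive increase preserves the difference between the two rates and every loss event halves it, the gap shrinks geometrically and the operating point drifts toward the equal-share line.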


Fairness (more)

Fairness and UDP

Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.

Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss.

Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections

Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: a link of rate R already supporting 9 connections.

• new app asks for 1 TCP connection, gets rate R/10
• new app asks for 11 TCP connections, gets about R/2 (11 of the 20 equal shares)
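The arithmetic behind this example is a per-connection split of the link. A short sketch, assuming every TCP connection ends up with an equal share of R (the helper name `app_share` is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections: int, new_connections: int) -> Fraction:
    """Fraction of the link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

With 11 parallel connections the app holds 11 of the 20 equal shares, slightly more than the R/2 quoted on the slide.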


TCP Latency Modeling

Notation, assumptions

Assume one link between client and server of rate R.
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption).

Window size
First assume a fixed congestion window of W segments.
Then a dynamic window, modeling slow start.

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent.
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent.
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server must wait for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
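Both fixed-window cases fit in one small function. This is a sketch rather than a definitive model: the slide does not define K at this point, so the code assumes K = ceil(O / (W·S)), the number of windows needed to cover the object. The function name and the sample numbers are illustrative only.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    base = 2 * RTT + O / R
    if W * S / R >= RTT + S / R:
        # Case 1: the first ACK returns before the window is exhausted,
        # so the server never stalls.
        return base
    # Case 2: the server stalls after every window except the last one.
    K = math.ceil(O / (W * S))  # assumption: number of windows covering the object
    stall = S / R + RTT - W * S / R
    return base + (K - 1) * stall


# Illustrative numbers: 15 segments of 8000 bits over a 1 Mbps link, RTT = 20 ms.
print(fixed_window_latency(O=120_000, S=8_000, R=1e6, RTT=0.02, W=4))
print(fixed_window_latency(O=120_000, S=8_000, R=1e6, RTT=0.02, W=2))
```

With W = 4 the pipe stays full and only the 2RTT + O/R terms remain; with W = 2 the server idles after each of the first K-1 windows.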


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. The client initiates the TCP connection and requests the object (one RTT each); the server sends a first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, then transmission completes and the object is delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit the object
• time the server idles due to slow start

Server idles P = min{K-1, Q} times.


TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: the same client/server slow-start timeline as on the previous slide, with windows of S/R, 2S/R, 4S/R, and 8S/R.]


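TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

The calculation of Q, the number of idles for an infinite-size object, is similar.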
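The slow-start latency model can be put together in a short Python sketch. K is found by searching for the smallest k with 2^k - 1 >= O/S, which matches the ceiling-of-log definition above while avoiding floating-point logarithms. The slides do not spell out Q, so the code assumes it is the number of windows k for which the idle time S/R + RTT - 2^(k-1)·S/R is still non-negative. All function names and sample values are illustrative.

```python
import math


def windows_covering_object(O: float, S: float) -> int:
    """K: smallest k such that 2^0 + 2^1 + ... + 2^(k-1) = 2^k - 1 >= O/S."""
    segments = math.ceil(O / S)
    k = 1
    while 2 ** k - 1 < segments:
        k += 1
    return k


def idles_for_infinite_object(S: float, R: float, RTT: float) -> int:
    """Q (assumed definition): windows after which the server would still idle."""
    q = 0
    # Window q+1 takes 2^q * S/R to transmit; the server idles while that is
    # no longer than S/R + RTT.
    while S / R + RTT - (2 ** q) * S / R >= 0:
        q += 1
    return q


def slow_start_latency(O: float, S: float, R: float, RTT: float) -> float:
    """Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R, with P = min(Q, K-1)."""
    K = windows_covering_object(O, S)
    Q = idles_for_infinite_object(S, R, RTT)
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Illustrative numbers chosen to reproduce the O/S = 15, K = 4, Q = 2 example.
S, R, RTT = 8_000, 1e6, 0.02
O = 15 * S
print(windows_covering_object(O, S), idles_for_infinite_object(S, R, RTT))
print(slow_start_latency(O, S, R, RTT))
```

With these values the server idles twice (P = 2), so two stall terms are added to the 2RTT + O/R baseline.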


HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive the base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for the base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
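The three response-time formulas can be compared side by side. The sketch below leaves out the "sum of idle times" term, so it gives a lower bound for each scheme rather than the full model; the function name is hypothetical, and 5 Kbytes is taken as 40,000 bits (1 Kbyte = 1000 bytes), which is an assumption.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int) -> dict[str, float]:
    """Response times (seconds) without the slow-start idle-time term."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push the base page plus M images
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, over a 1 Mbps link.
times = http_response_times(O=40_000, R=1e6, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

At low link rates the shared (M+1)O/R term dominates all three results; as R grows, the RTT terms take over and the gap between the schemes becomes visible.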


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections give only a minor improvement over parallel connections.


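HTTP Response time (in seconds)

[Figure: bar chart of response time, vertical axis 0 to 70 seconds, at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent connections. Parameters: RTT = 1 sec, O = 5 Kbytes, M = 10, and X = 5.]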

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.

Chapter 3 Summary

Principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

Instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                              Selective repeat in action

Selective repeat: dilemma

Example:
  seq #'s: 0, 1, 2, 3
  window size = 3

The receiver sees no difference between the two scenarios.
It incorrectly passes duplicate data up as new in scenario (a).

Q: What is the relationship between seq # size and window size?
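The question above can be explored with a small check. The sketch below is not from the slides: the function name `old_and_new_windows_overlap` is made up for illustration, and it models only the worst case in which the receiver has accepted a full window but every ACK was lost. It suggests the usual textbook answer, that the window must be no larger than half the sequence number space.

```python
def old_and_new_windows_overlap(seq_space: int, window: int) -> bool:
    """Return True if a retransmission could be mistaken for new data.

    Worst case assumed here: the receiver accepted a full window starting
    at sequence number 0 and advanced, but all of its ACKs were lost.
    """
    # The sender never saw an ACK, so it may resend numbers 0 .. window-1.
    sender_may_resend = {n % seq_space for n in range(window)}
    # The receiver has moved on and now accepts the next `window` numbers,
    # wrapping around modulo the size of the sequence number space.
    receiver_accepts = {n % seq_space for n in range(window, 2 * window)}
    return bool(sender_may_resend & receiver_accepts)


if __name__ == "__main__":
    # The slide's example: seq #'s 0..3 (space of 4) with window size 3.
    print(old_and_new_windows_overlap(seq_space=4, window=3))
    # Shrinking the window to half the sequence space removes the ambiguity.
    print(old_and_new_windows_overlap(seq_space=4, window=2))
```

In this model the two sets intersect exactly when 2 * window exceeds the sequence space, which is why the slide's example (window 3, four sequence numbers) fails.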

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point:
  one sender, one receiver
reliable, in-order byte stream:
  no "message boundaries"
pipelined:
  TCP congestion and flow control set window size
send & receive buffers
full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size
connection-oriented:
  handshaking (exchange of control msgs) initializes sender and receiver state before data exchange
flow controlled:
  sender will not overwhelm receiver

[Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads data through its socket door.]

More TCP Details

Maximum Segment Size (MSS)
  Depends upon the implementation (can often be set)
  The maximum amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment
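To make the MSS arithmetic concrete, here is a minimal sketch that cuts a byte stream into segments of at most MSS bytes. The helper name `segment_stream` is hypothetical, and the values used are assumptions rather than slide content: an MSS of 1460 bytes (a common value on Ethernet: 1500-byte MTU minus 20 bytes of IP header and 20 bytes of TCP header) and a 20-byte TCP header with no options. Real sequence numbers also wrap modulo 2^32, which is ignored here.

```python
TCP_HEADER_BYTES = 20  # assumed: minimum TCP header, no options


def segment_stream(data: bytes, mss: int, first_seq: int = 0):
    """Split a byte stream into (sequence number, payload) pairs.

    Each payload holds at most `mss` bytes of application data, and the
    sequence number is the stream offset of the payload's first byte.
    """
    return [(first_seq + i, data[i:i + mss]) for i in range(0, len(data), mss)]


# 3000 bytes of made-up application data, cut with an assumed MSS of 1460.
segments = segment_stream(b"x" * 3000, mss=1460)

if __name__ == "__main__":
    for seq, payload in segments:
        # Application Data + TCP Header = TCP Segment
        print(seq, len(payload), len(payload) + TCP_HEADER_BYTES)
```

Only the last segment is shorter than the MSS; the MSS limits the application data per segment, not the segment size including the header.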

Three-way handshake
  Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
  Server responds with a second special TCP segment (again no payload).
  Client responds with a third special segment. This one can contain payload.
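The three segments can be sketched as plain data. The sequence-number arithmetic below is standard TCP behavior rather than something stated on this slide: the SYN consumes one sequence number, so each side acknowledges the other's initial sequence number plus one. The `Segment` class, the function name, and the initial sequence numbers are made up for illustration.

```python
from dataclasses import dataclass

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    syn: bool
    ack_flag: bool
    seq: int
    ack: int
    payload: bytes = b""


def three_way_handshake(client_isn: int, server_isn: int, first_data: bytes = b""):
    """Return the three handshake segments in the order they are sent."""
    # 1. Client -> server: connection request, no payload.
    syn = Segment(syn=True, ack_flag=False, seq=client_isn, ack=0)
    # 2. Server -> client: grants the connection, again no payload.
    synack = Segment(syn=True, ack_flag=True, seq=server_isn,
                     ack=(syn.seq + 1) % SEQ_MODULUS)
    # 3. Client -> server: acknowledges the server; this one may carry data.
    ack = Segment(syn=False, ack_flag=True, seq=(client_isn + 1) % SEQ_MODULUS,
                  ack=(synack.seq + 1) % SEQ_MODULUS, payload=first_data)
    return [syn, synack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=100, server_isn=300, first_data=b"GET"):
        print(segment)
```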

Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP exists only in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the host and server

TCP segment structure

Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
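The fixed part of the header is 20 bytes, which the following sketch packs and parses with Python's standard `struct` module. The function names are hypothetical, the port and window values are made up, and the checksum is simply left at zero rather than computed, so this illustrates the layout only and does not produce a valid segment.

```python
import struct

# Flag bits, lowest bit first: FIN, SYN, RST, PSH, ACK, URG.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

# ports (2 x 16 bits), seq and ack (2 x 32 bits), header-length byte,
# flags byte, receive window, checksum, urgent pointer (3 x 16 bits).
HEADER = struct.Struct("!HHIIBBHHH")  # "!" selects network byte order


def pack_header(src_port, dst_port, seq, ack, flags, window, checksum=0, urg_ptr=0):
    """Build a 20-byte TCP header with no options (checksum not computed)."""
    head_len_words = HEADER.size // 4   # header length is counted in 32-bit words
    offset_byte = head_len_words << 4   # it occupies the upper 4 bits of this byte
    return HEADER.pack(src_port, dst_port, seq, ack,
                       offset_byte, flags, window, checksum, urg_ptr)


def parse_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header into named fields."""
    (src, dst, seq, ack, offset_byte, flags,
     window, checksum, urg_ptr) = HEADER.unpack(raw[:HEADER.size])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_bytes": (offset_byte >> 4) * 4,
        "flags": flags & 0x3F,          # keep only the six flag bits shown above
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
    }


if __name__ == "__main__":
    raw = pack_header(12345, 80, seq=42, ack=79, flags=ACK | PSH, window=65535)
    print(len(raw), parse_header(raw))
```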

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data
ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: How does the receiver handle out-of-order segments?
A: The TCP spec doesn't say; it is up to the implementer.

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
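The numbers in the telnet scenario follow mechanically from two counters per host. The sketch below uses a hypothetical `Endpoint` class, not any real TCP API, and handles only in-order data; as noted above, what to do with out-of-order segments is left to the implementer, so this sketch simply ignores them.

```python
class Endpoint:
    """Tracks the two counters that produce the Seq and ACK fields."""

    def __init__(self, next_seq: int, next_expected: int):
        self.next_seq = next_seq            # number of the next byte we will send
        self.next_expected = next_expected  # next byte we expect from the peer

    def send(self, data: bytes = b""):
        """Return (Seq, ACK, data) and advance our sequence number."""
        segment = (self.next_seq, self.next_expected, data)
        self.next_seq += len(data)
        return segment

    def receive(self, segment) -> None:
        """Advance the cumulative ACK if the segment is the one expected."""
        seq, _ack, data = segment
        if seq == self.next_expected:
            self.next_expected += len(data)


host_a = Endpoint(next_seq=42, next_expected=79)
host_b = Endpoint(next_seq=79, next_expected=42)

typed = host_a.send(b"C")    # user types 'C'
host_b.receive(typed)
echo = host_b.send(b"C")     # B ACKs receipt of 'C' and echoes it back
host_a.receive(echo)
final_ack = host_a.send()    # A ACKs the echoed 'C', carrying no data

if __name__ == "__main__":
    for name, segment in [("A->B", typed), ("B->A", echo), ("A->B", final_ack)]:
        print(name, "Seq=%d ACK=%d data=%r" % segment)
```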

TCP Round Trip Time and Timeout

Q: How to set the TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

Q: How to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; we want the estimated RTT to be "smoother"
    average several recent measurements, not just the current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
  influence of a past sample decreases exponentially fast
  typical value: α = 0.125
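The update rule above is short enough to run by hand. In the sketch below the SampleRTT values are made up, and seeding the estimate with the first sample is an assumption of this example rather than something the slide specifies.

```python
ALPHA = 0.125  # typical value from the slide


def update_estimated_rtt(estimated_rtt: float, sample_rtt: float,
                         alpha: float = ALPHA) -> float:
    """One step of EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def smooth(samples, alpha: float = ALPHA):
    """Return the running EstimatedRTT after each sample.

    Assumption: the first sample is used as the initial estimate.
    """
    estimate = samples[0]
    history = [estimate]
    for sample in samples[1:]:
        estimate = update_estimated_rtt(estimate, sample, alpha)
        history.append(estimate)
    return history


# Made-up SampleRTT values in milliseconds, with one spike.
history = smooth([100, 120, 300, 110])

if __name__ == "__main__":
    print(history)
```

The 300 ms spike moves the estimate by only one eighth of the gap, which is the smoothing effect the slide describes.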

Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

[Figure: SampleRTT and EstimatedRTT plotted against time (seconds, 1 to 106); vertical axis is RTT (milliseconds, 100 to 350).]

TCP Round Trip Time and Timeout

Setting the timeout
- EstimatedRTT plus "safety margin"
  - large variation in EstimatedRTT -> larger safety margin
- first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 * DevRTT
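The two estimators and the timeout formula can be combined in one small object. This is a sketch under stated assumptions: the class name `RttEstimator`, the initial values, and the order of the two updates are illustrative choices that the slides do not specify.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval from them."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed EstimatedRTT with the first sample and DevRTT with
        # half of it; the slides do not say how either value is initialised.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        # Assumption: DevRTT uses the EstimatedRTT from before this sample.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt

    def timeout_interval(self):
        # EstimatedRTT plus a safety margin of four deviations.
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)  # hypothetical sample, in milliseconds
estimator.update(120.0)
print(estimator.estimated_rtt, estimator.dev_rtt, estimator.timeout_interval())
```

A run of steady samples shrinks DevRTT and pulls the timeout toward EstimatedRTT; a noisy run widens the margin.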


TCP reliable data transfer

- TCP creates rdt service on top of IP's unreliable service
- Pipelined segments
- Cumulative acks
- TCP uses single retransmission timer
- Retransmissions are triggered by:
  - timeout events
  - duplicate acks
- Initially consider simplified TCP sender:
  - ignore duplicate acks
  - ignore flow control, congestion control

TCP sender events

data rcvd from app:
- Create segment with seq #
- seq # is byte-stream number of first data byte in segment
- start timer if not already running (think of timer as for oldest unacked segment)
- expiration interval: TimeOutInterval

timeout:
- retransmit segment that caused timeout
- restart timer

Ack rcvd:
- If acknowledges previously unacked segments
  - update what is known to be acked
  - start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
- SendBase-1: last cumulatively ack'ed byte

Example:
- SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
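The three events of the simplified sender can be mirrored in a few lines of Python. This is only a sketch: the class name, the `pass_to_ip` callback, and the boolean standing in for the retransmission timer are illustrative assumptions, and no real network or clock is involved.

```python
class SimplifiedTcpSender:
    """Simplified sender: no duplicate-ACK handling, no flow or congestion control."""

    def __init__(self, initial_seq_num, pass_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}             # seq # -> data, for segments not yet acknowledged
        self.timer_running = False    # stands in for the single retransmission timer
        self.pass_to_ip = pass_to_ip  # hypothetical callback: pass_to_ip(seq, data)

    def on_data_from_app(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.pass_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)       # not-yet-acknowledged segment with smallest seq #
        self.pass_to_ip(seq, self.unacked[seq])
        self.timer_running = True     # restart the timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: every segment that ends at or before y is acknowledged.
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            # Keep the timer running only while segments are still outstanding.
            self.timer_running = bool(self.unacked)


sent = []
sender = SimplifiedTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
sender.on_data_from_app(b"a" * 8)    # Seq=92, 8 bytes of data
sender.on_data_from_app(b"b" * 20)   # Seq=100, 20 bytes of data
sender.on_timeout()                  # retransmits only the oldest segment, Seq=92
sender.on_ack(120)                   # one cumulative ACK covers both segments
print(sent, sender.send_base)
```

The sequence numbers and segment sizes in the usage lines are chosen to match the Seq=92 and Seq=100 segments used in the retransmission scenarios.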


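TCP retransmission scenarios

[Figure: two timelines between Host A and Host B.]

Lost ACK scenario:
- A sends Seq=92, 8 bytes data
- B replies ACK=100, which is lost (X)
- A's timer for Seq=92 times out; A retransmits Seq=92, 8 bytes data
- B replies ACK=100; SendBase = 100

Premature timeout:
- A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
- B replies ACK=100 and ACK=120
- A's timer for Seq=92 times out before the ACKs arrive; A retransmits Seq=92, 8 bytes data
- SendBase = 100 once ACK=100 arrives, then SendBase = 120
- B replies ACK=120 to the retransmission; SendBase stays at 120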

TCP retransmission scenarios (more)

[Figure: timeline between Host A and Host B.]

Cumulative ACK scenario:
- A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
- B replies ACK=100, which is lost (X), and ACK=120
- ACK=120 arrives before the timeout, so A retransmits nothing; SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver -> TCP Receiver action

1. Event: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   Action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Event: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
   Action: Immediately send single cumulative ACK, ACKing both in-order segments.

3. Event: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   Action: Immediately send duplicate ACK, indicating seq # of next expected byte.

4. Event: Arrival of segment that partially or completely fills gap.
   Action: Immediately send ACK, provided that segment starts at lower end of gap.
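The four rows of the table can be read as one decision function. The sketch below is illustrative only: the enum, the function name and its parameters are assumptions, and the final branch (a segment that arrives below the expected seq #) is not covered by the table at all.

```python
from enum import Enum


class AckAction(Enum):
    DELAYED_ACK = "wait up to 500 ms for next segment, then send ACK"
    CUMULATIVE_ACK_NOW = "immediately send single cumulative ACK"
    DUPLICATE_ACK_NOW = "immediately send duplicate ACK for next expected byte"
    ACK_NOW = "immediately send ACK"


def choose_ack_action(seg_seq, expected_seq, ack_pending, gap_buffered):
    """Pick the receiver action for an arriving segment.

    seg_seq:      seq # of the arriving segment
    expected_seq: seq # of the next in-order byte the receiver expects
    ack_pending:  an earlier in-order segment is still waiting for its delayed ACK
    gap_buffered: out-of-order data is already buffered above expected_seq
    """
    if seg_seq > expected_seq:
        # Row 3: higher-than-expected seq #, gap detected.
        return AckAction.DUPLICATE_ACK_NOW
    if seg_seq == expected_seq:
        if gap_buffered:
            # Row 4: segment starts at the lower end of the gap.
            return AckAction.ACK_NOW
        if ack_pending:
            # Row 2: ACK both in-order segments at once.
            return AckAction.CUMULATIVE_ACK_NOW
        # Row 1: nothing pending, so delay the ACK.
        return AckAction.DELAYED_ACK
    # Assumption (not in the table): data below expected_seq was already
    # received, so re-send an ACK for the next expected byte.
    return AckAction.DUPLICATE_ACK_NOW


print(choose_ack_action(seg_seq=100, expected_seq=100, ack_pending=False, gap_buffered=False))
```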

More on Sender Policies

Doubling the Timeout Interval
- Used by most TCP implementations
- If timeout occurs, then after retransmission, Timeout Interval is doubled
- Intervals grow exponentially with each consecutive timeout
- When Timer is restarted because of (i) new data from above or (ii) ACK received, Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
- Limited form of Congestion Control
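The doubling rule is easy to check numerically. A minimal sketch, assuming a hypothetical `BackoffTimer` object that only tracks the interval value (no real clock), with the computed interval supplied by the caller:

```python
class BackoffTimer:
    """Tracks the Timeout Interval under the doubling policy."""

    def __init__(self, computed_interval):
        # computed_interval: the value derived from EstimatedRTT and DevRTT.
        self.current_interval = computed_interval

    def on_timeout(self):
        # After the retransmission, the interval is doubled.
        self.current_interval *= 2

    def on_new_data_or_ack(self, computed_interval):
        # Timer restarted for new data or an ACK: drop back to the computed value.
        self.current_interval = computed_interval


timer = BackoffTimer(computed_interval=1.0)  # hypothetical 1-second interval
intervals = [timer.current_interval]
for _ in range(3):                           # three consecutive timeouts
    timer.on_timeout()
    intervals.append(timer.current_interval)
print(intervals)

timer.on_new_data_or_ack(computed_interval=1.5)
print(timer.current_interval)
```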

Fast Retransmit

- Time-out period often relatively long:
  - long delay before resending lost packet
- Detect lost segments via duplicate ACKs
  - Sender often sends many segments back-to-back
  - If segment is lost, there will likely be many duplicate ACKs
- If sender receives 3 duplicate ACKs for the same data, it supposes that segment after ACKed data was lost:
  - fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
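The ACK handler with duplicate counting can be sketched as follows. The class name and the `resend_segment` callback are hypothetical, and timer handling is left out to keep the focus on the duplicate-ACK counter.

```python
from collections import defaultdict


class FastRetransmitSender:
    """ACK handling with a per-value duplicate-ACK counter (timer handling omitted)."""

    def __init__(self, send_base, resend_segment):
        self.send_base = send_base
        self.dup_ack_count = defaultdict(int)  # ACK value y -> duplicates seen
        self.resend_segment = resend_segment   # hypothetical callback: resend_segment(seq)

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged.
            self.send_base = y
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_ack_count[y] += 1
            if self.dup_ack_count[y] == 3:
                # Fast retransmit: resend before the timer expires.
                self.resend_segment(y)


resent = []
sender = FastRetransmitSender(send_base=92, resend_segment=resent.append)
sender.on_ack(100)        # new data acked, SendBase becomes 100
for _ in range(3):
    sender.on_ack(100)    # three duplicate ACKs for the same data
print(resent)
```

Because the test is `== 3`, a fourth duplicate ACK for the same value does not trigger another resend.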

TCP: GBN or Selective Repeat?

- Basic TCP looks a lot like GBN
- Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  - This looks a lot like Selective Repeat
- TCP is a hybrid


TCP Flow Control

- Sender should not overwhelm receiver's capacity to receive data
- If necessary, sender should slow down transmission rate to accommodate receiver's rate
- Different from Congestion Control, whose purpose is to handle congestion in network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

- receive side of TCP connection has a receive buffer
- speed-matching service: matching the send rate to the receiving app's drain rate
- app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Fields, top to bottom: source port #, dest port #; sequence number; acknowledgement number; head len, not used, flag bits U A P R S F, Receive window; checksum, Urg data pnter; Options (variable length); application data (variable length).]

Annotations:
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
- sequence number, acknowledgement number: counting by bytes of data (not segments)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

- spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
  - guarantees receive buffer doesn't overflow
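The window arithmetic on both sides can be written out directly. In this sketch the byte counts are made up, and the sender-side names `last_byte_sent` and `last_byte_acked` are assumptions introduced here to express "unACKed data"; they do not appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised by the receiver."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= advertised_window


# Hypothetical numbers: 4096-byte buffer, 3000 bytes received, 1000 read by the app.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)
print(sender_may_send(last_byte_sent=5000, last_byte_acked=4000,
                      advertised_window=window, nbytes=1000))
```

When the application reads slowly, LastByteRead lags behind LastByteRcvd, the advertised window shrinks, and the sender is held back.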

Technical Issue

- Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer
- Sender does not transmit new packets until it hears RcvWindow > 0
- Receiver never sends RcvWindow > 0, since it has no new ACKs to send to sender
- DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender.

Note on UDP

- UDP has no flow control
- UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost


                                                                                                              TCP Connection Management

                                                                                                              Three way handshakeStep 1 client end system sends

                                                                                                              TCP SYN control segment to server

                                                                                                              specifies client_isn the initial seq No application data

                                                                                                              Step 2 server end system receives SYN replies with SYNACK control segment

                                                                                                              ACKs received SYNallocates buffersReplies with client_isn+1 in ACK field to signal synchronizationSpecifies server_isnNo application data

                                                                                                              Recall TCP sender receiver establish ldquoconnectionrdquobefore exchanging data segmentsinitialize TCP variables

                                                                                                              seq sbuffers flow control info (eg RcvWindow)

                                                                                                              client connection initiatorSocket clientSocket = new Socket(hostnameport number)

                                                                                                              server contacted by clientSocket connectionSocket = welcomeSocketaccept()
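The two socket calls above can be tried on one machine. The following is a minimal sketch, not part of the original slides: the class name `LoopbackConnectDemo` and the method `sendLine` are hypothetical, and it assumes the loopback interface is available. The operating system carries out the handshake when the client socket is constructed, so a single thread is enough for this small exchange.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: connects a client to a welcome socket over loopback.
final class LoopbackConnectDemo {

    /** Sends one line from the client side and returns what the server side read. */
    static String sendLine(String line) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (
            // Port 0 asks the OS for any free port; the welcome socket is listening from here on.
            ServerSocket welcomeSocket = new ServerSocket(0, 1, loopback);
            // The constructor returns once the TCP connection is established.
            Socket clientSocket = new Socket(loopback, welcomeSocket.getLocalPort());
            // accept() hands the server its own socket for this connection.
            Socket connectionSocket = welcomeSocket.accept()
        ) {
            PrintWriter out = new PrintWriter(
                new OutputStreamWriter(clientSocket.getOutputStream(), StandardCharsets.UTF_8), true);
            out.println(line); // autoflush is on, so the line is sent immediately

            BufferedReader in = new BufferedReader(
                new InputStreamReader(connectionSocket.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendLine("hello"));
    }
}
```

Closing the sockets at the end of the try block triggers the FIN/ACK exchange described later in this section.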

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1
- allocates buffers
- can include application data
- SYN=0 signals that the connection is established
- server_isn+1 signals synchronization

Segments exchanged in the three-way handshake:

client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
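The field values in the three handshake segments follow mechanically from the two initial sequence numbers. Here is a small sketch of that bookkeeping; the class `HandshakeSketch` and its array layout are illustrative assumptions, and sequence-number wraparound is ignored.

```java
// Hypothetical helper: computes the SYN flag, seq and ack fields of the three handshake segments.
final class HandshakeSketch {

    /**
     * Returns one row per segment, each row being {SYN flag, seq, ack}.
     * An ack of -1 means the segment carries no acknowledgment.
     */
    static long[][] handshake(long clientIsn, long serverIsn) {
        long[] syn    = {1, clientIsn,     -1};            // step 1: client -> server
        long[] synAck = {1, serverIsn,     clientIsn + 1}; // step 2: server -> client
        long[] ack    = {0, clientIsn + 1, serverIsn + 1}; // step 3: client -> server
        return new long[][] {syn, synAck, ack};
    }

    public static void main(String[] args) {
        String[] names = {"SYN", "SYNACK", "ACK"};
        long[][] segments = handshake(100, 300);
        for (int i = 0; i < segments.length; i++) {
            System.out.println(names[i] + ": SYN=" + segments[i][0]
                + " seq=" + segments[i][1] + " ack=" + segments[i][2]);
        }
    }
}
```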

TCP Connection Management (cont)

Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline for closing a connection. The client sends FIN and the server returns ACK; the server then sends its own FIN and the client returns ACK, waits in "timed wait", and is then closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
- enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost)
- closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server close timeline (FIN, ACK, FIN, ACK) with both sides marked closing, the client's timed wait, and both sides closed.]
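The client's side of the four closing steps can be written as a small state machine. This sketch is an illustration only: the state names come from the TCP standard rather than from the slide text, and the class `ClientCloseSketch` and its event names are hypothetical.

```java
// Hypothetical state machine for the client side of a TCP close.
final class ClientCloseSketch {

    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }

    enum Event { APP_CLOSE, RECV_ACK, RECV_FIN, TIMED_WAIT_EXPIRED }

    /** Returns the next state; an event that does not apply leaves the state unchanged. */
    static State next(State state, Event event) {
        switch (state) {
            case ESTABLISHED:
                // Step 1: clientSocket.close() makes the client send FIN.
                return event == Event.APP_CLOSE ? State.FIN_WAIT_1 : state;
            case FIN_WAIT_1:
                // Step 2: the server's ACK of the client's FIN arrives.
                return event == Event.RECV_ACK ? State.FIN_WAIT_2 : state;
            case FIN_WAIT_2:
                // Step 3: the server's FIN arrives; the client replies with ACK.
                return event == Event.RECV_FIN ? State.TIME_WAIT : state;
            case TIME_WAIT:
                // A repeated FIN is ACKed again; only the timer ends the wait.
                return event == Event.TIMED_WAIT_EXPIRED ? State.CLOSED : state;
            default:
                return state;
        }
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] events = {Event.APP_CLOSE, Event.RECV_ACK, Event.RECV_FIN,
                          Event.RECV_FIN, Event.TIMED_WAIT_EXPIRED};
        for (Event e : events) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```

The second `RECV_FIN` in `main` models a FIN retransmitted because the client's ACK was lost: the client stays in timed wait and would ACK it again.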

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

A few special cases

- We have not discussed what happens if both client and server decide to close down the connection at the same time.
- It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.


Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for network to handle"
- different from flow control
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem

Causes/costs of congestion: scenario 1

- two senders, two receivers
- one router, infinite buffers
- no retransmission
- send rate: 0 to C/2

Observations:
- large delays when congested
- maximum achievable throughput

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet

- (a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput)
- (a) "magic" transmission: only send when there's space in the buffer
- (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
- (c) retransmission of a delayed (not lost) packet makes $\lambda'_{in}$ larger (than the perfect case) for the same $\lambda_{out}$

Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.

"Costs" of congestion:
- (b) and (c): more work (retransmissions) for a given "goodput"
- (c): unneeded retransmissions; the link carries multiple copies of a packet

[Figure: three plots, panels (a), (b) and (c)]

Causes/costs of congestion: scenario 3

- four senders
- multihop paths
- timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted.

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
- "elastic service"
- if sender's path "underloaded": sender should use available bandwidth
- if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next slide

Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
- EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
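The ER and EFCI rules above amount to a minimum along the path and a logical OR at the destination. The sketch below is a simplified model, not the ATM specification: the class `AbrRmCellSketch`, its method names, and the idea of giving each switch a single "supportable rate" are assumptions made for illustration.

```java
// Hypothetical model of what an RM cell carries back to the sender.
final class AbrRmCellSketch {

    /**
     * ER value on the returned RM cell: each switch may lower the field to the
     * rate it can support, so the sender sees the minimum along the path.
     */
    static double explicitRateOnReturn(double requestedRate, double[] switchSupportableRates) {
        double er = requestedRate;
        for (double supportable : switchSupportableRates) {
            er = Math.min(er, supportable);
        }
        return er;
    }

    /**
     * CI bit on the returned RM cell: set if a switch set it, or if the destination
     * saw EFCI=1 on the data cell preceding the RM cell.
     */
    static boolean ciBitOnReturn(boolean ciSetBySwitch, boolean precedingDataCellEfci) {
        return ciSetBySwitch || precedingDataCellEfci;
    }

    public static void main(String[] args) {
        double er = explicitRateOnReturn(100.0, new double[] {80.0, 40.0, 60.0});
        System.out.println("ER returned to sender: " + er);
        System.out.println("CI bit: " + ciBitOnReturn(false, true));
    }
}
```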


TCP Congestion Control

- end-end control (no network assistance)
- transmission rate limited by congestion window size, CongWin, over segments
- CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot \mathrm{MSS}}{\mathrm{RTT}}$ bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 MSS and then grows exponentially
- conservative after timeout events
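The sender-side limit and the loss-event definition above translate directly into two small checks. This is a minimal sketch with hypothetical names (`SendWindowSketch`, `bytesAllowed`, `isLossEvent`); all quantities are in bytes, and the receive window is ignored, as the slide assumes.

```java
// Hypothetical helpers for the congestion-window limit and the loss-event test.
final class SendWindowSketch {

    /** Bytes the sender may still put in flight under LastByteSent - LastByteAcked <= CongWin. */
    static long bytesAllowed(long lastByteSent, long lastByteAcked, long congWin) {
        long inFlight = lastByteSent - lastByteAcked;
        return Math.max(0, congWin - inFlight);
    }

    /** A loss event is a timeout or three duplicate ACKs. */
    static boolean isLossEvent(boolean timeout, int duplicateAcks) {
        return timeout || duplicateAcks >= 3;
    }

    public static void main(String[] args) {
        System.out.println(bytesAllowed(5000, 3000, 4000));
        System.out.println(isLossEvent(false, 3));
    }
}
```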

TCP AIMD

- multiplicative decrease: cut CongWin in half after loss event
- additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]

TCP Slow Start

- When connection begins, CongWin = 1 MSS
  - Example: MSS = 500 bytes & RTT = 200 msec
  - initial rate = 20 kbps
- available bandwidth may be >> MSS/RTT
  - desirable to quickly ramp up to respectable rate
- When connection begins, increase rate exponentially fast until first loss event
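The 20 kbps figure in the example follows from sending one MSS per RTT, as the short calculation below shows (taking 1 kbps as 1000 bits per second):

```latex
\[
\text{initial rate} = \frac{\mathrm{MSS}}{\mathrm{RTT}}
= \frac{500\ \text{bytes} \times 8\ \text{bits/byte}}{0.2\ \text{s}}
= 20{,}000\ \text{bits/s} = 20\ \text{kbps}
\]
```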

TCP Slow Start (more)

- When connection begins, increase rate exponentially until first loss event:
  - double CongWin every RTT
  - done by incrementing CongWin by 1 MSS for every ACK received
- Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. In successive RTTs Host A sends one segment, then two segments, then four segments.]
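Why one increment per ACK doubles the window each RTT is easy to see in a short simulation. The sketch below is illustrative only: the class `SlowStartSketch` is hypothetical, CongWin is counted in units of MSS, and it assumes every segment sent in a round is ACKed within that round with no loss.

```java
import java.util.Arrays;

// Hypothetical simulation of slow start, with CongWin measured in MSS.
final class SlowStartSketch {

    /** CongWin at the start of each RTT round when it grows by 1 MSS per ACK received. */
    static int[] windowPerRound(int rounds) {
        int[] history = new int[rounds];
        int congWin = 1; // connection begins with CongWin = 1 MSS
        for (int r = 0; r < rounds; r++) {
            history[r] = congWin;
            int acks = congWin; // one ACK comes back per segment sent this round
            for (int a = 0; a < acks; a++) {
                congWin += 1;   // increment CongWin for every ACK received
            }
        }
        return history;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(windowPerRound(4)));
    }
}
```

The first three rounds match the one, two and four segments shown in the Host A / Host B figure.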

So far:
- Slow-Start ramps up exponentially
- Followed by AIMD sawtooth pattern

Reality (TCP Reno):
- Introduce new variable threshold
- threshold initially very large
- Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
- Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
  - Timeout: set threshold=CongWin/2, CongWin=1 MSS, and switch to Slow-Start

- The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".
- Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin=1 MSS after a loss event.
- TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

- When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
- When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

[Figure: The Big Picture]

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
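The event/action table above maps naturally onto a small state machine. The following is a sketch under stated assumptions rather than a real TCP implementation: the class `RenoSenderSketch` and its method names are hypothetical, windows are in bytes, the duplicate-ACK counter is reset on every new ACK and on timeout (the table does not say when it resets), and the "not below 1 MSS" remark is applied with a simple `max`.

```java
// Hypothetical sender that applies the table's actions to CongWin and Threshold.
final class RenoSenderSketch {

    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    final double mss;   // bytes
    double congWin;     // bytes
    double threshold;   // bytes
    State state = State.SLOW_START;
    int dupAcks = 0;

    RenoSenderSketch(double mss, double initialThreshold) {
        this.mss = mss;
        this.congWin = mss; // connection begins with CongWin = 1 MSS
        this.threshold = initialThreshold;
    }

    /** ACK receipt for previously unacked data. */
    void onNewAck() {
        dupAcks = 0; // assumption: a new ACK starts a fresh duplicate count
        if (state == State.SLOW_START) {
            congWin += mss; // doubles CongWin every RTT
            if (congWin > threshold) {
                state = State.CONGESTION_AVOIDANCE;
            }
        } else {
            congWin += mss * (mss / congWin); // about 1 MSS per RTT
        }
    }

    /** Duplicate ACK: count it; the third one is treated as a loss event. */
    void onDuplicateAck() {
        dupAcks++;
        if (dupAcks == 3) {
            threshold = congWin / 2;
            congWin = Math.max(threshold, mss); // CongWin will not drop below 1 MSS
            state = State.CONGESTION_AVOIDANCE;
        }
    }

    /** Timeout: fall back to slow start. */
    void onTimeout() {
        threshold = congWin / 2;
        congWin = mss;
        state = State.SLOW_START;
        dupAcks = 0;
    }

    public static void main(String[] args) {
        RenoSenderSketch sender = new RenoSenderSketch(1000, 4000);
        for (int i = 0; i < 5; i++) {
            sender.onNewAck();
            System.out.println(sender.state + " CongWin=" + sender.congWin);
        }
        for (int i = 0; i < 3; i++) {
            sender.onDuplicateAck();
        }
        System.out.println(sender.state + " CongWin=" + sender.congWin
            + " Threshold=" + sender.threshold);
    }
}
```

A Tahoe-style sender would differ only in `onDuplicateAck`, where the third duplicate would call the same logic as `onTimeout`.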

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
- Ignore slow start
- Let W be the window size when loss occurs
- When window is W, throughput is W/RTT
- Just after loss, window drops to W/2, throughput to W/(2 RTT)
- Average throughput: 0.75 W/RTT
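The 0.75 factor can be checked in one line. Assuming the window climbs linearly from W/2 back to W between loss events (additive increase), the time-average throughput is the midpoint of the two extremes:

```latex
\[
\overline{\text{throughput}}
= \frac{1}{2}\left(\frac{W}{2\,\mathrm{RTT}} + \frac{W}{\mathrm{RTT}}\right)
= \frac{3}{4}\cdot\frac{W}{\mathrm{RTT}}
= 0.75\,\frac{W}{\mathrm{RTT}}
\]
```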

                                                                                                              3 Transport Layer 114Comp 361 Spring 2005

                                                                                                              TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This requires L = 2·10^-10. Wow!
New versions of TCP for high-speed needed!
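The two numbers in this example can be checked directly. The sketch below takes the throughput relation to be 1.22·MSS/(RTT·√L) and inverts it for L; the function and constant names are illustrative, not from the slides.

```python
MSS_BITS = 1500 * 8   # 1500-byte segments
RTT = 0.100           # seconds
TARGET_BPS = 10e9     # 10 Gbps

def window_segments(rate_bps, rtt, mss_bits):
    """In-flight segments needed to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt / mss_bits

def required_loss_rate(rate_bps, rtt, mss_bits):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bits / (rtt * rate_bps)) ** 2

W = window_segments(TARGET_BPS, RTT, MSS_BITS)
L = required_loss_rate(TARGET_BPS, RTT, MSS_BITS)
print(f"W = {W:.0f} segments, L = {L:.1e}")
```

The result is roughly one lost segment in five billion, which is why the slide calls for new TCP versions on high-speed paths.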


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 passing through a bottleneck router of capacity R.]


Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis running from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
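The convergence argument can be sketched numerically. The toy model below assumes both sessions see every loss at the same instant and have equal RTTs, which is a simplification; the function name, starting rates, and capacity are illustrative, not from the slides.

```python
def aimd(x1, x2, capacity, rounds):
    """Two flows on one link: +1 each round, halve both when the link is overloaded."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1   # additive increase
    return x1, x2

x1, x2 = aimd(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(f"final gap = {abs(x1 - x2):.4f}")
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts it in half, so the pair drifts toward the equal-share line.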


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
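The arithmetic behind the parallel-connection example, as a small sketch. It assumes every TCP connection on the link ends up with an equal share, and the function name is illustrative.

```python
from fractions import Fraction

def app_share(existing: int, new: int) -> Fraction:
    """Fraction of link rate R going to an app that opens `new` connections
    alongside `existing` others, assuming equal per-connection shares."""
    return Fraction(new, existing + new)

print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

Per-connection fairness therefore says nothing about per-application fairness: the app with 11 connections takes slightly more than half the link.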


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases (K is the number of windows that cover the object):

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent.
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent.
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
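The two cases can be folded into one function. This is a sketch only: it assumes K = ⌈O/(WS)⌉ for the fixed-window case, which this slide does not state, and the parameter values in the usage lines are made up for illustration.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window of W segments."""
    K = math.ceil(O / (W * S))        # windows covering the object (assumed definition)
    stall = S / R + RTT - W * S / R   # wait per window for the first ACK to return
    base = 2 * RTT + O / R
    if stall <= 0:                    # case 1: ACK returns before the window is exhausted
        return base
    return base + (K - 1) * stall     # case 2: server stalls after each of the first K-1 windows

# Illustrative values: 40,000-bit object, 4,000-bit segments, 1 Mbps link, 100 ms RTT.
print(fixed_window_latency(40000, 4000, 1e6, 0.1, W=2))
print(fixed_window_latency(40000, 4000, 1e6, 0.1, W=30))
```

With W = 2 the server stalls four times; with W = 30 the window never empties before the first ACK arrives, so only the 2RTT + O/R terms remain.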


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
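One way to sanity-check the closed form is to sum the idle periods window by window and compare. The sketch below assumes the k-th window carries 2^(k-1) segments and that the server idles for S/R + RTT - 2^(k-1)·S/R after it whenever that quantity is positive. The function names and the sample parameters are illustrative, not from the slides.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, P) by adding up the idle time after each window."""
    segments = -(-O // S)             # ceil(O / S) for integer bit counts
    K, covered = 0, 0
    while covered < segments:         # windows of 1, 2, 4, ... segments
        covered += 2 ** K
        K += 1
    idle = [S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, K)]
    P = sum(1 for t in idle if t > 0)  # number of windows followed by a stall
    return 2 * RTT + O / R + sum(t for t in idle if t > 0), P

def closed_form(O, S, R, RTT, P):
    """Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R

# Illustrative values: 15 segments of 5,000 bits, 100 kbps link, 100 ms RTT.
latency, P = slow_start_latency(75000, 5000, 100000, 0.1)
print(P, latency, closed_form(75000, 5000, 100000, 0.1, P))
```

These sample values were chosen so that the object spans four windows and the server stalls after the first two.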


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timing diagram. After one RTT to initiate the TCP connection, the client requests the object; the server sends a first window (S/R), second window (2S/R), third window (4S/R), and fourth window (8S/R), after which transmission is complete and the object is delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.


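TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

where $[x]^{+} = \max(x, 0)$.

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timing diagram with first through fourth windows of S/R, 2S/R, 4S/R, and 8S/R, from connection initiation and object request to complete transmission and object delivery.]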


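TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.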
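The closed form for K can be compared against a direct count of doubling windows. A short sketch, with illustrative function names and O/S passed as a whole number of segments:

```python
import math

def windows_by_counting(segments: int) -> int:
    """Smallest k with 2^0 + 2^1 + ... + 2^(k-1) >= segments."""
    k, total = 0, 0
    while total < segments:
        total += 2 ** k
        k += 1
    return k

def windows_closed_form(segments: int) -> int:
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(segments + 1))

print(windows_by_counting(15), windows_closed_form(15))
```

For O/S = 15 both give K = 4, since windows of 1, 2, 4, and 8 segments cover exactly 15 segments.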


HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
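The three response-time formulas are easy to compare side by side. In the sketch below the "sum of idle times" term is passed in as a single parameter defaulting to zero, which is an assumption: the slides do not give it in closed form here, and in general it differs between the three models. The function name and the 1 Kbyte = 1000 bytes conversion are also assumptions.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times (seconds) for the three HTTP models, excluding slow-start
    idle time unless `idle` is supplied (placeholder for "sum of idle times")."""
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }

O_BITS = 5 * 1000 * 8  # 5 Kbytes, taking 1 Kbyte = 1000 bytes (assumption)
for rate in (28e3, 100e3, 1e6, 10e6):
    times = http_response_times(O_BITS, rate, RTT=0.1, M=10, X=5)
    print(f"{rate / 1e3:>7.0f} kbps", {k: round(v, 2) for k, v in times.items()})
```

Because the idle-time term is left out, these numbers are lower bounds and should not be expected to match a plot that includes slow-start stalls.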


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

Principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

Instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

                                                                                                              • Go-Back-N
                                                                                                              • GBN Sender
                                                                                                              • GBN sender extended FSM
                                                                                                              • GBN receiver extended FSM
                                                                                                              • More on receiver
                                                                                                              • GBN inaction
                                                                                                              • Selective Repeat
                                                                                                              • Selective repeat sender receiver windows
                                                                                                              • Selective repeat
                                                                                                              • Selective repeat in action
                                                                                                              • Selective repeat dilemma
                                                                                                              • Chapter 3 outline
                                                                                                              • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                              • More TCP Details
                                                                                                              • Even More TCP Details
                                                                                                              • TCP segment structure
                                                                                                              • TCP seq rsquos and ACKs
                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                              • Example RTT estimation
                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                              • Chapter 3 outline
                                                                                                              • TCP reliable data transfer
                                                                                                              • TCP sender events
                                                                                                              • TCP sender(simplified)
                                                                                                              • TCP retransmission scenarios
                                                                                                              • TCP retransmission scenarios (more)
                                                                                                              • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                              • More on Sender Policies
                                                                                                              • Fast Retransmit
                                                                                                              • Fast retransmit algorithm
                                                                                                              • TCP GBN or Selective Repeat
                                                                                                              • Chapter 3 outline
                                                                                                              • TCP Flow Control
                                                                                                              • TCP Flow Control
                                                                                                              • TCP segment structure
                                                                                                              • TCP Flow control how it works
                                                                                                              • Technical Issue
                                                                                                              • Chapter 3 outline
                                                                                                              • TCP Connection Management
                                                                                                              • TCP Connection Management (cont)
                                                                                                              • TCP Connection Management (cont)
                                                                                                              • TCP Connection Management (cont)
                                                                                                              • TCP Connection Management (cont)
                                                                                                              • A few special cases
                                                                                                              • Chapter 3 outline
                                                                                                              • Principles of Congestion Control
                                                                                                              • Causescosts of congestion scenario 1
                                                                                                              • Causescosts of congestion scenario 2
                                                                                                              • Causescosts of congestion scenario 3
                                                                                                              • Causescosts of congestion scenario 3
                                                                                                              • Approaches towards congestion control
                                                                                                              • Case study ATM ABR congestion control
                                                                                                              • Case study ATM ABR congestion control
                                                                                                              • Chapter 3 outline
                                                                                                              • TCP Congestion Control
                                                                                                              • TCP AIMD
                                                                                                              • TCP Slow Start
                                                                                                              • TCP Slow Start (more)
                                                                                                              • Summary TCP Congestion Control
                                                                                                              • The Big Picture
                                                                                                              • TCP sender congestion control
                                                                                                              • TCP throughput
                                                                                                              • TCP Futures
                                                                                                              • TCP Fairness
                                                                                                              • Why is TCP fair
                                                                                                              • Fairness (more)
                                                                                                              • TCP Latency Modeling
                                                                                                              • Fixed Congestion Window (W)
                                                                                                              • Fixed congestion window (1)
                                                                                                              • Fixed congestion window (2)
                                                                                                              • TCP Latency Modeling Slow Start (1)
                                                                                                              • TCP Latency Modeling Slow Start (2)
                                                                                                              • TCP Latency Modeling (3)
                                                                                                              • TCP Latency Modeling (4)
                                                                                                              • HTTP Modeling
                                                                                                              • Chapter 3 Summary


Selective repeat: dilemma

Example: seq #'s 0, 1, 2, 3; window size = 3

  receiver sees no difference in the two scenarios
  incorrectly passes duplicate data as new in (a)

Q: what is the relationship between seq # size and window size?
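The question above can be explored with a small check. This is a minimal sketch, not taken from the slides: the function name `receiver_can_confuse` and the worst-case model (the receiver accepted a full window, but every ACK was lost) are assumptions made for illustration. It suggests the usual answer, that the window can be at most half the sequence-number space.

```python
def receiver_can_confuse(seq_space: int, window: int) -> bool:
    """Return True if a retransmitted old packet can land in the receiver's new window.

    Assumed worst case: the receiver accepted packets 0..window-1 and advanced
    its window, but all ACKs were lost, so the sender retransmits 0..window-1.
    """
    old_window = {s % seq_space for s in range(0, window)}           # what the sender resends
    new_window = {s % seq_space for s in range(window, 2 * window)}  # what the receiver now expects
    return bool(old_window & new_window)


def largest_safe_window(seq_space: int) -> int:
    """Largest window size for which the two windows never overlap."""
    return max(w for w in range(1, seq_space + 1)
               if not receiver_can_confuse(seq_space, w))


slide_example = receiver_can_confuse(seq_space=4, window=3)  # the slide's scenario
```

With the slide's numbers (seq #'s 0 to 3, window size 3) the old and new windows overlap, which is the dilemma; with window size 2 they do not.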


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

full duplex data:
  bi-directional data flow in same connection
  MSS: maximum segment size

connection-oriented:
  handshaking (exchange of control msgs) init's sender, receiver state before data exchange

flow controlled:
  sender will not overwhelm receiver

point-to-point:
  one sender, one receiver

reliable, in-order byte stream:
  no "message boundaries"

pipelined:
  TCP congestion and flow control set window size

send & receive buffers

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the application reads data through its socket door]


More TCP Details

Maximum Segment Size (MSS)
  Depends upon implementation (can often be set)
  The max amount of application-layer data in a segment
  Application Data + TCP Header = TCP Segment

Three-way handshake
  Client sends special TCP segment to server requesting connection
    No payload (application data) in this segment
  Server responds with second special TCP segment (again no payload)
  Client responds with third special segment
    This can contain payload
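The MSS rule above can be sketched as follows. This is an illustration only: the helper `split_into_payloads`, the 1460-byte MSS, and the 20-byte header (a TCP header with no options) are assumptions, not values from the slides.

```python
TCP_HEADER_LEN = 20  # bytes; assumes a header without options


def split_into_payloads(data: bytes, mss: int) -> list[bytes]:
    """Cut application data into chunks of at most `mss` bytes each."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [data[i:i + mss] for i in range(0, len(data), mss)]


# 3000 bytes of application data with an assumed MSS of 1460 bytes.
payloads = split_into_payloads(b"x" * 3000, mss=1460)

# Application Data + TCP Header = TCP Segment
segment_sizes = [len(p) + TCP_HEADER_LEN for p in payloads]
```

Note that the MSS limits the application data only; each segment on the wire is larger by the header length.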


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP exists only in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the host and server


TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]

  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Annotations:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
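The layout above can be read back out of raw bytes with Python's standard `struct` module. This is a minimal sketch: the function name `parse_tcp_header`, the dictionary keys, and the sample port and window values are assumptions for illustration, and the checksum is not verified.

```python
import struct

# Flag bits in the low six bits of the flags byte, lowest bit first.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header and split off options and data."""
    if len(segment) < 20:
        raise ValueError("segment shorter than the fixed TCP header")
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4  # head len field counts 32-bit words
    if header_len < 20 or header_len > len(segment):
        raise ValueError("invalid header length")
    flags = {name for bit, name in enumerate(FLAG_NAMES) if flag_byte & (1 << bit)}
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urgent_ptr": urgent,
        "options": segment[20:header_len], "data": segment[header_len:],
    }


# Build a sample segment: Seq=42, ACK=79, flags PSH+ACK, one data byte 'C'.
raw = struct.pack("!HHIIBBHHH", 1234, 23, 42, 79, 5 << 4, 0x18, 4096, 0, 0) + b"C"
fields = parse_tcp_header(raw)
```

The `!` in the format string selects network (big-endian) byte order, which is how the header fields are carried on the wire.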


TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; up to the implementer

[Figure: simple telnet scenario, time running downward]
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
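The numbers in the telnet scenario follow directly from the two rules above: the sequence number names the first byte sent, and the ACK names the next byte expected. A minimal sketch that replays the exchange is shown here; the `Segment` and `Endpoint` classes are hypothetical helpers, and out-of-order segments are simply ignored, which is one possible implementer's choice rather than anything the spec requires.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int           # byte-stream number of the first data byte
    ack: int           # next byte expected from the other side
    data: bytes = b""


class Endpoint:
    def __init__(self, next_seq: int, rcv_next: int):
        self.next_seq = next_seq  # seq # of the next byte this side will send
        self.rcv_next = rcv_next  # seq # of the next byte expected from the peer

    def send(self, data: bytes = b"") -> Segment:
        segment = Segment(self.next_seq, self.rcv_next, data)
        self.next_seq += len(data)  # sequence numbers count bytes, not segments
        return segment

    def receive(self, segment: Segment) -> None:
        if segment.seq == self.rcv_next:  # accept in-order data only
            self.rcv_next += len(segment.data)


host_a = Endpoint(next_seq=42, rcv_next=79)
host_b = Endpoint(next_seq=79, rcv_next=42)

s1 = host_a.send(b"C")   # user types 'C'
host_b.receive(s1)
s2 = host_b.send(b"C")   # B ACKs receipt of 'C' and echoes it back
host_a.receive(s2)
s3 = host_a.send()       # A ACKs receipt of the echoed 'C'
```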


TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

  Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125
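To see why the influence of a past sample decreases exponentially, the recurrence can be unrolled: a sample that is k updates old carries weight α(1 - α)^k. The sketch below computes those weights; the function names are made up for illustration, and starting the estimate at 0 is an assumption chosen only to keep the comparison simple.

```python
ALPHA = 0.125  # typical value from the slide


def ewma_weights(count: int, alpha: float = ALPHA) -> list[float]:
    """Weight carried by each of the `count` most recent samples, newest first."""
    return [alpha * (1 - alpha) ** age for age in range(count)]


def run_ewma(samples: list[float], alpha: float = ALPHA) -> float:
    """Apply EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT in order."""
    estimated = 0.0  # assumed starting value, for illustration only
    for sample in samples:
        estimated = (1 - alpha) * estimated + alpha * sample
    return estimated


weights = ewma_weights(4)
samples = [120.0, 90.0, 150.0, 100.0]  # oldest first, in milliseconds
direct = sum(w * s for w, s in zip(weights, reversed(samples)))
```

Each weight is 0.875 times the one before it, and `direct` agrees with `run_ewma(samples)` up to floating-point rounding.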


Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

[Figure: SampleRTT and EstimatedRTT in milliseconds (axis 100 to 350) plotted against time in seconds (axis 1 to 106)]


TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first, estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
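The three formulas can be combined into one small estimator. This is a sketch under stated assumptions, not the slides' own code: the class name `RttEstimator` is hypothetical, the initial values (EstimatedRTT set to the first sample, DevRTT to half of it) follow the convention in RFC 6298, and DevRTT is updated before EstimatedRTT so that the deviation is measured against the previous estimate.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumed initialisation (RFC 6298 convention), not given on the slide.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt: float) -> None:
        # Deviation is measured against the estimate from before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=100.0)  # milliseconds
initial_timeout = estimator.timeout_interval
estimator.update(140.0)
timeout_after_update = estimator.timeout_interval
```

A steady run of similar samples shrinks DevRTT and pulls the timeout toward EstimatedRTT, while a jittery path keeps the safety margin wide.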


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
- Create segment with seq #
- seq # is byte-stream number of first data byte in segment
- start timer if not already running (think of timer as for oldest unacked segment)
- expiration interval: TimeOutInterval

timeout:
- retransmit segment that caused timeout
- restart timer

Ack rcvd:
- If acknowledges previously unacked segments:
    - update what is known to be acked
    - start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
      switch(event)

      event: data received from application above
          create TCP segment with sequence number NextSeqNum
          if (timer currently not running)
              start timer
          pass segment to IP
          NextSeqNum = NextSeqNum + length(data)

      event: timer timeout
          retransmit not-yet-acknowledged segment with smallest sequence number
          start timer

      event: ACK received, with ACK field value of y
          if (y > SendBase) {
              SendBase = y
              if (there are currently not-yet-acknowledged segments)
                  start timer
          }
    } /* end of loop forever */

Comment:
- SendBase-1: last cumulatively ack'ed byte

Example:
- SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is acked
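
The event loop above can be sketched in Python as a toy model. This is only an illustration under stated assumptions: nothing is actually transmitted, the timer is a boolean flag rather than a real clock, and the class name, method names and the `sent` log are hypothetical helpers that do not appear in the slides.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified sender: no duplicate-ACK handling,
    no flow control, no congestion control."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num   # NextSeqNum
        self.send_base = initial_seq_num      # SendBase
        self.unacked = {}                     # seq # -> data, sent but not yet ACKed
        self.timer_running = False            # stands in for the single timer
        self.sent = []                        # log of (seq #, length) passed to "IP"

    def on_app_data(self, data):
        """event: data received from application above"""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True         # start timer
        self.sent.append((seq, len(data)))    # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self):
        """event: timer timeout"""
        if not self.unacked:                  # nothing outstanding (defensive)
            self.timer_running = False
            return
        seq = min(self.unacked)               # smallest unacknowledged seq #
        self.sent.append((seq, len(self.unacked[seq])))
        self.timer_running = True             # restart timer

    def on_ack(self, y):
        """event: ACK received, with ACK field value of y"""
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment that ends at or before y.
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in done:
                del self.unacked[s]
            # Keep the timer only while segments are still outstanding.
            self.timer_running = bool(self.unacked)


# Lost-ACK style run: one 8-byte segment, a timeout, then ACK=100.
sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"12345678")
sender.on_timeout()
sender.on_ack(100)
print(sender.send_base, sender.sent)
```

The `on_ack` branch ignores any `y` that is not greater than `SendBase`, which is exactly the "ignore duplicate acks" simplification stated earlier.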

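TCP retransmission scenarios

[Figure: two timing diagrams between Host A (sender) and Host B (receiver)]

Lost ACK scenario:
    A -> B: Seq=92, 8 bytes data
    B -> A: ACK=100 (lost, marked X)
    A: timeout for Seq=92
    A -> B: Seq=92, 8 bytes data (retransmission)
    B -> A: ACK=100
    A: SendBase = 100

Premature timeout scenario:
    A -> B: Seq=92, 8 bytes data
    A -> B: Seq=100, 20 bytes data
    B -> A: ACK=100
    B -> A: ACK=120
    A: timeout for Seq=92 fires before ACK=100 arrives
    A -> B: Seq=92, 8 bytes data (retransmission)
    A: SendBase = 100, then SendBase = 120 as the two ACKs arrive
    B -> A: ACK=120 (reply to the retransmitted segment)
    A: SendBase = 120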

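TCP retransmission scenarios (more)

[Figure: timing diagram between Host A (sender) and Host B (receiver)]

Cumulative ACK scenario:
    A -> B: Seq=92, 8 bytes data
    A -> B: Seq=100, 20 bytes data
    B -> A: ACK=100 (lost, marked X)
    B -> A: ACK=120 (arrives before the timeout)
    A: SendBase = 120, so no retransmission is needed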

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver, followed by TCP receiver action:

1. Event: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
   Action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

2. Event: arrival of in-order segment with expected seq #. One other segment has ACK pending.
   Action: immediately send single cumulative ACK, ACKing both in-order segments.

3. Event: arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
   Action: immediately send duplicate ACK, indicating seq # of next expected byte.

4. Event: arrival of segment that partially or completely fills gap.
   Action: immediately send ACK, provided that segment starts at lower end of gap.
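
The four rows of the table can be read as a decision function. The sketch below is a minimal illustration, not an implementation of RFC 1122 or RFC 2581: the function name, its parameters and the returned strings are hypothetical, and the last branch (a segment below the expected seq #) is an assumption because the table does not cover that case.

```python
DELAYED_ACK = "delay ACK up to 500 ms"
CUMULATIVE_ACK = "send single cumulative ACK immediately"
DUPLICATE_ACK = "send duplicate ACK for next expected byte"
IMMEDIATE_ACK = "send ACK immediately"


def ack_action(seq, expected_seq, ack_pending, gap_exists):
    """Pick the receiver action for a segment whose first byte is `seq`.

    expected_seq: next in-order byte the receiver is waiting for
    ack_pending:  an earlier in-order segment still awaits its delayed ACK
    gap_exists:   out-of-order data is already buffered above expected_seq
    """
    if seq > expected_seq:
        return DUPLICATE_ACK            # row 3: gap detected
    if seq == expected_seq and gap_exists:
        return IMMEDIATE_ACK            # row 4: starts at lower end of gap
    if seq == expected_seq and ack_pending:
        return CUMULATIVE_ACK           # row 2: ACK both in-order segments
    if seq == expected_seq:
        return DELAYED_ACK              # row 1: wait for a possible next segment
    # Assumption (not in the table): old data is simply re-ACKed.
    return DUPLICATE_ACK


print(ack_action(seq=120, expected_seq=100, ack_pending=False, gap_exists=False))
```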

More on Sender Policies

Doubling the Timeout Interval
- Used by most TCP implementations
- If a timeout occurs, then after the retransmission TimeOutInterval is doubled
- Intervals grow exponentially with each consecutive timeout
- When the timer is restarted because of (i) new data from above or (ii) ACK received, TimeOutInterval is reset as described previously, using EstimatedRTT and DevRTT
- Limited form of congestion control
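
The doubling rule can be sketched as follows. The class and method names are hypothetical, times are in milliseconds, and the reset formula `EstimatedRTT + 4 * DevRTT` is assumed to be the one the slide refers to as "described previously".

```python
class RetransmissionTimer:
    """Tracks TimeOutInterval with doubling on consecutive timeouts."""

    def __init__(self, estimated_rtt_ms, dev_rtt_ms):
        self.estimated_rtt_ms = estimated_rtt_ms
        self.dev_rtt_ms = dev_rtt_ms
        self.interval_ms = self._base_interval()

    def _base_interval(self):
        # Assumed formula: EstimatedRTT plus a safety margin of 4 * DevRTT.
        return self.estimated_rtt_ms + 4 * self.dev_rtt_ms

    def on_timeout(self):
        # After the retransmission, the interval is doubled.
        self.interval_ms *= 2
        return self.interval_ms

    def on_new_data_or_ack(self):
        # Timer restarted for new data or a received ACK: reset from the estimates.
        self.interval_ms = self._base_interval()
        return self.interval_ms


timer = RetransmissionTimer(estimated_rtt_ms=100, dev_rtt_ms=25)
intervals = [timer.interval_ms] + [timer.on_timeout() for _ in range(3)]
print(intervals)                   # grows exponentially with each timeout
print(timer.on_new_data_or_ack())  # back to the base interval
```

Because a burst of timeouts usually means the path is congested, backing off in this way makes the sender retransmit less aggressively, which is why the slide calls it a limited form of congestion control.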

Fast Retransmit

- Time-out period often relatively long:
    - long delay before resending lost packet
- Detect lost segments via duplicate ACKs:
    - Sender often sends many segments back-to-back
    - If a segment is lost, there will likely be many duplicate ACKs
- If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    - fast retransmit: resend segment before timer expires

Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {      /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y      /* fast retransmit */
            }
        }
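
A minimal Python sketch of the ACK handler above, assuming a per-value counter of duplicate ACKs. The class name, the returned strings and the `retransmitted` log are hypothetical, and timer handling is left out to keep the duplicate-ACK logic visible.

```python
from collections import Counter


class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()      # ACK value y -> duplicates seen for y
        self.retransmitted = []        # seq #s resent before any timeout

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y         # new data acknowledged
            return "new data acked"
        # y <= SendBase: a duplicate ACK for an already ACKed segment
        self.dup_acks[y] += 1
        if self.dup_acks[y] == 3:
            self.retransmitted.append(y)   # resend segment with seq # y
            return "fast retransmit"
        return "duplicate ack"


sender = FastRetransmitSender(send_base=92)
outcomes = [sender.on_ack(100) for _ in range(5)]
print(outcomes)
print(sender.retransmitted)
```

In this run the first ACK=100 advances `SendBase`; the next three are duplicates, and the third duplicate triggers the resend of the segment starting at byte 100.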

TCP: GBN or Selective Repeat?

                                                                                                                Basic TCP looks a lot like GBN

                                                                                                                Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                This looks a lot like Selective Repeat

                                                                                                                TCP is a hybrid

TCP Flow Control

- Sender should not overwhelm receiver's capacity to receive data
- If necessary, sender should slow down transmission rate to accommodate receiver's rate
- Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

- flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
- receive side of TCP connection has a receive buffer
- speed-matching service: matching the send rate to the receiving app's drain rate
- app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]

    source port #                  | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum                       | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
- sequence number, acknowledgement number: counting by bytes of data (not segments)
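
To make the layout concrete, the fixed 20-byte part of the header can be unpacked with Python's `struct` module. This is a sketch for illustration only: the function name and the dictionary keys are hypothetical, the checksum is not verified, and the bit positions assume the classic layout shown above (4-bit header length in 32-bit words, 6 unused bits, then the 6 flag bits U A P R S F).

```python
import struct

# Flag bit masks within the 16-bit "head len | not used | flags" field.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options and data."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    # "!" = network byte order; H = 16-bit field, I = 32-bit field.
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4      # head len is in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": {name: bool(offset_flags & bit)
                  for name, bit in FLAG_BITS.items()},
        "receive_window": window,
        "checksum": checksum,
        "urgent_pointer": urg_ptr,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# Build a SYN segment by hand (head len = 5 words, SYN flag set) and parse it.
raw = struct.pack("!HHIIHHHH", 12345, 80, 92, 0, (5 << 12) | 0x02, 4096, 0, 0)
print(parse_tcp_header(raw + b"hello"))
```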

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

- spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
    - guarantees receive buffer doesn't overflow
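
The window arithmetic above, as a small Python sketch. The function names are hypothetical, and `last_byte_sent` and `last_byte_acked` are assumed sender-side counters that the slide does not name; all values are byte counts.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises:
    RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_budget(advertised_window, last_byte_sent, last_byte_acked):
    """Bytes the sender may still put in flight while keeping
    unACKed data within the advertised RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, advertised_window - unacked)


# Receiver: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)
# Sender: 1000 bytes currently unACKed.
print(sender_budget(window, last_byte_sent=5000, last_byte_acked=4000))
```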

Technical Issue

- Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
- Sender does not transmit new packets until it hears RcvWindow > 0
- Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
- DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
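
The probing idea can be illustrated with a toy receiver. This is a sketch under simplifying assumptions: the class and its methods are hypothetical, the buffer is only a byte counter, and every call to `on_segment` stands for one segment arriving and one ACK carrying the current RcvWindow going back.

```python
class ToyReceiver:
    """Receive buffer modelled as a byte counter."""

    def __init__(self, rcv_buffer):
        self.rcv_buffer = rcv_buffer
        self.buffered = 0              # bytes received but not yet read by the app

    def window(self):
        return self.rcv_buffer - self.buffered

    def on_segment(self, nbytes):
        """Accept what fits, then return the RcvWindow advertised in the ACK."""
        self.buffered += min(nbytes, self.window())
        return self.window()

    def app_read(self, nbytes):
        """The application drains data; note that no ACK is sent here."""
        self.buffered -= min(nbytes, self.buffered)


rcvr = ToyReceiver(rcv_buffer=4)
print(rcvr.on_segment(4))   # buffer full: ACK advertises RcvWindow = 0
print(rcvr.on_segment(1))   # one-byte probe is dropped, ACK still says 0
rcvr.app_read(3)            # app frees space, but nothing tells the sender
print(rcvr.on_segment(1))   # next probe is accepted and its ACK reports RcvWindow > 0
```

Without the one-byte probes, the `app_read` call would free space that the sender never learns about, which is the deadlock described above.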

                                                                                                                Note on UDP

                                                                                                                UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.

TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
- initialize TCP variables:
    - seq #s
    - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
      Socket clientSocket = new Socket(hostname, portNumber);
- server: contacted by client
      Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
- specifies client_isn, the initial seq #
- No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
- ACKs received SYN
- allocates buffers
- Replies with client_isn+1 in ACK field to signal synchronization
- Specifies server_isn
- No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
- Allocates buffers
- Can include application data
- SYN=0 signals that connection is established
- server_isn+1 signals that seq # is synchronized

[Figure: three-way handshake between client and server]
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
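
The sequence and acknowledgement numbers in the three handshake segments can be checked with a short sketch. The function name and the dictionary representation are hypothetical; the modulo reflects that the sequence number field is 32 bits wide, so values wrap around.

```python
SEQ_SPACE = 2 ** 32   # sequence numbers live in a 32-bit field


def three_way_handshake(client_isn, server_isn):
    """Return the three control segments exchanged during connection setup."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn,
              "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"SYN": 0, "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (server_isn + 1) % SEQ_SPACE}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how the slide's "client_isn+1" and "server_isn+1" signal that both ends agree on where the byte streams start.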

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection teardown between client and server]
    client (close) -> server: FIN
    server -> client: ACK
    server (close) -> client: FIN
    client -> server: ACK
    client: timed wait, then closed

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.

Enters "timed wait", during which it will respond with an ACK to received FINs (which might arrive if the ACK gets lost).

Closes down after timed wait.

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.
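
The client side of the closing steps can be sketched as a small state machine. This is only a toy model, not a real TCP implementation: the class name `ClosingClient`, the state names, the tick-based timer, and the choice to restart the timer on a repeated FIN are all illustrative assumptions. Simultaneous FINs are not handled.

```python
class ClosingClient:
    """Toy model of the client side of a TCP close.

    Assumptions (not from the slides): state names, a tick-based
    timer, and string labels instead of real TCP segments.
    """

    def __init__(self, timed_wait_ticks=2):
        self.state = "ESTABLISHED"
        self.timed_wait_ticks = timed_wait_ticks
        self.ticks_left = 0
        self.sent = []  # segments sent by the client, in order

    def close(self):
        # Client starts the close by sending FIN to the server.
        self.sent.append("FIN")
        self.state = "FIN_SENT"

    def receive(self, segment):
        if segment == "ACK" and self.state == "FIN_SENT":
            # Server acknowledged our FIN; now wait for the server's FIN.
            self.state = "WAIT_FOR_FIN"
        elif segment == "FIN" and self.state in ("WAIT_FOR_FIN", "TIMED_WAIT"):
            # Step 3: reply with ACK. A repeated FIN during timed wait
            # suggests our ACK was lost, so ACK again and restart the timer.
            self.sent.append("ACK")
            self.state = "TIMED_WAIT"
            self.ticks_left = self.timed_wait_ticks

    def tick(self):
        # Close down once the timed wait has elapsed.
        if self.state == "TIMED_WAIT":
            self.ticks_left -= 1
            if self.ticks_left == 0:
                self.state = "CLOSED"


client = ClosingClient()
client.close()
client.receive("ACK")
client.receive("FIN")   # first FIN from the server
client.receive("FIN")   # retransmitted FIN: the client's ACK was lost
client.tick()
client.tick()
print(client.state, client.sent)  # CLOSED ['FIN', 'ACK', 'ACK']
```

The second FIN is answered with a second ACK, which is the reason the client lingers in timed wait instead of closing immediately.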

[Figure: closing a TCP connection. Client and server timelines with segments FIN, ACK, FIN, ACK; states shown: closing, timed wait, closed]


TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                A few special cases

We have not discussed what happens if both client and server decide to close the connection at the same time.

It is possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.



Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for the network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


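Notation: $\lambda_{in}$ is the rate of original data sent, $\lambda'_{in}$ is the offered load including retransmissions, and $\lambda_{out}$ is the delivered throughput.

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in the buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packets makes $\lambda'_{in}$ larger (than in the perfect case) for the same $\lambda_{out}$

"Costs" of congestion:
(b) and (c): more work (retransmissions) for a given "goodput"
(c): unneeded retransmissions; the link carries multiple copies of a packet

[Figure: panels (a), (b), (c)]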


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ (the offered load, including retransmissions) increase?


Causes/costs of congestion: scenario 3 (cont.)

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted!


                                                                                                                Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path is "underloaded": sender should use available bandwidth
  if sender's path is congested: sender throttled to minimum guaranteed rate

RM (resource management) cells
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate is thus the minimum of the rates supportable by the switches on the path

EFCI bit in data cells: set to 1 by a congested switch
  signals congestion
  if the data cell preceding an RM cell has EFCI=1, the destination sets CI bit=1 before returning the RM cell to the source
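
How the ER field and the CI bit evolve along a path can be sketched in a few lines. This is a minimal illustration, not the ATM cell format: the `RMCell` class, the function names, and the rate values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate field (units are illustrative)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def forward_through_switches(cell, supportable_rates):
    """Each switch may lower ER to the rate it can support."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell, preceding_data_efci):
    """Destination sets CI=1 if the preceding data cell had EFCI=1."""
    if preceding_data_efci:
        cell.ci = True
    return cell


cell = RMCell(er=100.0)
cell = forward_through_switches(cell, [80.0, 35.0, 60.0])
cell = destination_return(cell, preceding_data_efci=True)
print(cell.er, cell.ci)  # 35.0 True
```

The returned ER is the smallest rate any switch on the path could support, which is the rate the sender then uses.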



TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
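
As a quick numeric check of the formula, here is a small sketch. The function name and the sample values (10 segments, 1460-byte MSS, 100 ms RTT) are illustrative assumptions, not taken from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Illustrative values: 10 segments of 1460 bytes per 100 ms round trip.
rate = window_throughput(w_segments=10, mss_bytes=1460, rtt_seconds=0.1)
print(f"{rate:.0f} bytes/sec")  # about 146000 bytes/sec
```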


To simplify the presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: the sender limits transmission using

$LastByteSent - LastByteAcked \le CongWin$

How does the sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after a loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
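
The sending limit can be expressed as a small helper. This is a sketch under the slide's simplifying assumption that the receive buffer never limits the sender; the function name and sample byte counts are made up for illustration.

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """Bytes that may still be sent without violating
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


print(bytes_allowed(last_byte_sent=5000, last_byte_acked=3000, cong_win=4000))  # 2000
print(bytes_allowed(last_byte_sent=7000, last_byte_acked=3000, cong_win=4000))  # 0
```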


TCP AIMD

multiplicative decrease: cut CongWin in half after a loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
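
The sawtooth shape of AIMD can be reproduced with a per-RTT toy model. The sketch assumes, purely for illustration, that a loss event occurs whenever the window reaches a fixed size; in a real network losses depend on the path, not on a fixed limit. The function name and parameters are hypothetical.

```python
def aimd_trace(start_mss, loss_at_mss, rtts):
    """Per-RTT CongWin (in units of MSS) under AIMD.

    Assumption: a loss event happens whenever CongWin reaches loss_at_mss.
    """
    cong_win = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            cong_win = max(1, cong_win // 2)  # multiplicative decrease
        else:
            cong_win += 1                     # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rtts=12))
# [4, 5, 6, 7, 8, 4, 5, 6, 7, 8, 4, 5]
```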


                                                                                                                TCP Slow Start

When the connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to a respectable rate

When the connection begins, increase the rate exponentially fast until the first loss event


                                                                                                                TCP Slow Start (more)

When the connection begins, increase the rate exponentially until the first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment, then two segments, then four segments to Host B in successive RTTs; time runs downward]
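
Why a per-ACK increment doubles the window each RTT can be seen in a short simulation. The sketch assumes every segment sent in a round is acknowledged and no loss occurs; the function name is illustrative.

```python
def slow_start_rounds(rounds, mss=1):
    """CongWin at the start of each RTT when every ACK adds one MSS.

    Assumption: every segment in a round is ACKed and nothing is lost.
    """
    cong_win = mss
    history = []
    for _ in range(rounds):
        history.append(cong_win)
        acks = cong_win // mss      # one ACK per segment sent this RTT
        for _ in range(acks):
            cong_win += mss         # increment CongWin for every ACK
    return history


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```

Each of the w segments in a round yields one ACK, and each ACK adds one MSS, so the window goes from w to 2w per round.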


So far:
  Slow-Start ramps up exponentially
  followed by AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce new variable: threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, the sender is in the slow-start phase; the window grows exponentially.

When CongWin is above Threshold, the sender is in the congestion-avoidance phase; the window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
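
The difference between Reno and Tahoe on a loss event fits in one function. This is a sketch with windows measured in MSS; the function name, the event labels `"timeout"` and `"3dupack"`, and the variant strings are illustrative choices.

```python
def on_loss(cong_win, event, variant="reno", mss=1):
    """Return (new_cong_win, new_threshold) after a loss event.

    event is "timeout" or "3dupack"; variant is "reno" or "tahoe".
    Windows are in units of MSS.
    """
    threshold = cong_win / 2
    if event == "3dupack" and variant == "reno":
        return threshold, threshold     # fast recovery: skip slow start
    return mss, threshold               # timeout, or any loss in Tahoe


print(on_loss(16, "3dupack", "reno"))   # (8.0, 8.0)
print(on_loss(16, "3dupack", "tahoe"))  # (1, 8.0)
print(on_loss(16, "timeout", "reno"))   # (1, 8.0)
```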


                                                                                                                The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
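
The event table can be turned into a toy sender object. This is a hedged sketch, not a real TCP stack: the class name `RenoSender`, the byte-based windows, the initial Threshold value, and the simplified duplicate-ACK counter are assumptions made for illustration.

```python
class RenoSender:
    """Toy TCP Reno sender following the event table above.

    Assumptions: windows are in bytes, the initial Threshold is
    arbitrary, and the dup-ACK counter is a simplification.
    """

    def __init__(self, mss=1000, threshold=64_000):
        self.mss = mss
        self.cong_win = mss
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        # ACK receipt for previously unacked data.
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        # Count duplicates; the third one is treated as a loss event.
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win)   # CA 5000
for _ in range(3):
    s.on_dup_ack()
print(s.state, s.cong_win)   # CA 2500.0
s.on_timeout()
print(s.state, s.cong_win)   # SS 1000
```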


                                                                                                                TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after a loss, the window drops to W/2 and throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
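
The 0.75 factor follows from averaging a rate that climbs linearly from its post-loss value back to its peak. A short derivation, assuming the idealized sawtooth described here (slow start ignored, constant RTT, loss exactly when the window reaches W):

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
\]
```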


                                                                                                                TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP are needed for high-speed links!
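
The two numbers in this example can be checked directly. The sketch below plugs the slide's values into W = throughput × RTT / MSS and inverts the loss-rate formula; the variable names are illustrative.

```python
MSS_BITS = 1500 * 8        # 1500-byte segments
RTT = 0.100                # seconds
TARGET_BPS = 10e9          # 10 Gbps

# Window needed so that W * MSS / RTT reaches the target rate.
w_segments = TARGET_BPS * RTT / MSS_BITS

# Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to solve for L.
loss_rate = (1.22 * MSS_BITS / (RTT * TARGET_BPS)) ** 2

print(round(w_segments), f"{loss_rate:.1e}")  # 83333 2.1e-10
```

A loss rate of roughly 2 in 10 billion segments is far below what real paths deliver, which is the motivation for the remark about high-speed TCP variants.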


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  Additive increase gives a slope of 1 as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis from 0 to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
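
The convergence toward an equal share can be checked with a toy simulation of two AIMD flows. The model is deliberately idealized: both flows are assumed to have the same RTT, to increase in lockstep, and to see a loss at the same time whenever their combined rate exceeds the link capacity. The function name and starting rates are made up.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Two synchronized AIMD flows sharing one bottleneck.

    Assumptions: same RTT, +1 unit per step for each flow, and both
    flows halve their rate whenever the total exceeds capacity.
    """
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # loss: multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1   # additive increase
    return x1, x2


# Start far from fair: flow 1 has 80 units, flow 2 has 10.
x1, x2 = aimd_two_flows(x1=80.0, x2=10.0, capacity=100.0, steps=500)
print(f"gap between flows: {abs(x1 - x2):.6f}")
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the gap shrinks toward zero.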


                                                                                                                Fairness (more)Fairness and UDP

                                                                                                                Multimedia apps often do not use TCP

                                                                                                                do not want rate throttled by congestion control

                                                                                                                Instead use UDPpump audiovideo at constant rate tolerate packet loss

                                                                                                                Current Research area How to keep UDP from congesting the internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP connection, gets rate R/10
  • new app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections)
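The arithmetic behind the example can be sketched in a few lines of Python. This is only an illustration: it assumes every TCP connection on the link receives an equal share of R, and the function name `app_share` is made up for this sketch.

```python
from fractions import Fraction

def app_share(existing_connections: int, new_connections: int) -> Fraction:
    """Fraction of the link rate R obtained by an app that opens
    `new_connections` TCP connections on a link already carrying
    `existing_connections`, assuming every connection gets an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)

for n in (1, 11):
    print(f"{n} connection(s): {app_share(9, n)} of R")
```

With 9 existing connections, one new connection yields 1/10 of R, while 11 new connections yield 11/20 of R, slightly more than half.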


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object
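A minimal Python sketch of the two cases follows. Two details are assumptions of this sketch rather than statements from the slides: the number of windows is taken as K = ceil(O / (W S)), and the function name `fixed_window_latency` is hypothetical.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an object of O bits with a fixed window.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    """
    base = 2 * RTT + O / R            # connection setup + request, then transmission
    stall = S / R + RTT - W * S / R   # idle time per window while waiting for an ACK
    if stall <= 0:
        return base                   # case 1: the server never idles
    K = math.ceil(O / (W * S))        # assumption: windows needed to cover the object
    return base + (K - 1) * stall     # case 2: idle after each window but the last

# Hypothetical numbers: 100 kbit object, 10 kbit segments, 1 Mbps link, 100 ms RTT.
print(fixed_window_latency(O=100_000, S=10_000, R=1_000_000, RTT=0.1, W=20))
print(fixed_window_latency(O=100_000, S=10_000, R=1_000_000, RTT=0.1, W=4))
```

The larger window falls into case 1 and the smaller window into case 2, so the second call reports the higher latency.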


Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Server waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timeline with "time at client" and "time at server" axes. Client initiates TCP connection (one RTT), then requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; complete transmission, object delivered.]

Example
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
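The example can be reproduced with a small window-by-window simulation. This is a sketch under stated assumptions: the slide gives no RTT, so the values below (S/R = 1 time unit, RTT = 2 time units) are chosen only so that the server idles twice, and the function name `slow_start_timeline` is hypothetical.

```python
def slow_start_timeline(num_segments, seg_time, rtt):
    """Return (window_sizes, idle_times) for sending num_segments under slow start.

    seg_time is S/R, the time to transmit one segment; window k holds up to
    2**(k-1) segments. No threshold, no loss.
    """
    windows, idles = [], []
    remaining, size = num_segments, 1
    while remaining > 0:
        sent = min(size, remaining)
        windows.append(sent)
        remaining -= sent
        if remaining > 0:
            # The ACK for the window's first segment arrives seg_time + rtt after
            # the window starts; the server stalls if it finishes sending earlier.
            idles.append(max(0.0, seg_time + rtt - size * seg_time))
        size *= 2
    return windows, idles

# Assumed values: O/S = 15 segments, S/R = 1 time unit, RTT = 2 time units.
windows, idles = slow_start_timeline(15, seg_time=1.0, rtt=2.0)
K = len(windows)                         # number of windows that cover the object
P = sum(1 for t in idles if t > 0)       # number of times the server idles
latency = 2 * 2.0 + 15 * 1.0 + sum(idles)
print(windows, K, P, latency)
```

Under these assumed values the windows are 1, 2, 4 and 8 segments, so K = 4, and the server stalls after the first two windows only, so P = 2.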


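TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: slow-start timeline with "time at client" and "time at server" axes: initiate TCP connection (one RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]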


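TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.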
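The closed-form latency can be evaluated with a short Python sketch. The slides do not spell out a formula for Q, so this sketch simply counts the windows whose idle time S/R + RTT - 2^(k-1) S/R is positive; that choice, the function name `slow_start_latency`, and the sample values are assumptions made for illustration.

```python
import math

def slow_start_latency(O, S, R, RTT):
    """Closed-form slow-start latency (seconds): no threshold, no loss.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT in seconds.
    Returns (K, Q, P, latency).
    """
    K = math.ceil(math.log2(O / S + 1))   # windows that cover the object
    # Q: windows after which the server would idle for an infinite object,
    # counted directly from the idle-time expression (an assumption of this sketch).
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# Assumed values: O/S = 15 segments, S/R = 1 s, RTT = 2 s.
print(slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2.0))
```

Because `math.log2` works in floating point, an exact integer computation of K may be preferable when O/S + 1 lies very close to a power of two.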


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
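The three response-time expressions can be compared with a short Python sketch. It deliberately omits the "sum of idle times" term, so the results are lower bounds rather than the full model. The function name `http_response_times`, the sample rates, and the conversion of 1 Kbyte to 8000 bits are assumptions made for illustration.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times (seconds) for the three HTTP models, idle times omitted.

    O: size of each object (bits), R: link rate (bits/s), RTT in seconds,
    M: number of images, X: parallel connections (M/X assumed to be an integer).
    """
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }

# Illustrative values; 1 Kbyte is taken as 8000 bits here (an assumption).
for rate in (28_000, 100_000, 1_000_000, 10_000_000):
    times = http_response_times(O=5 * 8000, R=rate, RTT=0.1, M=10, X=5)
    print(rate, {name: round(t, 3) for name, t in times.items()})
```

The transmission term (M+1)O/R is common to all three models, so the differences between them come entirely from the RTT terms.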


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3: Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                • Chapter 3 Transport Layer last revised 160305
                                                                                                                • Chapter 3 outline
                                                                                                                • Transport services and protocols
                                                                                                                • Transport vs network layer
                                                                                                                • Transport-layer protocols
                                                                                                                • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                • How demultiplexing works
                                                                                                                • Connectionless demultiplexing
                                                                                                                • Connectionless demux (cont)
                                                                                                                • Connection-oriented demux
                                                                                                                • Connection-oriented demux (cont)
                                                                                                                • Connection-oriented demux Threaded Web Server
                                                                                                                • Chapter 3 outline
                                                                                                                • UDP User Datagram Protocol [RFC 768]
                                                                                                                • UDP more
                                                                                                                • UDP checksum
                                                                                                                • Chapter 3 outline
                                                                                                                • Principles of Reliable data transfer
                                                                                                                • Reliable data transfer getting started
                                                                                                                • Reliable data transfer getting started
                                                                                                                • Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                                                • Pipelined protocols
                                                                                                                • Pipelined protocols
                                                                                                                • Pipelining increased utilization
                                                                                                                • Go-Back-N
                                                                                                                • GBN Sender
                                                                                                                • GBN sender extended FSM
                                                                                                                • GBN receiver extended FSM
                                                                                                                • More on receiver
• GBN in action
                                                                                                                • Selective Repeat
                                                                                                                • Selective repeat sender receiver windows
                                                                                                                • Selective repeat
                                                                                                                • Selective repeat in action
                                                                                                                • Selective repeat dilemma
                                                                                                                • Chapter 3 outline
                                                                                                                • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                • More TCP Details
                                                                                                                • Even More TCP Details
                                                                                                                • TCP segment structure
                                                                                                                • TCP seq. #'s and ACKs
                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                • Example RTT estimation
                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                • Chapter 3 outline
                                                                                                                • TCP reliable data transfer
                                                                                                                • TCP sender events
                                                                                                                • TCP sender (simplified)
                                                                                                                • TCP retransmission scenarios
                                                                                                                • TCP retransmission scenarios (more)
                                                                                                                • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                • More on Sender Policies
                                                                                                                • Fast Retransmit
                                                                                                                • Fast retransmit algorithm
                                                                                                                • TCP GBN or Selective Repeat
                                                                                                                • Chapter 3 outline
                                                                                                                • TCP Flow Control
                                                                                                                • TCP Flow Control
                                                                                                                • TCP segment structure
                                                                                                                • TCP Flow control how it works
                                                                                                                • Technical Issue
                                                                                                                • Chapter 3 outline
                                                                                                                • TCP Connection Management
                                                                                                                • TCP Connection Management (cont)
                                                                                                                • TCP Connection Management (cont)
                                                                                                                • TCP Connection Management (cont)
                                                                                                                • TCP Connection Management (cont)
                                                                                                                • A few special cases
                                                                                                                • Chapter 3 outline
                                                                                                                • Principles of Congestion Control
                                                                                                                • Causes/costs of congestion: scenario 1
                                                                                                                • Causes/costs of congestion: scenario 2
                                                                                                                • Causes/costs of congestion: scenario 3
                                                                                                                • Causes/costs of congestion: scenario 3
                                                                                                                • Approaches towards congestion control
                                                                                                                • Case study: ATM ABR congestion control
                                                                                                                • Case study: ATM ABR congestion control
                                                                                                                • Chapter 3 outline
                                                                                                                • TCP Congestion Control
                                                                                                                • TCP AIMD
                                                                                                                • TCP Slow Start
                                                                                                                • TCP Slow Start (more)
                                                                                                                • Summary TCP Congestion Control
                                                                                                                • The Big Picture
                                                                                                                • TCP sender congestion control
                                                                                                                • TCP throughput
                                                                                                                • TCP Futures
                                                                                                                • TCP Fairness
                                                                                                                • Why is TCP fair
                                                                                                                • Fairness (more)
                                                                                                                • TCP Latency Modeling
                                                                                                                • Fixed Congestion Window (W)
                                                                                                                • Fixed congestion window (1)
                                                                                                                • Fixed congestion window (2)
                                                                                                                • TCP Latency Modeling: Slow Start (1)
                                                                                                                • TCP Latency Modeling: Slow Start (2)
                                                                                                                • TCP Latency Modeling (3)
                                                                                                                • TCP Latency Modeling (4)
                                                                                                                • HTTP Modeling
                                                                                                                • Chapter 3 Summary


Chapter 3 outline

• 3.1 Transport-layer services
• 3.2 Multiplexing and demultiplexing
• 3.3 Connectionless transport: UDP
• 3.4 Principles of reliable data transfer
• 3.5 Connection-oriented transport: TCP
    • segment structure
    • reliable data transfer
    • flow control
    • connection management
• 3.6 Principles of congestion control
• 3.7 TCP congestion control


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

• point-to-point: one sender, one receiver
• reliable, in-order byte stream: no "message boundaries"
• pipelined: TCP congestion control and flow control set the window size
• send & receive buffers
• full duplex data: bi-directional data flow in the same connection
    • MSS: maximum segment size
• connection-oriented: handshaking (exchange of control msgs) initializes sender and receiver state before data exchange
• flow controlled: sender will not overwhelm receiver

[Figure: the sending application writes data through its socket "door" into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads through its own socket door.]


More TCP Details

Maximum Segment Size (MSS)
• Depends upon the implementation (can often be set)
• The maximum amount of application-layer data in a segment
• Application Data + TCP Header = TCP Segment

Three-way handshake
• Client sends a special TCP segment to the server requesting a connection. No payload (application data) in this segment.
• Server responds with a second special TCP segment (again, no payload).
• Client responds with a third special segment. This one can contain payload.
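
As a rough illustration of "Application Data + TCP Header = TCP Segment", the sketch below splits an application byte string into payloads of at most MSS bytes and reports the resulting segment sizes. The 20-byte header length (a header without options), the MSS value of 1460 and the function names are assumptions made for this example; the slides do not fix any of them.

```python
TCP_HEADER_LEN = 20  # assumption: TCP header without options, in bytes


def split_into_payloads(data: bytes, mss: int) -> list[bytes]:
    """Split application data into payloads of at most `mss` bytes each."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [data[i:i + mss] for i in range(0, len(data), mss)]


def segment_sizes(data: bytes, mss: int) -> list[int]:
    """Size of each TCP segment: header plus payload."""
    return [TCP_HEADER_LEN + len(p) for p in split_into_payloads(data, mss)]


if __name__ == "__main__":
    # 3000 bytes of application data with a hypothetical MSS of 1460 bytes
    print(segment_sizes(b"x" * 3000, mss=1460))
```

Note that the MSS bounds only the application data; the header is added on top of it.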


Even More TCP Details

• A TCP connection between client and server creates, in both client and server:
    (i) buffers,
    (ii) variables, and
    (iii) a socket connection to the process
• TCP exists only in the two end machines
    • No buffers or variables are allocated to the connection in any of the network elements between the client and server


TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]

• source port # | dest port #
• sequence number
• acknowledgement number
• head len | not used | U A P R S F | Receive window
• checksum | Urg data pointer
• Options (variable length)
• application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
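
To make the layout concrete, here is a minimal sketch that unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. The names `TcpHeader` and `parse_tcp_header` are hypothetical, and options and application data are deliberately ignored.

```python
import struct
from typing import NamedTuple

# Flag bits in the low 6 bits of the 16-bit "head len / not used / flags" word
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int   # in bytes
    flags: dict
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Parse the fixed 20-byte TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("need at least 20 bytes for a TCP header")
    src, dst, seq, ack, off_flags, window, checksum, urg = struct.unpack(
        "!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4   # head len is counted in 32-bit words
    flags = {name: bool(off_flags & bit) for name, bit in FLAG_BITS.items()}
    return TcpHeader(src, dst, seq, ack, header_len, flags, window, checksum, urg)


if __name__ == "__main__":
    # A hand-built SYN segment: ports 12345 -> 80, seq 42, head len 5 words
    raw = struct.pack("!HHIIHHHH", 12345, 80, 42, 0, (5 << 12) | 0x02, 65535, 0, 0)
    print(parse_tcp_header(raw))
```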


TCP seq. #'s and ACKs

Seq. #'s:
• byte-stream "number" of the first byte in the segment's data

ACKs:
• seq # of the next byte expected from the other side
• cumulative ACK

Q: How does the receiver handle out-of-order segments?
A: The TCP spec doesn't say; it is up to the implementer.

[Figure: simple telnet scenario between Host A and Host B, time running downward]
• User at Host A types 'C'. A → B: Seq=42, ACK=79, data = 'C'
• Host B ACKs receipt of 'C' and echoes back 'C'. B → A: Seq=79, ACK=43, data = 'C'
• Host A ACKs receipt of the echoed 'C'. A → B: Seq=43, ACK=80
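
The telnet exchange can be replayed with a small sketch that tracks, for each host, the seq # of the next byte it will send and the seq # of the next byte it expects. The `Endpoint` class and its method names are assumptions for illustration, and this toy receiver simply ignores out-of-order data, which is only one of the choices the spec leaves open.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    next_seq: int   # seq # of the next byte this host will send
    rcv_next: int   # seq # of the next byte expected from the other side

    def send(self, data: bytes = b"") -> tuple[int, int, bytes]:
        """Build a (Seq, ACK, data) segment and advance our own seq #."""
        segment = (self.next_seq, self.rcv_next, data)
        self.next_seq += len(data)
        return segment

    def receive(self, segment: tuple[int, int, bytes]) -> None:
        """Accept in-order data and advance the next expected byte."""
        seq, _ack, data = segment
        if seq == self.rcv_next:
            self.rcv_next += len(data)


def telnet_exchange() -> list[tuple[int, int, bytes]]:
    a = Endpoint(next_seq=42, rcv_next=79)
    b = Endpoint(next_seq=79, rcv_next=42)
    s1 = a.send(b"C")   # user types 'C'
    b.receive(s1)
    s2 = b.send(b"C")   # B ACKs 'C' and echoes it back
    a.receive(s2)
    s3 = a.send()       # A ACKs the echoed 'C', no data
    b.receive(s3)
    return [s1, s2, s3]


if __name__ == "__main__":
    for seq, ack, data in telnet_exchange():
        print(f"Seq={seq} ACK={ack} data={data!r}")
```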


TCP Round Trip Time and Timeout

Q: How to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
    • ignore retransmissions
• SampleRTT will vary; want estimated RTT "smoother"
    • average several recent measurements, not just the current SampleRTT

Q: How to set the TCP timeout value?
• longer than RTT
    • but RTT varies
• too short: premature timeout
    • unnecessary retransmissions
• too long: slow reaction to segment loss


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)·EstimatedRTT + α·SampleRTT

• Exponential weighted moving average
• influence of a past sample decreases exponentially fast
• typical value: α = 0.125
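
A minimal sketch of the update rule, with α = 0.125 as on the slide. The helper `sample_weights` is an illustrative addition: it lists the weight that the k-th most recent sample carries in the average, α(1 - α)^k, which is why older samples fade exponentially fast.

```python
ALPHA = 0.125  # typical value from the slide


def update_estimated_rtt(estimated_rtt: float, sample_rtt: float,
                         alpha: float = ALPHA) -> float:
    """One EWMA step: EstimatedRTT = (1 - alpha)*EstimatedRTT + alpha*SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def sample_weights(n: int, alpha: float = ALPHA) -> list[float]:
    """Weight of the k-th most recent sample (k = 0 is the newest)."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


if __name__ == "__main__":
    estimate = 100.0                     # hypothetical starting estimate, in ms
    for sample in (180.0, 120.0, 110.0):
        estimate = update_estimated_rtt(estimate, sample)
        print(round(estimate, 3))
    print(sample_weights(4))
```

A single outlier sample moves the estimate by only one eighth of its distance from the current estimate.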


Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

[Figure: SampleRTT and EstimatedRTT plotted as RTT (milliseconds, axis from 100 to 350) against time (seconds, axis from 1 to 106)]


TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
    • large variation in EstimatedRTT -> larger safety margin
• first, estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β)·DevRTT + β·|SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set the timeout interval:

TimeoutInterval = EstimatedRTT + 4·DevRTT
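
The two formulas can be combined into a small estimator, sketched below. The class name `RttEstimator`, the initial values and the update order are assumptions: the slides do not say whether DevRTT is computed against the old or the new EstimatedRTT, and this sketch uses the old one, which is the order RFC 6298 describes.

```python
from dataclasses import dataclass


@dataclass
class RttEstimator:
    estimated_rtt: float      # current EstimatedRTT
    dev_rtt: float = 0.0      # current DevRTT (initial value is an assumption)
    alpha: float = 0.125
    beta: float = 0.25

    def update(self, sample_rtt: float) -> float:
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation is measured against the estimate
        # as it was before this sample is folded in.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        """TimeoutInterval = EstimatedRTT + 4 * DevRTT."""
        return self.estimated_rtt + 4 * self.dev_rtt


if __name__ == "__main__":
    est = RttEstimator(estimated_rtt=100.0, dev_rtt=10.0)  # hypothetical, in ms
    print(est.update(140.0))
```

With these hypothetical numbers, one sample that is 40 ms above the estimate raises DevRTT from 10 to 17.5, so the safety margin grows much faster than the estimate itself.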




TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses a single retransmission timer
• Retransmissions are triggered by:
    • timeout events
    • duplicate acks
• Initially consider a simplified TCP sender:
    • ignore duplicate acks
    • ignore flow control, congestion control


TCP sender events

data rcvd from app:
• Create segment with seq #
• seq # is the byte-stream number of the first data byte in the segment
• start timer if not already running (think of the timer as for the oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit the segment that caused the timeout
• restart timer

Ack rcvd:
• If it acknowledges previously unacked segments
    • update what is known to be acked
    • start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
      switch(event)

      event: data received from application above
          create TCP segment with sequence number NextSeqNum
          if (timer currently not running)
              start timer
          pass segment to IP
          NextSeqNum = NextSeqNum + length(data)

      event: timer timeout
          retransmit not-yet-acknowledged segment with smallest sequence number
          start timer

      event: ACK received, with ACK field value of y
          if (y > SendBase) {
              SendBase = y
              if (there are currently not-yet-acknowledged segments)
                  start timer
          }
    } /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte

Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is acked
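
The event loop above can be turned into a runnable sketch by making each event a method. This is an illustration under stated assumptions, not a real TCP implementation: the timer is reduced to a boolean, "pass segment to IP" is replaced by appending to a `sent` list, and the class and attribute names are hypothetical.

```python
class SimplifiedTcpSender:
    """Toy version of the simplified sender: no duplicate-ACK handling,
    no flow control, no congestion control."""

    def __init__(self, initial_seq_num: int = 0) -> None:
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # -> data of not-yet-acked segments
        self.timer_running = False   # stand-in for the single retransmission timer
        self.sent = []               # stand-in for "pass segment to IP"

    def on_data(self, data: bytes) -> None:
        """Event: data received from the application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self) -> None:
        """Event: timer timeout. Retransmit the oldest unacked segment."""
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True    # restart the timer

    def on_ack(self, y: int) -> None:
        """Event: ACK received with ACK field value y (cumulative)."""
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte is below y.
            self.unacked = {s: d for s, d in self.unacked.items()
                            if s + len(d) > y}
            # Timer keeps running only while segments are outstanding.
            self.timer_running = bool(self.unacked)


if __name__ == "__main__":
    sender = SimplifiedTcpSender(initial_seq_num=92)
    sender.on_data(b"a" * 8)     # Seq=92, 8 bytes
    sender.on_data(b"b" * 20)    # Seq=100, 20 bytes
    sender.on_ack(100)
    print(sender.send_base, sorted(sender.unacked), sender.timer_running)
```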


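TCP retransmission scenarios

[Figure: two timelines between Host A and Host B]

Lost ACK scenario: Host A sends Seq=92, 8 bytes data. Host B replies with ACK=100, but the ACK is lost (X). The timer for Seq=92 expires, so A resends Seq=92, 8 bytes data. B replies with ACK=100 again, and SendBase becomes 100.

Premature timeout scenario: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B replies with ACK=100 and ACK=120, but the timer for Seq=92 expires before the ACKs reach A, so A resends Seq=92, 8 bytes data. SendBase moves to 100 and then to 120 as the ACKs arrive. B answers the duplicate segment with ACK=120, and SendBase stays at 120.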


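TCP retransmission scenarios (more)

[Figure: timeline between Host A and Host B]

Cumulative ACK scenario: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B's ACK=100 is lost (X), but ACK=120 reaches A before the timeout. Because ACKs are cumulative, A sets SendBase = 120 and retransmits nothing.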


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.
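The four receiver rules above can be sketched as a small decision function. This is a toy model, not a real TCP stack: the class name `AckPolicy`, its method names, and the returned action labels are all made up for illustration, the 500 ms timer is represented only by an explicit `on_delay_timer` call, and re-ACKing an old duplicate segment is an added assumption that the table does not cover.

```python
class AckPolicy:
    """Toy receiver that follows the four ACK-generation rules (hypothetical names)."""

    def __init__(self, expected_seq):
        self.expected = expected_seq   # seq # of the next in-order byte
        self.ack_pending = False       # True while a delayed ACK is waiting
        self.buffered = {}             # out-of-order segments: seq -> length

    def on_segment(self, seq, length):
        if seq != self.expected:
            # Rule 3: segment above a gap is buffered and a duplicate ACK is sent.
            # Assumption: an old duplicate (seq < expected) is also re-ACKed.
            if seq > self.expected:
                self.buffered[seq] = length
            self.ack_pending = False
            return ("duplicate_ack", self.expected)

        had_gap = bool(self.buffered)
        self.expected += length
        if had_gap:
            # Rule 4: segment starts at the lower end of the gap, ACK at once.
            while self.expected in self.buffered:
                self.expected += self.buffered.pop(self.expected)
            self.ack_pending = False
            return ("immediate_ack", self.expected)
        if self.ack_pending:
            # Rule 2: one cumulative ACK covers both in-order segments.
            self.ack_pending = False
            return ("immediate_cumulative_ack", self.expected)
        # Rule 1: hold the ACK and wait (up to 500 ms) for another segment.
        self.ack_pending = True
        return ("delayed_ack", self.expected)

    def on_delay_timer(self):
        # The delayed-ACK wait ran out with no further segment.
        if self.ack_pending:
            self.ack_pending = False
            return ("ack", self.expected)
        return None
```

Feeding it the segments Seq=92 (8 bytes) and Seq=100 (20 bytes) from the retransmission scenarios exercises rules 1 and 2 in turn.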


More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• A limited form of congestion control
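The doubling rule can be sketched as follows. The sketch assumes that "as described previously" refers to the usual formula TimeoutInterval = EstimatedRTT + 4 × DevRTT, which is not restated on this slide. The class and method names are hypothetical, and times are plain numbers in milliseconds.

```python
class RetransmitTimer:
    """Toy model of the timeout-doubling policy (names are illustrative)."""

    def __init__(self, estimated_rtt_ms, dev_rtt_ms):
        self.estimated_rtt_ms = estimated_rtt_ms
        self.dev_rtt_ms = dev_rtt_ms
        self.interval_ms = self.base_interval()

    def base_interval(self):
        # Assumed formula: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt_ms + 4 * self.dev_rtt_ms

    def on_timeout(self):
        # After the retransmission, the interval is doubled.
        self.interval_ms *= 2
        return self.interval_ms

    def on_new_data_or_ack(self):
        # Timer restarted for new data or a received ACK: reset to the base value.
        self.interval_ms = self.base_interval()
        return self.interval_ms
```

With EstimatedRTT = 100 ms and DevRTT = 25 ms, consecutive timeouts give 400 ms, 800 ms, 1600 ms, and the next ACK brings the interval back to 200 ms.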


Fast Retransmit

• Time-out period often relatively long:
  • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
  • Sender often sends many segments back-to-back
  • If a segment is lost, there will likely be many duplicate ACKs
• If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  • fast retransmit: resend segment before timer expires


Fast retransmit algorithm:

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {   /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y   /* fast retransmit */
        }
    }
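The ACK-handling event above can be sketched as a runnable toy sender. The class `FastRetransmitSender`, its fields, and the `retransmitted` log are hypothetical names introduced for illustration. The sketch ignores sequence-number wraparound and models the timer as a simple flag.

```python
class FastRetransmitSender:
    """Toy sender-side ACK handler with fast retransmit (illustrative only)."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.unacked = {}          # seq -> payload, sent but not yet ACKed
        self.dup_acks = 0          # duplicate ACKs seen for the current send_base
        self.timer_running = False
        self.retransmitted = []    # seq numbers resent by fast retransmit

    def send(self, seq, payload):
        self.unacked[seq] = payload
        self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            # New data is ACKed: advance SendBase and forget covered segments.
            self.send_base = y
            self.dup_acks = 0
            for seq in [s for s, p in self.unacked.items() if s + len(p) <= y]:
                del self.unacked[seq]
            # Restart the timer only if something is still outstanding.
            self.timer_running = bool(self.unacked)
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.retransmitted.append(y)   # fast retransmit of segment y
```

The resend fires on exactly the third duplicate, so further duplicates for the same `y` do not trigger another retransmission in this sketch.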


TCP: GBN or Selective Repeat?

                                                                                                                  Basic TCP looks a lot like GBN

                                                                                                                  Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                  This looks a lot like Selective Repeat

                                                                                                                  TCP is a hybrid




TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down transmission rate to accommodate receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

• flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• receive side of TCP connection has a receive buffer
• speed-matching service: matching the send rate to the receiving app's drain rate
• app process may be slow at reading from buffer


TCP segment structure

[Figure: TCP segment layout, 32 bits wide]
• source port #, dest port #
• sequence number
• acknowledgement number
• head len, not used, flag bits U A P R S F, Receive window
• checksum, Urg data pnter
• Options (variable length)
• application data (variable length)

Field notes:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
• sequence and acknowledgement numbers: counting by bytes of data (not segments)


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn't overflow
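The window arithmetic can be checked with a short sketch. `rcv_window` follows the formula on the slide directly. `sender_may_send` is an illustrative helper whose parameters `last_byte_sent` and `last_byte_acked` are assumed names for the sender's bookkeeping and do not appear on this slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the advertised window.

    last_byte_sent and last_byte_acked are assumed sender-side counters.
    """
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window
```

For example, a 4096-byte buffer holding bytes 1001 through 3000 unread leaves 2096 bytes of spare room, so a sender with 1000 bytes already in flight may send 1000 more but not 1200.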


Technical Issue

• Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
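The probing idea can be illustrated with a toy simulation. It assumes a receive buffer that starts full, one probe per step, and a receiver that accepts the one-byte probe only when it has room. The function name and the `app_reads` parameter (bytes the application reads before each probe arrives) are hypothetical.

```python
def zero_window_probes(rcv_buffer, app_reads):
    """Return the RcvWindow advertised in the ACK to each one-byte probe.

    Stops at the first ACK that advertises RcvWindow > 0.
    """
    buffered = rcv_buffer              # buffer starts full, so RcvWindow = 0
    advertised = []
    for nread in app_reads:
        buffered -= min(nread, buffered)   # the app drains some bytes
        if buffered < rcv_buffer:
            buffered += 1                  # room available: probe byte is accepted
        advertised.append(rcv_buffer - buffered)
        if advertised[-1] > 0:
            break                          # sender learns the window has reopened
    return advertised
```

Without the probes there would be no ACK to carry the new window value, which is exactly the deadlock described above.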


                                                                                                                  Note on UDP

                                                                                                                  UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
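A minimal sketch of that behaviour, assuming for simplicity that the buffer capacity is counted in datagrams rather than bytes. The class `UdpReceiveBuffer` and its methods are illustrative names, not a real socket API.

```python
from collections import deque


class UdpReceiveBuffer:
    """Toy bounded receive buffer; capacity counts datagrams (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def on_datagram(self, datagram):
        # No flow control: the sender is never told to slow down.
        if len(self.queue) >= self.capacity:
            self.dropped += 1          # buffer full, the datagram is lost
            return False
        self.queue.append(datagram)
        return True

    def app_read(self):
        # The application drains the buffer in arrival order.
        return self.queue.popleft() if self.queue else None
```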




TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
• initialize TCP variables:
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
• server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data


TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and ack=server_isn+1
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals that the client is synchronized

[Figure: three-way handshake]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
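The three segments of the handshake can be written out as data to make the sequence and acknowledgement arithmetic concrete. This is a sketch under simple assumptions: the `Segment` type and the function name are made up for illustration, only the SYN flag and the two number fields are modelled, and the arithmetic wraps at 32 bits because the fields are 32 bits wide.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_MOD = 2 ** 32   # sequence and acknowledgement fields are 32 bits wide


@dataclass(frozen=True)
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None   # None means the ACK field is not valid


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and final ACK segments in order."""
    syn = Segment(syn=1, seq=client_isn)
    synack = Segment(syn=1, seq=server_isn, ack=(client_isn + 1) % SEQ_MOD)
    ack = Segment(syn=0, seq=synack.ack, ack=(server_isn + 1) % SEQ_MOD)
    return [syn, synack, ack]
```

With client_isn = 1000 and server_isn = 5000, the final segment carries seq=1001 and ack=5001, matching the figure.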


TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait; connection closed]


TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
• enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost)
• closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: the same FIN / ACK / FIN / ACK exchange, with both sides marked closing, the client in timed wait, and then both sides closed]
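The client side of the four closing steps can be sketched as a small transition table. The state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED) come from the TCP specification rather than from these slides. The event labels and the `run_client_close` helper are hypothetical, and simultaneous close is left out.

```python
# (state, event) -> (next state, segment to send or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "FIN"),    # Step 1
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),       # server ACKed our FIN
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "ACK"),       # Step 3
    ("TIME_WAIT", "recv_FIN"): ("TIME_WAIT", "ACK"),        # our ACK was lost, FIN resent
    ("TIME_WAIT", "timer_expired"): ("CLOSED", None),       # timed wait is over
}


def run_client_close(events, state="ESTABLISHED"):
    """Apply events in order; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, segment = CLIENT_CLOSE[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent
```

The repeated `recv_FIN` entry in TIME_WAIT is the reason for the timed wait: the client must stay around long enough to re-ACK a retransmitted FIN.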


TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                  A few special cases

                                                                                                                  Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                  It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment



Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for the network to handle"
- different from flow control!
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem!

Causes/costs of congestion: scenario 1

- two senders, two receivers
- one router, infinite buffers
- no retransmission
- send rate: 0 to C/2

Results:
- large delays when congested
- maximum achievable throughput

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet

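Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ is the offered load (original data plus retransmissions), and $\lambda_{out}$ is the goodput.

- (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
- (a) "Magic" transmission: only send when there's space in the buffer
- (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
- (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for the same $\lambda_{out}$

"Costs" of congestion:
- (b) and (c): more work (retransmissions) for a given "goodput"
- (c): unneeded retransmissions: link carries multiple copies of a packet

[Figure: three panels, (a), (b) and (c), one per case]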

Causes/costs of congestion: scenario 3

- four senders
- multihop paths
- timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact (small exception: see the EFCI bit on the next slide)

ABR: available bit rate:
- "elastic service"
- if sender's path is "underloaded": sender should use available bandwidth
- if sender's path is congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

Two-byte ER (explicit rate) field in RM cell:
- congested switch may lower ER value in cell
- sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by congested switch
- signals congestion
- if data cell preceding RM cell has EFCI = 1, destination sets CI bit = 1 before returning RM cell to source
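The RM-cell handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `RMCell` and the functions `switch_process` and `destination_return` are hypothetical names, rates are unitless numbers, and real ATM cell formats are not modeled.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical model of a resource-management cell."""
    er: float          # explicit rate requested by the sender
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion bit (severe congestion)


def switch_process(cell, supportable_rate, mild=False, severe=False):
    """A switch may lower ER (never raise it) and may set NI or CI."""
    cell.er = min(cell.er, supportable_rate)
    cell.ni = cell.ni or mild
    cell.ci = cell.ci or severe
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI = 1 if the preceding data cell had EFCI = 1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


# An RM cell crosses three switches that can support 80, 50 and 120.
cell = RMCell(er=100.0)
for rate in (80.0, 50.0, 120.0):
    switch_process(cell, rate)
returned = destination_return(cell, preceding_data_cell_efci=True)
print(returned)  # ER ends up at the minimum supportable rate on the path
```

Because each switch applies `min`, the ER value that reaches the sender is the smallest rate any switch on the path can support, which matches the "minimum supportable rate on path" remark.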


TCP Congestion Control

- end-end control (no network assistance)
- transmission rate limited by congestion window size, CongWin, over segments
- CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window of size CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$

How does sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 MSS and then grows exponentially
- conservative after timeout events

TCP AIMD

- multiplicative decrease: cut CongWin in half after loss event
- additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
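The sawtooth produced by AIMD can be reproduced with a small simulation. The sketch below is illustrative only: the function name `aimd_trace`, the starting window, and the choice of which RTTs see a loss are all assumptions, and the window is measured in units of MSS.

```python
def aimd_trace(rtts, loss_rtts, mss=1, start=8):
    """Return CongWin (in MSS units) at the start of each RTT.

    loss_rtts is a hypothetical set of RTT indices in which a loss
    event is detected.
    """
    congwin = start
    trace = []
    for t in range(rtts):
        trace.append(congwin)
        if t in loss_rtts:
            # Multiplicative decrease: halve, but never below 1 MSS.
            congwin = max(mss, congwin / 2)
        else:
            # Additive increase: one MSS per RTT.
            congwin += mss
    return trace


print(aimd_trace(rtts=8, loss_rtts={3}))
# [8, 9, 10, 11, 5.5, 6.5, 7.5, 8.5]
```

The window climbs by one MSS per RTT, is halved at the loss, and then resumes its linear climb.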

TCP Slow Start

When connection begins, CongWin = 1 MSS
- Example: MSS = 500 bytes & RTT = 200 msec
- initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
- desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
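The 20 kbps figure in the example follows from sending one MSS per RTT. A minimal check, assuming a hypothetical helper `window_rate_bps` that converts a window in segments into bits per second:

```python
def window_rate_bps(window_segments, mss_bytes, rtt_seconds):
    """Rate achieved by sending window_segments segments per RTT, in bits/sec."""
    return window_segments * mss_bytes * 8 / rtt_seconds


# One 500-byte segment every 200 msec.
initial_rate = window_rate_bps(1, 500, 0.2)
print(initial_rate / 1000, "kbps")
```

With CongWin = 1 MSS this gives 4000 bits per 0.2 s, that is 20 kbps, which is why a faster ramp-up than one MSS per RTT is desirable on high-bandwidth paths.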

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
- double CongWin every RTT
- done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; one segment is sent in the first RTT, then two segments, then four segments]
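Why a per-ACK increment doubles the window every RTT can be seen in a short sketch. It assumes, for illustration only, that every segment is acknowledged individually within one RTT and that there are no losses and no threshold; `slow_start_windows` is a hypothetical name.

```python
def slow_start_windows(num_rtts, mss=1):
    """Return CongWin (in MSS units) at the start of each RTT during slow start."""
    congwin = mss
    history = []
    for _ in range(num_rtts):
        history.append(congwin)
        acks = congwin // mss       # one ACK per segment sent in this RTT
        for _ in range(acks):
            congwin += mss          # increment by 1 MSS per ACK received
    return history


print(slow_start_windows(4))  # [1, 2, 4, 8]
```

A window of w segments produces w ACKs, each adding one MSS, so the next window is 2w.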

So far:
- Slow-Start ramps up exponentially
- Followed by AIMD sawtooth pattern

Reality (TCP Reno):
- Introduce new variable: threshold
- threshold initially very large
- Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
- Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = new CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

- When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
- When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                  The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
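The event table above maps naturally onto a small state machine. The following is a simplified sketch, not a real TCP implementation: the class name `RenoSender`, its method names, and the choice to reset the duplicate-ACK counter on every new ACK are assumptions made for illustration, and the window is tracked in units of MSS.

```python
class RenoSender:
    """Toy model of the sender-side rules in the table (window in MSS units)."""

    def __init__(self, mss=1.0, threshold=float("inf")):
        self.mss = mss
        self.congwin = mss          # connection starts with CongWin = 1 MSS
        self.threshold = threshold  # "initially very large"
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # Roughly 1 MSS per RTT, spread over the ACKs of one window.
            self.congwin += self.mss * (self.mss / self.congwin)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.mss, self.threshold)  # not below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1.0, threshold=8.0)
for _ in range(8):
    sender.on_new_ack()          # slow start until CongWin exceeds 8
print(sender.state, sender.congwin)
for _ in range(3):
    sender.on_duplicate_ack()    # triple duplicate ACK: halve and stay in CA
print(sender.state, sender.congwin, sender.threshold)
sender.on_timeout()              # timeout: back to slow start at 1 MSS
print(sender.state, sender.congwin, sender.threshold)
```

TCP Tahoe could be modeled in the same sketch by having the third duplicate ACK call `on_timeout()` instead of halving the window.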

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
- Ignore slow start
- Let W be the window size when loss occurs
- When window is W, throughput is W/RTT
- Just after loss, window drops to W/2, throughput to W/(2 RTT)
- Average throughput: 0.75 W/RTT
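The 0.75 factor can be recovered with a short averaging argument. This is a sketch that assumes, as the slide does, that slow start is ignored and that the window grows linearly from W/2 to W between loss events, so the time average is the midpoint of the two extremes:

```latex
\overline{\text{throughput}}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
```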

TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$

- $L = 2 \cdot 10^{-10}$. Wow!
- New versions of TCP for high-speed needed!
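Both numbers on this slide can be checked directly. The sketch below assumes the throughput relation 1.22 MSS / (RTT sqrt(L)) quoted on the slide and inverts it for L; the function names are hypothetical.

```python
def required_window_segments(target_bps, mss_bytes, rtt_s):
    """Segments that must be in flight to sustain target_bps."""
    return target_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(target_bps, mss_bytes, rtt_s):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * target_bps)) ** 2


w = required_window_segments(10e9, 1500, 0.1)
loss = required_loss_rate(10e9, 1500, 0.1)
print(int(w))          # 83333
print(f"{loss:.1e}")   # about 2e-10
```

A tolerable loss rate on the order of one segment in five billion is what motivates the remark that new TCP versions are needed for high-speed paths.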

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
- additive increase gives slope of 1 as throughput increases
- multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
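The convergence toward the equal-share line can be illustrated with a toy simulation of two AIMD flows. It is a sketch under strong assumptions: both flows have the same RTT, both see every loss at the same time (whenever their combined rate exceeds the capacity), and rates are unitless. The name `aimd_two_flows` is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Evolve two flow rates under synchronized AIMD for a number of RTTs."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves their difference.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow by 1, difference unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


x1, x2 = aimd_two_flows(x1=70.0, x2=10.0, capacity=100.0, rounds=500)
print(round(x1, 3), round(x2, 3), abs(x1 - x2))
```

Additive increase moves the operating point along a 45-degree line and leaves the gap between the flows unchanged, while each multiplicative decrease halves the gap, so the two rates end up nearly equal.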

Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2!
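The two shares in the example follow from assuming that every TCP connection on the link gets an equal slice. A minimal sketch, with the hypothetical helper `app_share` returning the new app's fraction of R:

```python
from fractions import Fraction


def app_share(existing_connections, new_app_connections):
    """Fraction of the link rate R obtained by the new app, assuming
    every connection receives an equal share."""
    total = existing_connections + new_app_connections
    return Fraction(new_app_connections, total)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 parallel connections the app holds 11 of the 20 connections on the link, so it takes slightly more than half of R.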

TCP Latency Modeling

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Fixed Congestion Window (W)

Two cases:

1. $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent

   $\text{Latency} = 2\,RTT + O/R$

2. $WS/R < RTT + S/R$: ACK for first segment in window returns after window's worth of data sent

   $\text{Latency} = 2\,RTT + O/R + (K-1)\left[S/R + RTT - WS/R\right]$

   where K is the number of windows that cover the object.

Fixed congestion window (1)

First case: $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent

$\text{latency} = 2\,RTT + O/R$

                                                                                                                  3 Transport Layer 121Comp 361 Spring 2005

Fixed congestion window (2)

Second case: $WS/R < RTT + S/R$. Server waits for ACK after sending window's worth of data.

latency = $2RTT + O/R + (K-1)[S/R + RTT - WS/R]$
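The two fixed-window cases can be sketched in Python. This is a minimal illustration rather than anything from the slides: the function name `static_window_latency` and its parameter names are hypothetical, and K is assumed to be the number of windows that cover the object, computed here as ceil(O / (W·S)).

```python
import math


def static_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window.

    O: object size in bits, S: segment size in bits, R: link rate in bits/s,
    RTT: round-trip time in seconds, W: window size in segments.
    """
    # Both cases pay 2 RTT (connection setup + request) plus transmission time.
    base = 2 * RTT + O / R

    if W * S / R >= RTT + S / R:
        # Case 1: the ACK for the first segment returns before the window
        # is exhausted, so the server never stalls.
        return base

    # Case 2: the server stalls after each of the first K-1 windows.
    K = math.ceil(O / (W * S))        # assumed definition of K (see lead-in)
    stall = S / R + RTT - W * S / R   # idle time after one window
    return base + (K - 1) * stall
```

With S/R = 1 ms and RTT = 100 ms, a window of 4 segments falls into the second case, while a window of 200 segments falls into the first.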


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
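The closed-form expression translates directly into a few lines of Python. This is a sketch under the slide's assumptions (no threshold, no loss); the function and parameter names are hypothetical, and K and Q are taken as inputs rather than computed here.

```python
def slow_start_latency(O, S, R, RTT, K, Q):
    """Closed-form slow-start latency in seconds.

    O: object size in bits, S: segment size in bits, R: link rate in bits/s,
    RTT: round-trip time in seconds,
    K: number of windows that cover the object,
    Q: number of times the server would idle for an infinite-size object.
    """
    P = min(Q, K - 1)  # number of times TCP actually idles at the server
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
```

For example, with O/S = 15 segments, S/R = 1 s, RTT = 2 s, K = 4 and Q = 2, the call `slow_start_latency(15000, 1000, 1000, 2, 4, 2)` evaluates 4 + 15 + 2·3 - 3·1 = 22 seconds.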


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes. Labels: RTT, initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.


                                                                                                                  TCP Latency Modeling (3)

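$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= 2RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2RTT + \frac{O}{R} + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timing diagram with "time at client" and "time at server" axes. Labels: RTT, initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]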
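As a sanity check on this derivation, the delay can also be computed by adding up the idle time after each window directly. The sketch below is illustrative only: the function name is hypothetical, and it assumes the server can idle only after the first K-1 windows, since the transfer is finished once the last window has been sent.

```python
def slow_start_delay_by_summation(O, S, R, RTT, K):
    """Delay in seconds, summing per-window idle times.

    O: object size in bits, S: segment size in bits, R: link rate in bits/s,
    RTT: round-trip time in seconds, K: number of windows covering the object.
    """
    delay = 2 * RTT + O / R  # connection setup + request, then transmission
    for k in range(1, K):    # windows 1 .. K-1
        # Time until the first ACK returns, minus the time spent sending
        # the kth window of 2^(k-1) segments.
        idle = S / R + RTT - 2 ** (k - 1) * S / R
        delay += max(idle, 0.0)  # the [.]^+ operator: no negative idle time
    return delay
```

Clipping negative terms at zero plays the role of P = min{Q, K-1}: once a window takes longer to send than S/R + RTT, later windows contribute no idle time.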


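TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.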
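Both quantities can be found by simple iteration, which avoids floating-point logarithms. This is a minimal sketch with hypothetical function names. The rule used for Q is an assumption inferred from the idle-time expression S/R + RTT - 2^(k-1)·S/R: it counts the windows after which that expression is strictly positive, so a stall of exactly zero length is not counted.

```python
import math


def windows_to_cover(O, S):
    """K: smallest k such that 2^0 + 2^1 + ... + 2^(k-1) >= O/S."""
    segments = math.ceil(O / S)  # whole segments needed for the object
    k, sent = 0, 0
    while sent < segments:
        sent += 2 ** k           # the next window doubles in size
        k += 1
    return k


def idle_count_infinite_object(S, R, RTT):
    """Q: number of windows followed by a positive idle time, assuming
    the object never ends (assumed rule, see lead-in)."""
    q = 0
    # Idle time after window q+1 is S/R + RTT - 2^q * S/R.
    while S / R + RTT - 2 ** q * S / R > 0:
        q += 1
    return q
```

For O/S = 15 segments the loop sends 1 + 2 + 4 + 8 = 15 segments and returns K = 4, which agrees with the closed form ceil(log2(15 + 1)) = 4.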


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = $(M+1)O/R + (M+1)2RTT$ + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = $(M+1)O/R + 3RTT$ + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = $(M+1)O/R + (M/X + 1)2RTT$ + sum of idle times
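The three response-time formulas can be compared side by side with a short Python sketch. The function name, parameter names, and dictionary keys are hypothetical. As a simplification, the sum of idle times is passed in as a single optional value (default 0) and applied equally to all three schemes, which ignores the fact that slow-start stalls differ between them.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times in seconds for the three HTTP models.

    O: size of each object in bits, R: link rate in bits/s,
    RTT: round-trip time in seconds, M: number of images,
    X: number of parallel connections,
    idle: sum of idle times in seconds (simplifying assumption).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")

    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }
```

Taking 5 Kbytes as 40,000 bits, a 1 Mbps link with RTT = 100 msec, M = 10 and X = 5 gives about 2.64 s, 0.74 s and 1.04 s respectively, before any idle time is added.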


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (axis 0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (axis 0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay-bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                  • Chapter 3 Transport Layer last revised 160305
                                                                                                                  • Chapter 3 outline
                                                                                                                  • Transport services and protocols
                                                                                                                  • Transport vs network layer
                                                                                                                  • Transport-layer protocols
                                                                                                                  • Chapter 3 outline
                                                                                                                  • Multiplexingdemultiplexing
                                                                                                                  • Multiplexingdemultiplexing
                                                                                                                  • How demultiplexing works
                                                                                                                  • Connectionless demultiplexing
                                                                                                                  • Connectionless demux (cont)
                                                                                                                  • Connection-oriented demux
                                                                                                                  • Connection-oriented demux (cont)
                                                                                                                  • Connection-oriented demux Threaded Web Server
                                                                                                                  • Chapter 3 outline
                                                                                                                  • UDP User Datagram Protocol [RFC 768]
                                                                                                                  • UDP more
                                                                                                                  • UDP checksum
                                                                                                                  • Chapter 3 outline
                                                                                                                  • Principles of Reliable data transfer
                                                                                                                  • Reliable data transfer getting started
                                                                                                                  • Reliable data transfer getting started
                                                                                                                  • Incremental Improvements
                                                                                                                  • Rdt10 reliable transfer over a reliable channel
                                                                                                                  • Rdt20 channel with bit errors
                                                                                                                  • rdt20 FSM specification
                                                                                                                  • rdt20 operation with no errors
                                                                                                                  • rdt20 error scenario
                                                                                                                  • rdt20 has a fatal flaw
                                                                                                                  • rdt21 sender handles garbled ACKNAKs
                                                                                                                  • rdt21 receiver handles garbled ACKNAKs
                                                                                                                  • rdt21 discussion
                                                                                                                  • rdt22 a NAK-free protocol
                                                                                                                  • rdt22 sender receiver fragments
                                                                                                                  • rdt30 channels with errors and loss
                                                                                                                  • rdt30 sender
                                                                                                                  • rdt30 in action
                                                                                                                  • rdt30 in action
                                                                                                                  • Performance of rdt30
                                                                                                                  • rdt30 stop-and-wait operation
                                                                                                                  • Pipelined protocols
                                                                                                                  • Pipelined protocols
                                                                                                                  • Pipelining increased utilization
                                                                                                                  • Go-Back-N
                                                                                                                  • GBN Sender
                                                                                                                  • GBN sender extended FSM
                                                                                                                  • GBN receiver extended FSM
                                                                                                                  • More on receiver
                                                                                                                  • GBN inaction
                                                                                                                  • Selective Repeat
                                                                                                                  • Selective repeat sender receiver windows
                                                                                                                  • Selective repeat
                                                                                                                  • Selective repeat in action
                                                                                                                  • Selective repeat dilemma
                                                                                                                  • Chapter 3 outline
                                                                                                                  • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                  • More TCP Details
                                                                                                                  • Even More TCP Details
                                                                                                                  • TCP segment structure
                                                                                                                  • TCP seq rsquos and ACKs
                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                  • Example RTT estimation
                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                  • Chapter 3 outline
                                                                                                                  • TCP reliable data transfer
                                                                                                                  • TCP sender events
                                                                                                                  • TCP sender(simplified)
                                                                                                                  • TCP retransmission scenarios
                                                                                                                  • TCP retransmission scenarios (more)
                                                                                                                  • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                  • More on Sender Policies
                                                                                                                  • Fast Retransmit
                                                                                                                  • Fast retransmit algorithm
                                                                                                                  • TCP GBN or Selective Repeat
                                                                                                                  • Chapter 3 outline
                                                                                                                  • TCP Flow Control
                                                                                                                  • TCP Flow Control
                                                                                                                  • TCP segment structure
                                                                                                                  • TCP Flow control how it works
                                                                                                                  • Technical Issue
                                                                                                                  • Chapter 3 outline
                                                                                                                  • TCP Connection Management
                                                                                                                  • TCP Connection Management (cont)
                                                                                                                  • TCP Connection Management (cont)
                                                                                                                  • TCP Connection Management (cont)
                                                                                                                  • TCP Connection Management (cont)
                                                                                                                  • A few special cases
                                                                                                                  • Chapter 3 outline
                                                                                                                  • Principles of Congestion Control
                                                                                                                   • Causes/costs of congestion: scenario 1
                                                                                                                   • Causes/costs of congestion: scenario 2
                                                                                                                   • Causes/costs of congestion: scenario 3
                                                                                                                   • Causes/costs of congestion: scenario 3
                                                                                                                  • Approaches towards congestion control
                                                                                                                  • Case study ATM ABR congestion control
                                                                                                                  • Case study ATM ABR congestion control
                                                                                                                  • Chapter 3 outline
                                                                                                                  • TCP Congestion Control
                                                                                                                  • TCP AIMD
                                                                                                                  • TCP Slow Start
                                                                                                                  • TCP Slow Start (more)
                                                                                                                  • Summary TCP Congestion Control
                                                                                                                  • The Big Picture
                                                                                                                  • TCP sender congestion control
                                                                                                                  • TCP throughput
                                                                                                                  • TCP Futures
                                                                                                                  • TCP Fairness
                                                                                                                  • Why is TCP fair
                                                                                                                  • Fairness (more)
                                                                                                                  • TCP Latency Modeling
                                                                                                                  • Fixed Congestion Window (W)
                                                                                                                  • Fixed congestion window (1)
                                                                                                                  • Fixed congestion window (2)
                                                                                                                  • TCP Latency Modeling Slow Start (1)
                                                                                                                  • TCP Latency Modeling Slow Start (2)
                                                                                                                  • TCP Latency Modeling (3)
                                                                                                                  • TCP Latency Modeling (4)
                                                                                                                  • HTTP Modeling
                                                                                                                  • Chapter 3 Summary


TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)

point-to-point: one sender, one receiver

reliable, in-order byte stream: no "message boundaries"

pipelined: TCP congestion and flow control set window size

send & receive buffers

full duplex data: bi-directional data flow in same connection; MSS: maximum segment size

connection-oriented: handshaking (exchange of control msgs) initializes sender and receiver state before data exchange

flow controlled: sender will not overwhelm receiver

[Figure: the sending application writes data through its socket "door" into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads data through its own socket "door".]


More TCP Details

Maximum Segment Size (MSS)
  the maximum amount of application-layer data in a segment
  depends on the implementation (can often be set)
  application data + TCP header = TCP segment

Three-way handshake
  client sends a special TCP segment to the server requesting a connection; no payload (application data) in this segment
  server responds with a second special TCP segment (again, no payload)
  client responds with a third special segment; this one can contain payload
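The MSS relation above can be illustrated with a minimal sketch. It assumes a 20-byte TCP header without options and an MSS of 1460 bytes; both values, and the helper name `segment_payloads`, are illustrative assumptions rather than anything stated on the slide.

```python
TCP_HEADER_LEN = 20  # bytes; assumes a header with no options


def segment_payloads(data: bytes, mss: int) -> list:
    """Split application data into chunks of at most mss bytes each."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [data[i:i + mss] for i in range(0, len(data), mss)]


# 3000 bytes of application data, hypothetical MSS of 1460 bytes
chunks = segment_payloads(b"x" * 3000, mss=1460)
payload_sizes = [len(c) for c in chunks]
# application data + TCP header = TCP segment
segment_sizes = [n + TCP_HEADER_LEN for n in payload_sizes]
print(payload_sizes, segment_sizes)
```

Note that the MSS bounds only the application data; the header is added on top of it.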


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP exists only in the two end machines: no buffers or variables are allocated to the connection in any of the network elements between the client and the server.


TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | urg data pointer
  options (variable length)
  application data (variable length)

Field notes
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
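To make the layout above concrete, here is a minimal sketch that packs and unpacks the fixed 20-byte part of a TCP header with Python's standard `struct` module. The helper names `build_header` and `parse_header` are assumptions made for illustration, options are omitted, and the checksum is left at 0 rather than computed.

```python
import struct

# Fixed 20-byte header, network byte order: src port, dst port, seq, ack,
# header-length byte, flags byte, receive window, checksum, urgent pointer.
HEADER = struct.Struct("!HHIIBBHHH")
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def build_header(src, dst, seq, ack, flags, window, checksum=0, urg_ptr=0):
    flag_byte = 0
    for name in flags:
        flag_byte |= FLAG_BITS[name]
    # Header length is counted in 32-bit words and sits in the upper 4 bits.
    head_len = (HEADER.size // 4) << 4
    return HEADER.pack(src, dst, seq, ack, head_len, flag_byte,
                       window, checksum, urg_ptr)


def parse_header(raw):
    (src, dst, seq, ack, head_len, flag_byte,
     window, checksum, urg_ptr) = HEADER.unpack(raw[:HEADER.size])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": (head_len >> 4) * 4,  # back to bytes
        "flags": sorted(n for n, bit in FLAG_BITS.items() if flag_byte & bit),
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
    }


raw = build_header(12345, 23, seq=42, ack=79, flags=["ACK", "PSH"], window=4096)
fields = parse_header(raw)
print(len(raw), fields)
```

The port numbers and window size in the usage lines are arbitrary example values.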


TCP seq. #'s and ACKs

Seq. #'s: byte stream "number" of first byte in segment's data

ACKs: seq # of next byte expected from other side; cumulative ACK

Q: how does the receiver handle out-of-order segments?

A: TCP spec doesn't say; up to implementer

[Figure: simple telnet scenario, timeline between Host A and Host B]
  A -> B: Seq=42, ACK=79, data = 'C' (user types 'C')
  B -> A: Seq=79, ACK=43, data = 'C' (host ACKs receipt of 'C', echoes back 'C')
  A -> B: Seq=43, ACK=80 (host ACKs receipt of echoed 'C')
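The telnet exchange above can be replayed with a small sketch that tracks, for each side, the next byte it will send and the next byte it expects. The `Endpoint` class is a hypothetical helper for illustration only; it ignores loss, reordering, and sequence-number wraparound.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    next_seq: int  # sequence number of the next byte this side will send
    next_ack: int  # next byte this side expects from the other side

    def send(self, data: bytes = b""):
        segment = (self.name, self.next_seq, self.next_ack, data)
        self.next_seq += len(data)  # sequence numbers count bytes, not segments
        return segment

    def receive(self, segment):
        _, seq, _, data = segment
        if seq == self.next_ack:  # in-order data advances the cumulative ACK
            self.next_ack += len(data)


a = Endpoint("A", next_seq=42, next_ack=79)
b = Endpoint("B", next_seq=79, next_ack=42)

trace = []
s1 = a.send(b"C"); b.receive(s1); trace.append(s1)  # user types 'C'
s2 = b.send(b"C"); a.receive(s2); trace.append(s2)  # B ACKs 'C' and echoes it
s3 = a.send();     b.receive(s3); trace.append(s3)  # A ACKs the echoed 'C'

for name, seq, ack, data in trace:
    print(f"{name}: Seq={seq} ACK={ack} data={data!r}")
```

The pure ACK in the last step carries no data, so it does not consume a sequence number.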


TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
  longer than RTT, but RTT varies
  too short: premature timeout, unnecessary retransmissions
  too long: slow reaction to segment loss

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother": average several recent measurements, not just current SampleRTT


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) × EstimatedRTT + α × SampleRTT

Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125
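As a minimal sketch of the moving average above, the following applies the update to a few made-up SampleRTT values (in milliseconds). The starting estimate and the samples are invented for illustration.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """One EWMA step: EstimatedRTT = (1 - alpha)*EstimatedRTT + alpha*SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


estimated = 100.0                      # hypothetical starting estimate, ms
samples = [120.0, 90.0, 300.0, 110.0]  # hypothetical SampleRTT values, ms
history = []
for sample in samples:
    estimated = update_estimated_rtt(estimated, sample)
    history.append(estimated)
    print(f"SampleRTT={sample:6.1f}  EstimatedRTT={estimated:8.3f}")

# Weight that a sample k updates in the past still carries in the estimate.
weights = [0.125 * (1 - 0.125) ** k for k in range(4)]
```

The single 300 ms outlier moves the estimate only part of the way, which is the smoothing the slide asks for, and the `weights` list shows each older sample counting for 7/8 of the one after it.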


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), roughly 100 to 350; curves: SampleRTT and EstimatedRTT.]


TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
  large variation in EstimatedRTT -> larger safety margin

First, estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) × DevRTT + β × |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set the timeout interval:

TimeoutInterval = EstimatedRTT + 4 × DevRTT
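The two updates and the timeout formula can be combined in a short sketch. The slide does not say how the estimates are seeded or in which order the two updates run, so the choices below (seed EstimatedRTT with the first sample, seed DevRTT with half of it, update DevRTT before EstimatedRTT) are assumptions, and the class name `TimeoutEstimator` is hypothetical.

```python
class TimeoutEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed values are not given on the slide.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        # Assumption: DevRTT is updated first, using the old EstimatedRTT.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = TimeoutEstimator(first_sample=100.0)   # hypothetical samples, ms
timeouts = [estimator.update(s) for s in [120.0, 95.0, 250.0, 105.0]]
print([round(t, 2) for t in timeouts])
```

A jump such as the 250 ms sample raises DevRTT sharply, so the safety margin widens faster than EstimatedRTT itself.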


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
  pipelined segments
  cumulative acks
  TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  create segment with seq #; seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unacked segments:
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

  NextSeqNum = InitialSeqNum
  SendBase = InitialSeqNum

  loop (forever) {
    switch (event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
  } /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte

Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
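The pseudocode above can be turned into a small runnable sketch. This is one possible rendering under simplifying assumptions, not a real TCP implementation: events are delivered by calling methods directly, the timer is modeled as a boolean flag, "pass segment to IP" appends to a log, and sequence-number wraparound is ignored. The class and attribute names are hypothetical.

```python
class SimplifiedTcpSender:
    def __init__(self, initial_seq):
        self.next_seq_num = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # seq -> data for not-yet-acknowledged segments
        self.timer_running = False  # stands in for the single retransmission timer
        self.sent = []              # log of (seq, length) "passed to IP"

    def _transmit(self, seq):
        self.sent.append((seq, len(self.unacked[seq])))

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True      # start timer
        self._transmit(seq)                # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            # Retransmit only the unacked segment with the smallest seq number.
            self._transmit(min(self.unacked))
            self.timer_running = True      # restart timer

    def on_ack(self, y):
        if y > self.send_base:             # ACK covers new data
            self.send_base = y
            fully_acked = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in fully_acked:
                del self.unacked[seq]
            # Keep the timer only if something is still outstanding.
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq=92)
sender.on_app_data(b"a" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"b" * 20)   # Seq=100, 20 bytes
sender.on_timeout()             # oldest unacked segment (Seq=92) is resent
sender.on_ack(120)              # cumulative ACK covers both segments
print(sender.sent, sender.send_base, sender.timer_running)
```

In the usage lines, the single cumulative ACK of 120 clears both outstanding segments and stops the timer, even though no ACK of 100 was ever processed.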


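TCP retransmission scenarios

[Figure: two timelines between Host A and Host B]

Lost ACK scenario
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100 (lost, marked X)
  A: timeout for Seq=92; A retransmits Seq=92, 8 bytes data
  B -> A: ACK=100
  A: SendBase = 100

Premature timeout scenario
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100, then ACK=120
  A: timeout for Seq=92 fires before the ACKs arrive; A retransmits Seq=92, 8 bytes data
  A: SendBase = 100 when ACK=100 arrives, then SendBase = 120 when ACK=120 arrives
  B -> A: ACK=120 (in response to the retransmitted segment)
  A: SendBase = 120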


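TCP retransmission scenarios (more)

[Figure: timeline between Host A and Host B]

Cumulative ACK scenario
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100 (lost, marked X)
  B -> A: ACK=120 (arrives before the timeout)
  A: SendBase = 120; no retransmission needed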

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: Arrival of segment that partially or completely fills gap.
TCP receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
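The four rows of the ACK-generation table can be sketched as a small decision procedure. This is only an illustration under stated assumptions: the `Receiver` class, its field names, and the returned action strings are hypothetical, the 500 ms delayed-ACK timer itself is not modelled, and re-ACKing an already-received segment is an assumption that the table does not cover.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiver state for illustrating the ACK table."""
    expected_seq: int                 # next in-order byte the receiver waits for
    ack_pending: bool = False         # True while a delayed ACK is outstanding
    buffered: dict = field(default_factory=dict)  # out-of-order data: seq -> length

    def on_segment(self, seq, length):
        """Return (action, ack_number) for an arriving segment."""
        if seq > self.expected_seq:
            # Row 3: out-of-order segment, gap detected.
            self.buffered[seq] = length
            return ("duplicate ACK", self.expected_seq)
        if seq < self.expected_seq:
            # Not in the table: assume old data is simply re-ACKed.
            return ("duplicate ACK", self.expected_seq)

        # Segment starts exactly at the expected byte.
        filled_gap = bool(self.buffered)
        self.expected_seq += length
        # Absorb any buffered segments that are now contiguous.
        while self.expected_seq in self.buffered:
            self.expected_seq += self.buffered.pop(self.expected_seq)

        if filled_gap:
            # Row 4: segment starts at the lower end of the gap.
            self.ack_pending = False
            return ("immediate ACK", self.expected_seq)
        if self.ack_pending:
            # Row 2: one other in-order segment already has an ACK pending.
            self.ack_pending = False
            return ("immediate cumulative ACK", self.expected_seq)
        # Row 1: everything before this segment was already ACKed.
        self.ack_pending = True
        return ("delayed ACK", self.expected_seq)
```

For example, starting from `Receiver(expected_seq=92)`, segments (92, 8) and (100, 20) produce a delayed ACK and then a single cumulative ACK for byte 120.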

More on Sender Policies

Doubling the Timeout Interval
Used by most TCP implementations
If a timeout occurs, then after retransmission the Timeout Interval is doubled
Intervals grow exponentially with each consecutive timeout
When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
Limited form of congestion control
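A minimal sketch of the doubling policy described above. The `RetransmitTimer` class and its method names are hypothetical, times are in milliseconds, and the base interval is assumed to be the usual EstimatedRTT + 4 * DevRTT formula, which this slide only refers to as "described previously".

```python
class RetransmitTimer:
    """Hypothetical timer that doubles its interval on each timeout."""

    def __init__(self, estimated_rtt_ms, dev_rtt_ms):
        self.estimated_rtt_ms = estimated_rtt_ms
        self.dev_rtt_ms = dev_rtt_ms
        self.interval_ms = self.base_interval()

    def base_interval(self):
        # Assumed formula: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt_ms + 4 * self.dev_rtt_ms

    def on_timeout(self):
        # After the retransmission, the interval is doubled.
        self.interval_ms *= 2
        return self.interval_ms

    def on_restart(self):
        # New data from above or an ACK received: recompute from the RTT estimates.
        self.interval_ms = self.base_interval()
        return self.interval_ms
```

With EstimatedRTT = 100 ms and DevRTT = 25 ms the interval starts at 200 ms, grows to 400 ms and 800 ms after two consecutive timeouts, and returns to 200 ms on the next restart.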

Fast Retransmit

Time-out period often relatively long:
    long delay before resending lost packet
Detect lost segments via duplicate ACKs:
    Sender often sends many segments back-to-back
    If a segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
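The pseudocode above translates fairly directly into runnable form. The sketch below makes some assumptions: the `Sender` class, `next_seq_num`, and the `resent` list are hypothetical names, the timer is reduced to a boolean, and a single duplicate-ACK counter that resets on every new ACK stands in for the per-y count.

```python
class Sender:
    """Hypothetical sender state for the fast retransmit event handler."""

    def __init__(self, send_base, next_seq_num):
        self.send_base = send_base        # oldest unacknowledged byte
        self.next_seq_num = next_seq_num  # next byte that has not been sent yet
        self.dup_ack_count = 0
        self.timer_running = send_base < next_seq_num
        self.resent = []                  # sequence numbers fast-retransmitted

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: slide SendBase forward.
            self.send_base = y
            self.dup_ack_count = 0
            # Restart the timer only if unacknowledged segments remain.
            self.timer_running = self.send_base < self.next_seq_num
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_ack_count += 1
            if self.dup_ack_count == 3:
                # Fast retransmit: resend the segment starting at byte y.
                self.resent.append(y)
```

A fourth or later duplicate ACK does not trigger another retransmission here, because the comparison is an equality test against 3, as in the pseudocode.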

TCP: GBN or Selective Repeat?

                                                                                                                    Basic TCP looks a lot like GBN

                                                                                                                    Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                    This looks a lot like Selective Repeat

                                                                                                                    TCP is a hybrid


                                                                                                                    TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

Header layout (each row is 32 bits wide):
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pointer
    Options (variable length)
    application data (variable length)

Field notes:
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
    sequence and acknowledgement numbers: counting by bytes of data (not segments)
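To make the layout concrete, here is a small sketch that packs and unpacks the fixed 20-byte part of the header with Python's `struct` module. The function names and the returned dictionary keys are hypothetical, options are omitted (so the header length is always 5 words), and the checksum is left at zero rather than computed.

```python
import struct

# Flag bit positions within the flags byte (low six bits).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

# Network byte order: two 16-bit ports, 32-bit seq, 32-bit ack,
# header-length byte, flags byte, 16-bit window, checksum, urgent pointer.
HEADER_FORMAT = "!HHIIBBHHH"


def pack_header(src_port, dst_port, seq, ack, flags, rcv_window):
    """Build a 20-byte header with no options and a zero checksum."""
    head_len_words = 5                    # 5 x 32-bit words = 20 bytes
    offset_byte = head_len_words << 4     # header length sits in the top 4 bits
    flag_byte = 0
    for name in flags:
        flag_byte |= FLAG_BITS[name]
    return struct.pack(HEADER_FORMAT, src_port, dst_port, seq, ack,
                       offset_byte, flag_byte, rcv_window, 0, 0)


def unpack_header(data):
    """Decode the first 20 bytes of a segment into a dictionary."""
    (src_port, dst_port, seq, ack, offset_byte,
     flag_byte, rcv_window, checksum, urg_ptr) = struct.unpack(
        HEADER_FORMAT, data[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "head_len_bytes": (offset_byte >> 4) * 4,
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "rcv_window": rcv_window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }
```

Packing a segment with `flags=["SYN", "ACK"]` and unpacking it again returns the same ports, sequence numbers, and receive window, with a header length of 20 bytes.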

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)
spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
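The window arithmetic above can be checked with a few lines of code. This is a sketch only: the function names are hypothetical, and `last_byte_sent` and `last_byte_acked` are assumed sender-side counters that this slide does not name.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(last_byte_sent, last_byte_acked, nbytes, window):
    """True if sending nbytes more keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window
```

For instance, a 4096-byte buffer holding bytes 1001 to 3000 unread (LastByteRcvd = 3000, LastByteRead = 1000) leaves a window of 2096 bytes; a sender with 1000 unACKed bytes may send another 1000 but not another 1200.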

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
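A toy simulation of the probing idea, under simplifying assumptions: the function name and parameters are hypothetical, one probe is sent per step, the probe byte itself is not added to the buffer, and every ACK for a probe carries the receiver's current RcvWindow.

```python
def probe_until_window_opens(rcv_buffer, buffered, app_reads):
    """Return the RcvWindow values the sender learns, one per probe.

    app_reads lists how many bytes the receiving app drains before
    each probe arrives; probing stops once a non-zero window is seen.
    """
    windows_seen = []
    for read in app_reads:
        buffered = max(0, buffered - read)   # app drains the receive buffer
        window = rcv_buffer - buffered       # spare room right now
        windows_seen.append(window)          # ACK for the probe reports it
        if window > 0:
            break                            # sender can resume normal sending
    return windows_seen
```

With a full 1000-byte buffer and an app that reads nothing for two steps and then 400 bytes, the sender sees windows 0, 0, 400. Without the probes it would never learn that the window had opened.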


                                                                                                                    Note on UDP

                                                                                                                    UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
    Allocates buffers
    Can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals synchronization

Segments exchanged:
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
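The sequence and acknowledgement numbers in the three handshake segments can be written out as a short sketch. The function name and the dictionary representation of a segment are hypothetical; only the SYN, seq, and ack values come from the steps above.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as simple dictionaries."""
    # Step 1: client -> server, connection request.
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    # Step 2: server -> client, connection granted; ACKs the client's SYN.
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": syn["seq"] + 1}
    # Step 3: client -> server, ACKs the server's SYN; may carry data.
    ack = {"from": "client", "SYN": 0, "seq": client_isn + 1,
           "ack": synack["seq"] + 1}
    return [syn, synack, ack]
```

With `client_isn=1000` and `server_isn=5000`, the SYNACK carries ack=1001 and the final ACK carries seq=1001, ack=5001.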

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client: close, sends FIN. Server: sends ACK, close, sends FIN. Client: sends ACK, timed wait, closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
    Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
    Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline. Client: closing, sends FIN. Server: sends ACK, closing, sends FIN. Client: sends ACK, timed wait, closed. Server: closed.]
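The client side of the four closing steps can be sketched as a small state table. The state names follow the standard TCP state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT), which do not appear on these slides; the event names, the table, and `run_client` are hypothetical, and the simultaneous-FIN case is not handled.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "FIN"),   # Step 1
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),      # server ACKed the FIN
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "ACK"),      # Step 3
    ("TIME_WAIT", "recv_FIN"): ("TIME_WAIT", "ACK"),       # FIN resent: ACK again
    ("TIME_WAIT", "timer_expired"): ("CLOSED", None),      # end of timed wait
}


def run_client(events, state="ESTABLISHED"):
    """Feed events through the table; return final state and segments sent."""
    sent = []
    for event in events:
        state, segment = CLIENT_TRANSITIONS[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent
```

A second `recv_FIN` during the timed wait models the case where the client's ACK was lost and the server retransmits its FIN; the client answers with another ACK and stays in the timed wait.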

TCP Connection Management (cont.)

[Figure: Example TCP server lifecycle]

[Figure: Example TCP client lifecycle]


                                                                                                                    A few special cases

                                                                                                                    Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                    It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                                                                    Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)

                                                                                                                    a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$, where $\lambda'_{in}$ is the offered load (original data plus retransmissions)
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions, link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]

                                                                                                                    3 Transport Layer 97Comp 361 Spring 2005

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when packet dropped, any upstream transmission capacity used for that packet was wasted!


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
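The ER and EFCI rules above can be illustrated with a small sketch. This is not an ATM implementation: the `RMCell` class, the function names and the rate values are all hypothetical, and each switch is reduced to a single "supportable rate" number.

```python
from dataclasses import dataclass

@dataclass
class RMCell:
    er: float          # explicit rate field (units are an assumption, e.g. cells/sec)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit

def forward_through_switches(cell, supportable_rates):
    """Each switch may lower ER, so ER ends up as the minimum rate on the path."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell

def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell preceding the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell

cell = RMCell(er=100.0)
cell = forward_through_switches(cell, [80.0, 35.0, 60.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)
```

The sender would then read `cell.er` as the rate the most constrained switch can support, and `cell.ci` as a signal to back off.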



TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot MSS}{RTT}$ bytes/sec
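As a quick numeric check of the formula above, here is a minimal sketch. The function name and the sample values (one 500-byte segment per 200 ms RTT) are illustrative choices.

```python
def window_throughput(w, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of mss_bytes each are sent per RTT."""
    return w * mss_bytes / rtt_seconds

rate = window_throughput(w=1, mss_bytes=500, rtt_seconds=0.2)
print(f"{rate:.0f} bytes/sec = {rate * 8 / 1000:.0f} kbps")
```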


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
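The sending constraint LastByteSent - LastByteAcked ≤ CongWin can be read as "how many more bytes may be put in flight right now". A minimal sketch follows; the function name and byte counts are hypothetical, and the receive window is ignored as the slide assumes.

```python
def usable_window(last_byte_sent, last_byte_acked, cong_win):
    """Bytes the sender may still transmit without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)

print(usable_window(last_byte_sent=3000, last_byte_acked=1000, cong_win=4000))
print(usable_window(last_byte_sent=5000, last_byte_acked=1000, cong_win=4000))
```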


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing (also known as congestion avoidance)

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
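The sawtooth that AIMD produces can be sketched in a few lines. This is a toy model under stated assumptions: the window is counted in units of MSS, it is updated once per RTT, and the RTTs at which loss occurs (`loss_rounds`) are supplied by hand rather than derived from a network model.

```python
MSS = 1  # work in units of MSS (an assumption for readability)

def aimd_trace(initial_window, loss_rounds, rounds):
    """Return CongWin after each RTT: +1 MSS per RTT, halved on a loss round."""
    cong_win = initial_window
    trace = []
    for rtt in range(rounds):
        if rtt in loss_rounds:
            cong_win = max(MSS, cong_win / 2)   # multiplicative decrease
        else:
            cong_win += MSS                     # additive increase
        trace.append(cong_win)
    return trace

trace = aimd_trace(initial_window=8, loss_rounds={4, 9}, rounds=12)
print(trace)
```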


TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments in successive RTTs]
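To see why a per-ACK increment doubles the window every RTT, consider this sketch. It assumes one ACK per segment, no loss and no threshold, and counts the window in segments; the function name is hypothetical.

```python
def slow_start_rounds(rounds, mss=1):
    """CongWin at the start of each RTT when every ACK adds one MSS."""
    cong_win = mss
    windows = []
    for _ in range(rounds):
        windows.append(cong_win)
        acks = cong_win // mss       # one ACK per segment sent this RTT
        for _ in range(acks):
            cong_win += mss          # per-ACK increment
    return windows

print(slow_start_rounds(4))
```

Each RTT, a window of n segments yields n ACKs and therefore n increments, so the next window is 2n.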


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start


Reason for treating 3 dup ACKs differently than timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                    The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
  State: Slow Start (SS)
  TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
  Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
  State: Congestion Avoidance (CA)
  TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
  Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
  State: SS or CA
  TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
  Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
  State: SS or CA
  TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
  Commentary: Enter slow start

Event: Duplicate ACK
  State: SS or CA
  TCP Sender Action: Increment duplicate ACK count for segment being acked
  Commentary: CongWin and Threshold not changed
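The event/state/action rules above map naturally onto a small state machine. The sketch below is a simplified model, not a real TCP stack. The class name, the MSS value, the initial threshold and the choice to reset the duplicate-ACK counter on every new ACK are assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; an illustrative value, not taken from the slides

@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64 * 1024   # "initially very large"; the value is an assumption
    state: str = "SS"              # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0          # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)   # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0

s = RenoSender()
for _ in range(3):
    s.on_new_ack()            # slow start: 1 MSS -> 4 MSS
for _ in range(3):
    s.on_duplicate_ack()      # triple duplicate ACK: halve, go to CA
print(s.state, s.cong_win / MSS, s.threshold / MSS)
s.on_timeout()                # timeout: back to 1 MSS and slow start
print(s.state, s.cong_win / MSS, s.threshold / MSS)
```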


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
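The 0.75 factor comes from averaging a window that climbs linearly from W/2 back to W. The following sketch checks this numerically, assuming an even W counted in segments and one window per RTT; the function name and sample values are hypothetical.

```python
def average_sawtooth_throughput(w_segments, rtt_seconds):
    """Mean of window/RTT over one AIMD cycle climbing from W/2 to W by 1 per RTT."""
    half = w_segments // 2
    windows = range(half, w_segments + 1)      # W/2, W/2 + 1, ..., W
    return sum(windows) / len(windows) / rtt_seconds

W, RTT = 20, 0.1
avg = average_sawtooth_throughput(W, RTT)
print(avg, 0.75 * W / RTT)   # both in segments per second
```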


TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

→ $L = 2 \cdot 10^{-10}$  Wow
New versions of TCP for high-speed needed!
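The two numbers in this example can be reproduced directly. The sketch assumes the throughput relation throughput = 1.22 · MSS / (RTT · √L) solved for L, with MSS expressed in bits; the variable names are illustrative.

```python
MSS_BITS = 1500 * 8      # 1500-byte segments
RTT = 0.100              # seconds
TARGET_BPS = 10e9        # 10 Gbps

# Window needed to keep TARGET_BPS in flight for one RTT
window_segments = TARGET_BPS * RTT / MSS_BITS

# Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L
loss_rate = (1.22 * MSS_BITS / (RTT * TARGET_BPS)) ** 2

print(f"W = {window_segments:,.0f} segments")
print(f"L = {loss_rate:.1e}")
```

The loss rate comes out at roughly 2 · 10^-10, that is, about one loss per five billion segments.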


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
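The convergence argument can be checked with a toy simulation. It assumes two flows with equal RTTs that both see a loss whenever their combined rate exceeds the link capacity; the function name, starting rates and round count are hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Both flows add 1 per round; when their sum exceeds capacity, both halve."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2     # multiplicative decrease on shared loss
        else:
            x1, x2 = x1 + 1, x2 + 1     # additive increase, slope 1
    return x1, x2

x1, x2 = two_flow_aimd(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(abs(x1 - x2))
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts the gap in half, so the rates drift toward the equal-share line.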


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections;
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
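The parallel-connection example is simple per-connection sharing. This sketch assumes every TCP connection on the link receives an equal share; the function name is hypothetical.

```python
from fractions import Fraction

def app_share(existing_connections, new_connections):
    """Fraction of link rate R the new app gets under equal per-connection shares."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)

print(app_share(9, 1))     # share with one connection
print(app_share(9, 11))    # share with eleven connections
```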


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
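Both cases can be folded into one function. The sketch assumes K is the number of windows that cover the object, computed here as ceil(O / (W·S)); that definition and the sample values are assumptions, not given on this slide.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency for object of O bits, segments of S bits, link rate R, fixed window W."""
    K = math.ceil(O / (W * S))           # windows covering the object (assumed definition)
    stall = S / R + RTT - W * S / R      # idle time per window, if positive
    latency = 2 * RTT + O / R
    if stall > 0:                        # case 2: server waits for ACKs
        latency += (K - 1) * stall
    return latency

# Illustrative values: 8 segments of 1000 bits over a 1000 bps link, RTT = 2 s
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2.0, W=4))   # case 1
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2.0, W=2))   # case 2
```

With W = 4 the window takes longer to send than the ACK takes to return, so only 2RTT + O/R remains. With W = 2 the server stalls after each of the first K-1 windows.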


Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case:
WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
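One way to sanity-check the closed form is to add up the server's idle periods directly and compare. The sketch below assumes window k carries 2^(k-1) segments and that the idle time after window k is S/R + RTT - 2^(k-1)·S/R when positive, which is the standard derivation for this model. The function name and the values of S, R and RTT are hypothetical, chosen so that O/S = 15, K = 4 and P = 2.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, P, latency by summing idle periods, latency by closed form)."""
    # K: smallest number of windows (1, 2, 4, ... segments) that cover the object
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Idle time after window k, for k = 1 .. K-1 (no idling after the last window)
    stalls = [S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, K)]
    P = sum(1 for t in stalls if t > 0)          # number of times the server idles
    direct = 2 * RTT + O / R + sum(t for t in stalls if t > 0)
    closed_form = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, P, direct, closed_form

K, P, direct, closed_form = slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0)
print(K, P, direct, closed_form)
```

Because the idle periods shrink as the window doubles, the positive ones are always the first P, which is why the direct sum and the closed form agree.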


                                                                                                                    TCP Latency Modeling Slow Start (2)

[Figure: slow-start timing diagram with "time at client" and "time at server" axes: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
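
The example can be replayed window by window. The sketch below is an illustration, not part of the original slides: the function name and the choice S/R = 1, RTT = 2 (arbitrary time units) are assumptions, picked so that the server stalls after the first two windows only, as in the figure.

```python
def simulate_slow_start(segments, s_over_r, rtt):
    """Replay slow start for one object.

    Returns (windows_sent, idle_count, total_idle_time).
    The window doubles each round: 1, 2, 4, 8, ... segments.
    """
    remaining = segments
    window = 1            # slow start begins with a single segment
    windows_sent = 0
    idle_count = 0
    idle_time = 0.0
    while remaining > 0:
        burst = min(window, remaining)
        remaining -= burst
        windows_sent += 1
        if remaining > 0:
            # The first ACK of this window returns S/R + RTT after the window
            # starts; sending the window itself takes burst * S/R.
            stall = s_over_r + rtt - burst * s_over_r
            if stall > 0:
                idle_count += 1
                idle_time += stall
        window *= 2
    return windows_sent, idle_count, idle_time


if __name__ == "__main__":
    # O/S = 15 segments; S/R = 1 and RTT = 2 are assumed values.
    k, p, idle = simulate_slow_start(15, 1, 2)
    print(f"K = {k} windows, server idles P = {p} times, idle time = {idle}")
```

With these assumed values the replay reports K = 4 windows and P = 2 idle periods, matching the example.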

                                                                                                                    TCP Latency Modeling (3)

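\frac{S}{R} + RTT = time from when the server starts to send a segment until the server receives the acknowledgement

2^{k-1} \frac{S}{R} = time to transmit the kth window

\left[ \frac{S}{R} + RTT - 2^{k-1} \frac{S}{R} \right]^{+} = idle time after the kth window

delay = \frac{O}{R} + 2 RTT + \sum_{p=1}^{P} idleTime_p
      = \frac{O}{R} + 2 RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1} \frac{S}{R} \right]
      = \frac{O}{R} + 2 RTT + P \left[ RTT + \frac{S}{R} \right] - (2^P - 1) \frac{S}{R}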
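
The last step of the derivation collapses the sum of per-window idle times into a closed form. A minimal numerical check, with function names that are illustrative assumptions rather than anything from the slides, could look like this. It takes P as given and does not clamp negative terms, so it is meaningful only when P does not exceed Q.

```python
def latency_by_summing_idle_times(o_over_r, s_over_r, rtt, p):
    """2 RTT + O/R + sum over k = 1..P of (S/R + RTT - 2^(k-1) S/R)."""
    idle = sum(s_over_r + rtt - 2 ** (k - 1) * s_over_r
               for k in range(1, p + 1))
    return 2 * rtt + o_over_r + idle


def latency_closed_form(o_over_r, s_over_r, rtt, p):
    """2 RTT + O/R + P (RTT + S/R) - (2^P - 1) S/R."""
    return 2 * rtt + o_over_r + p * (rtt + s_over_r) - (2 ** p - 1) * s_over_r


if __name__ == "__main__":
    # Assumed values in arbitrary time units: O/R = 15, S/R = 1, RTT = 2, P = 2.
    print(latency_by_summing_idle_times(15, 1, 2, 2))
    print(latency_closed_form(15, 1, 2, 2))
```

Both functions agree because the geometric sum of 2^(k-1) for k = 1..P equals 2^P - 1.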

[Figure: slow-start timing diagram, repeated from "TCP Latency Modeling Slow Start (2)"]

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

K = \min\{ k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O \}
  = \min\{ k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S \}
  = \min\{ k : 2^k - 1 \ge O/S \}
  = \min\{ k : k \ge \log_2(O/S + 1) \}
  = \lceil \log_2(O/S + 1) \rceil

Calculation of Q, the number of idles for an infinite-size object, is similar
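
The closed form for K can be compared with a direct search over window counts. This is a sketch under assumptions: the function names and the 1000-bit segment size are made up for illustration.

```python
import math


def windows_closed_form(object_bits, segment_bits):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


def windows_brute_force(object_bits, segment_bits):
    """Smallest k such that 2^0 S + 2^1 S + ... + 2^(k-1) S >= O."""
    k, covered = 0, 0
    while covered < object_bits:
        covered += 2 ** k * segment_bits
        k += 1
    return k


if __name__ == "__main__":
    S = 1000  # assumed segment size in bits
    for segments in (1, 7, 8, 15, 16):
        O = segments * S
        print(segments, windows_closed_form(O, S), windows_brute_force(O, S))
```

For O/S = 15 both functions return 4, the K used in the slow-start example.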

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
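
The three response-time formulas translate directly into code. The sketch below is illustrative only: the function names are assumptions, the slow-start idle times are passed in as a single number (default 0) rather than computed, and 1 Kbyte is taken to be 1000 bytes.

```python
def nonpersistent(o_bits, rate_bps, rtt_s, m, idle_s=0.0):
    """(M+1) O/R + (M+1) 2 RTT + sum of idle times."""
    return (m + 1) * o_bits / rate_bps + (m + 1) * 2 * rtt_s + idle_s


def persistent(o_bits, rate_bps, rtt_s, m, idle_s=0.0):
    """(M+1) O/R + 3 RTT + sum of idle times."""
    return (m + 1) * o_bits / rate_bps + 3 * rtt_s + idle_s


def parallel_nonpersistent(o_bits, rate_bps, rtt_s, m, x, idle_s=0.0):
    """(M+1) O/R + (M/X + 1) 2 RTT + sum of idle times."""
    if m % x != 0:
        raise ValueError("the formula assumes M/X is an integer")
    return (m + 1) * o_bits / rate_bps + (m // x + 1) * 2 * rtt_s + idle_s


if __name__ == "__main__":
    O = 5 * 1000 * 8   # 5 Kbytes in bits (assuming 1 Kbyte = 1000 bytes)
    RTT, M, X = 0.1, 10, 5
    for rate in (28e3, 100e3, 1e6, 10e6):
        print(rate,
              round(nonpersistent(O, rate, RTT, M), 3),
              round(persistent(O, rate, RTT, M), 3),
              round(parallel_nonpersistent(O, rate, RTT, M, X), 3))
```

Because idle times are left at zero here, the printed values are lower bounds on the modeled response times, not a reproduction of any plotted data.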

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

                                                                                                                    HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.
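
One way to see why the delay·bandwidth product matters is to compare it with the object size. The sketch below is an added illustration, not from the slides: the function names are assumptions and 1 Kbyte is taken to be 1000 bytes.

```python
def bandwidth_delay_product_bits(rate_bps, rtt_s):
    """Bits that fit "in the pipe" during one round-trip time."""
    return rate_bps * rtt_s


def objects_per_rtt(rate_bps, rtt_s, object_bits):
    """How many objects of the given size could be sent in one RTT."""
    return bandwidth_delay_product_bits(rate_bps, rtt_s) / object_bits


if __name__ == "__main__":
    OBJECT_BITS = 5 * 1000 * 8   # O = 5 Kbytes (assuming 1 Kbyte = 1000 bytes)
    RTT = 1.0                    # seconds, as on this slide
    for rate in (28e3, 100e3, 1e6, 10e6):
        print(f"{rate:>10.0f} bps: {objects_per_rtt(rate, RTT, OBJECT_BITS):.2f} objects per RTT")
```

When many objects fit into a single RTT, each extra round trip spent on connection setup or slow start costs far more than transmitting an object does, which is the regime where persistent connections help most.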

Chapter 3 Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                    • Chapter 3 Transport Layer last revised 160305
                                                                                                                    • Chapter 3 outline
                                                                                                                    • Transport services and protocols
                                                                                                                    • Transport vs network layer
                                                                                                                    • Transport-layer protocols
                                                                                                                    • Chapter 3 outline
                                                                                                                    • Multiplexing/demultiplexing
                                                                                                                    • Multiplexing/demultiplexing
                                                                                                                    • How demultiplexing works
                                                                                                                    • Connectionless demultiplexing
                                                                                                                    • Connectionless demux (cont)
                                                                                                                    • Connection-oriented demux
                                                                                                                    • Connection-oriented demux (cont)
                                                                                                                    • Connection-oriented demux Threaded Web Server
                                                                                                                    • Chapter 3 outline
                                                                                                                    • UDP User Datagram Protocol [RFC 768]
                                                                                                                    • UDP more
                                                                                                                    • UDP checksum
                                                                                                                    • Chapter 3 outline
                                                                                                                    • Principles of Reliable data transfer
                                                                                                                    • Reliable data transfer getting started
                                                                                                                    • Reliable data transfer getting started
                                                                                                                    • Incremental Improvements
                                                                                                                    • Rdt1.0 reliable transfer over a reliable channel
                                                                                                                    • Rdt2.0 channel with bit errors
                                                                                                                    • rdt2.0 FSM specification
                                                                                                                    • rdt2.0 operation with no errors
                                                                                                                    • rdt2.0 error scenario
                                                                                                                    • rdt2.0 has a fatal flaw
                                                                                                                    • rdt2.1 sender handles garbled ACK/NAKs
                                                                                                                    • rdt2.1 receiver handles garbled ACK/NAKs
                                                                                                                    • rdt2.1 discussion
                                                                                                                    • rdt2.2 a NAK-free protocol
                                                                                                                    • rdt2.2 sender, receiver fragments
                                                                                                                    • rdt3.0 channels with errors and loss
                                                                                                                    • rdt3.0 sender
                                                                                                                    • rdt3.0 in action
                                                                                                                    • rdt3.0 in action
                                                                                                                    • Performance of rdt3.0
                                                                                                                    • rdt3.0 stop-and-wait operation
                                                                                                                    • Pipelined protocols
                                                                                                                    • Pipelined protocols
                                                                                                                    • Pipelining increased utilization
                                                                                                                    • Go-Back-N
                                                                                                                    • GBN Sender
                                                                                                                    • GBN sender extended FSM
                                                                                                                    • GBN receiver extended FSM
                                                                                                                    • More on receiver
                                                                                                                    • GBN in action
                                                                                                                    • Selective Repeat
                                                                                                                    • Selective repeat sender receiver windows
                                                                                                                    • Selective repeat
                                                                                                                    • Selective repeat in action
                                                                                                                    • Selective repeat dilemma
                                                                                                                    • Chapter 3 outline
                                                                                                                    • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                    • More TCP Details
                                                                                                                    • Even More TCP Details
                                                                                                                    • TCP segment structure
                                                                                                                    • TCP seq. #'s and ACKs
                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                    • Example RTT estimation
                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                    • Chapter 3 outline
                                                                                                                    • TCP reliable data transfer
                                                                                                                    • TCP sender events
                                                                                                                    • TCP sender (simplified)
                                                                                                                    • TCP retransmission scenarios
                                                                                                                    • TCP retransmission scenarios (more)
                                                                                                                    • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                    • More on Sender Policies
                                                                                                                    • Fast Retransmit
                                                                                                                    • Fast retransmit algorithm
                                                                                                                    • TCP GBN or Selective Repeat
                                                                                                                    • Chapter 3 outline
                                                                                                                    • TCP Flow Control
                                                                                                                    • TCP Flow Control
                                                                                                                    • TCP segment structure
                                                                                                                    • TCP Flow control how it works
                                                                                                                    • Technical Issue
                                                                                                                    • Chapter 3 outline
                                                                                                                    • TCP Connection Management
                                                                                                                    • TCP Connection Management (cont)
                                                                                                                    • TCP Connection Management (cont)
                                                                                                                    • TCP Connection Management (cont)
                                                                                                                    • TCP Connection Management (cont)
                                                                                                                    • A few special cases
                                                                                                                    • Chapter 3 outline
                                                                                                                    • Principles of Congestion Control
                                                                                                                    • Causes/costs of congestion scenario 1
                                                                                                                    • Causes/costs of congestion scenario 2
                                                                                                                    • Causes/costs of congestion scenario 3
                                                                                                                    • Causes/costs of congestion scenario 3
                                                                                                                    • Approaches towards congestion control
                                                                                                                    • Case study ATM ABR congestion control
                                                                                                                    • Case study ATM ABR congestion control
                                                                                                                    • Chapter 3 outline
                                                                                                                    • TCP Congestion Control
                                                                                                                    • TCP AIMD
                                                                                                                    • TCP Slow Start
                                                                                                                    • TCP Slow Start (more)
                                                                                                                    • Summary TCP Congestion Control
                                                                                                                    • The Big Picture
                                                                                                                    • TCP sender congestion control
                                                                                                                    • TCP throughput
                                                                                                                    • TCP Futures
                                                                                                                    • TCP Fairness
                                                                                                                    • Why is TCP fair
                                                                                                                    • Fairness (more)
                                                                                                                    • TCP Latency Modeling
                                                                                                                    • Fixed Congestion Window (W)
                                                                                                                    • Fixed congestion window (1)
                                                                                                                    • Fixed congestion window (2)
                                                                                                                    • TCP Latency Modeling Slow Start (1)
                                                                                                                    • TCP Latency Modeling Slow Start (2)
                                                                                                                    • TCP Latency Modeling (3)
                                                                                                                    • TCP Latency Modeling (4)
                                                                                                                    • HTTP Modeling
                                                                                                                    • Chapter 3 Summary

More TCP Details

Maximum Segment Size (MSS)
  The maximum amount of application-layer data in a segment
  Depends on the implementation (can often be set)
  Application data + TCP header = TCP segment

Three-way handshake
  Client sends a special TCP segment to the server requesting a connection
    No payload (application data) in this segment
  Server responds with a second special TCP segment (again no payload)
  Client responds with a third special segment
    This one can contain payload
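As a rough illustration of the MSS definition above, the following Python sketch splits an application byte stream into payloads of at most MSS bytes. The function name `segment_stream`, the 1000-byte MSS, and the initial sequence number 0 are assumptions chosen for the example, not values from the slides. The TCP header that would be added to each payload is left out.

```python
def segment_stream(data: bytes, mss: int, initial_seq: int = 0):
    """Split application data into (seq, payload) pairs, each payload <= mss bytes.

    seq is the byte-stream number of the first byte of the payload.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [(initial_seq + offset, data[offset:offset + mss])
            for offset in range(0, len(data), mss)]


# Hypothetical example: 2500 bytes of application data, MSS of 1000 bytes.
segments = segment_stream(b"a" * 2500, mss=1000)
summary = [(seq, len(payload)) for seq, payload in segments]
print(summary)  # [(0, 1000), (1000, 1000), (2000, 500)]
```

Only the last payload is shorter than the MSS; every sequence number is the byte offset of the payload's first byte.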

Even More TCP Details

A TCP connection between client and server creates, in both the client and the server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP exists only in the two end machines
  No buffers or variables are allocated to the connection in any of the network elements between the client and the server

TCP segment structure

Header layout (each row is 32 bits wide):
  source port #  |  dest port #
  sequence number
  acknowledgement number
  head len  |  not used  |  U A P R S F  |  receive window
  checksum  |  urgent data pointer
  options (variable length)
  application data (variable length)

Notes on the fields:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  receive window: # bytes the receiver is willing to accept
  checksum: Internet checksum (as in UDP)
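To make the segment layout above concrete, here is a minimal Python sketch that decodes the fixed 20-byte part of a TCP header with the standard `struct` module. The function name `parse_tcp_header`, the dictionary keys, and the sample values (ports 1234 and 23, window 4096) are assumptions for illustration. The checksum is set to 0 in the sample and is not verified.

```python
import struct

# Flag bits, least significant first (FIN is bit 0, URG is bit 5).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header, then split off options and data."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4  # head len field counts 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_byte & (1 << bit)]
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# Hypothetical segment: no options (head len = 5 words), PSH and ACK set,
# one byte of application data. The checksum is a placeholder 0.
sample = struct.pack("!HHIIBBHHH", 1234, 23, 42, 79, 5 << 4, 0x18, 4096, 0, 0) + b"C"
parsed = parse_tcp_header(sample)
print(parsed["seq"], parsed["ack"], parsed["flags"], parsed["data"])
```

The `!` in the format string selects network (big-endian) byte order, which is how the header fields are laid out on the wire.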

TCP seq. #'s and ACKs

Seq. #'s:
  byte-stream "number" of the first byte in the segment's data
ACKs:
  seq # of the next byte expected from the other side
  cumulative ACK
Q: how does the receiver handle out-of-order segments?
  A: the TCP spec doesn't say; it is up to the implementer

Simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
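The numbers in the telnet scenario follow directly from the two rules above: the sequence number is the number of the first byte sent, and the ACK is the number of the next byte expected. A small Python sketch, assuming a hypothetical helper `telnet_echo_trace` and the starting numbers 42 and 79 from the figure, reproduces the three segments:

```python
def telnet_echo_trace(a_seq: int = 42, b_seq: int = 79, char: bytes = b"C"):
    """Return (direction, seq, ack, data) for a one-character telnet echo."""
    trace = []
    # Host A sends the typed character; it expects B's byte number b_seq next.
    trace.append(("A->B", a_seq, b_seq, char))
    # Host B ACKs the character (next expected byte from A) and echoes it back.
    a_next = a_seq + len(char)
    trace.append(("B->A", b_seq, a_next, char))
    # Host A ACKs the echoed character; this segment carries no data.
    b_next = b_seq + len(char)
    trace.append(("A->B", a_next, b_next, b""))
    return trace


for direction, seq, ack, data in telnet_echo_trace():
    print(direction, f"Seq={seq}", f"ACK={ack}", data)
```

Each ACK value is the peer's sequence number plus the number of data bytes just received, which is why a one-byte payload advances it by exactly one.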

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother"
    average several recent measurements, not just the current SampleRTT

Q: how to set the TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
  influence of a past sample decreases exponentially fast
  typical value: α = 0.125
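The EstimatedRTT update above is a one-line computation. The Python sketch below applies it to a short series of samples; the function name `update_estimated_rtt` and the sample values (in milliseconds) are made up for illustration.

```python
def update_estimated_rtt(estimated_rtt: float, sample_rtt: float,
                         alpha: float = 0.125) -> float:
    """One step of the exponential weighted moving average."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


# Hypothetical samples: one spike to 180 ms, then back to 100 ms.
estimate = 100.0
history = []
for sample in [180.0, 100.0, 100.0]:
    estimate = update_estimated_rtt(estimate, sample)
    history.append(estimate)
print(history)
```

The spike moves the estimate only from 100 to 110 ms, and its influence then shrinks by a factor of (1 - α) with every later sample, which is the "smoother" behaviour the slide asks for.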

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. Horizontal axis: time (seconds). Vertical axis: RTT (milliseconds), from 100 to 350. Two curves: SampleRTT and EstimatedRTT.]

TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first, estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set the timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
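A worked example may help tie the three formulas together. The Python sketch below is one possible arrangement, not a reference implementation. The class name `RttEstimator` is hypothetical, and two choices are assumptions the slides do not settle: the estimator starts from the first sample with DevRTT = 0, and DevRTT is updated before EstimatedRTT so that it uses the previous estimate.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        # Assumption: initialise from the first sample, with no deviation yet.
        self.estimated_rtt = first_sample
        self.dev_rtt = 0.0
        self.alpha = alpha
        self.beta = beta

    def update(self, sample_rtt: float) -> float:
        # Assumption: DevRTT uses the EstimatedRTT from before this sample.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


# Hypothetical values in milliseconds.
estimator = RttEstimator(first_sample=100.0)
timeout = estimator.update(140.0)
print(estimator.estimated_rtt, estimator.dev_rtt, timeout)
```

With these numbers DevRTT becomes 0.25 * 40 = 10, EstimatedRTT becomes 105, and the timeout is 105 + 4 * 10 = 145 ms: a single 40 ms deviation widens the safety margin by 40 ms.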

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP reliable data transfer

TCP creates an rdt service on top of IP's unreliable service
Pipelined segments
Cumulative ACKs
TCP uses a single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate ACKs

Initially consider a simplified TCP sender:
  ignore duplicate ACKs
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is the byte-stream number of the first data byte in the segment
  start timer if not already running (think of the timer as being for the oldest unACKed segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit the segment that caused the timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unACKed segments
    update what is known to be ACKed
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }

} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
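The simplified sender can be turned into a small runnable Python sketch. This is an illustration under stated assumptions, not the slides' implementation. The class name `SimplifiedTcpSender` and the `send_to_ip` callback are hypothetical, the retransmission timer is modelled as a boolean flag rather than a real clock, and events are delivered by calling methods directly instead of running the `loop (forever)`.

```python
class SimplifiedTcpSender:
    """Event handlers of the simplified sender (no dup ACKs, flow or congestion control)."""

    def __init__(self, initial_seq_num: int, send_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # -> data of not-yet-ACKed segments
        self.timer_running = False   # stands in for the single retransmission timer
        self.send_to_ip = send_to_ip

    def on_app_data(self, data: bytes) -> None:
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True          # start timer
        self.send_to_ip(seq, data)             # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self) -> None:
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)                # smallest unACKed sequence number
        self.send_to_ip(seq, self.unacked[seq])
        self.timer_running = True              # restart timer

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment whose bytes all lie below y.
            self.unacked = {s: d for s, d in self.unacked.items() if s + len(d) > y}
            self.timer_running = bool(self.unacked)


# Hypothetical run: two segments (8 and 20 bytes) starting at seq # 92.
sent = []
sender = SimplifiedTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
sender.on_app_data(b"x" * 8)
sender.on_app_data(b"y" * 20)
sender.on_timeout()      # retransmits only the oldest segment, seq # 92
sender.on_ack(120)       # one cumulative ACK covers both segments
print(sent, sender.send_base, sender.timer_running)
```

After the cumulative ACK, `send_base` is 120, nothing is outstanding, and the timer flag is off, so no further retransmission would occur.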

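TCP retransmission scenarios

Lost ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Host A: Seq=92 timeout
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host B -> Host A: ACK=100
  Host A: SendBase = 100

Premature timeout (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100
  Host B -> Host A: ACK=120
  Host A: Seq=92 timeout (fires before ACK=100 arrives)
  Host A -> Host B: Seq=92, 8 bytes data (retransmission)
  Host A: SendBase = 100, then SendBase = 120, as the two ACKs arrive
  Host B -> Host A: ACK=120 (sent in response to the retransmission)
  Host A: SendBase = 120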

TCP retransmission scenarios (more)

Cumulative ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100 (lost, X)
  Host B -> Host A: ACK=120 (arrives before the timeout)
  Host A: SendBase = 120; nothing is retransmitted

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
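The four event/action pairs above amount to a small decision procedure. The Python sketch below encodes them as a lookup on the arriving segment's sequence number. The function name `ack_action`, its parameters, and the returned strings are assumptions for illustration. Cases the table does not mention, such as a segment below the expected sequence number, return `None` rather than guessing.

```python
from typing import Optional


def ack_action(seg_seq: int, expected_seq: int,
               ack_pending: bool = False, gap_exists: bool = False) -> Optional[str]:
    """Map a segment arrival to the receiver action listed in the table."""
    if seg_seq > expected_seq:
        # Out-of-order segment: a gap is detected (or still open).
        return "immediate duplicate ACK"
    if seg_seq == expected_seq:
        if gap_exists:
            # Segment starts at the lower end of the gap.
            return "immediate ACK"
        if ack_pending:
            # A previous in-order segment is still waiting for its delayed ACK.
            return "immediate cumulative ACK"
        return "delayed ACK"
    return None  # not covered by the table


print(ack_action(100, 100))
print(ack_action(100, 100, ack_pending=True))
print(ack_action(120, 100))
print(ack_action(100, 100, gap_exists=True))
```

The "delayed ACK" result stands for "wait up to 500 ms for the next segment"; the waiting itself is not modelled here.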

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after the retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
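The doubling rule can be sketched in a few lines of Python. The class name `TimeoutInterval` and the numeric values (in milliseconds) are assumptions for illustration; the reset uses the earlier formula TimeoutInterval = EstimatedRTT + 4 * DevRTT.

```python
class TimeoutInterval:
    """Timeout value that doubles on each consecutive timeout."""

    def __init__(self, estimated_rtt: float, dev_rtt: float):
        self.value = 0.0
        self.reset(estimated_rtt, dev_rtt)

    def reset(self, estimated_rtt: float, dev_rtt: float) -> float:
        # Called when the timer restarts because of new data or a received ACK.
        self.value = estimated_rtt + 4 * dev_rtt
        return self.value

    def on_timeout(self) -> float:
        # Called after the retransmission that follows a timeout.
        self.value *= 2
        return self.value


# Hypothetical values: EstimatedRTT = 100 ms, DevRTT = 10 ms.
interval = TimeoutInterval(100.0, 10.0)
after_timeouts = [interval.on_timeout() for _ in range(3)]
after_reset = interval.reset(100.0, 10.0)
print(after_timeouts, after_reset)
```

Three consecutive timeouts stretch the interval from 140 ms to 1120 ms, so a sender facing repeated loss backs off quickly; one ACK brings it back to 140 ms.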

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs
If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend the segment before the timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    // a duplicate ACK for already ACKed segment
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3)
            resend segment with sequence number y    // fast retransmit
    }
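The ACK-handling event above can be turned into a small runnable Python sketch. The class name, the `next_seq_num` field and the `resent` list are hypothetical helpers added for illustration; the timer is modeled as a simple flag rather than a real clock.

```python
class FastRetransmitSender:
    """Toy sender that reacts to ACKs as in the pseudocode above."""

    def __init__(self, send_base, next_seq_num):
        self.send_base = send_base          # oldest unACKed byte
        self.next_seq_num = next_seq_num    # next byte that would be sent
        self.dup_acks = {}                  # ACK value -> duplicate count
        self.timer_running = send_base < next_seq_num
        self.resent = []                    # sequence numbers resent early

    def on_ack(self, y):
        if y > self.send_base:
            # New cumulative ACK: slide the window forward.
            self.send_base = y
            # Keep the timer only while unACKed segments remain.
            self.timer_running = self.send_base < self.next_seq_num
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_acks[y] = self.dup_acks.get(y, 0) + 1
            if self.dup_acks[y] == 3:
                # Fast retransmit: resend before the timer expires.
                self.resent.append(y)


sender = FastRetransmitSender(send_base=100, next_seq_num=500)
for ack in [200, 200, 200, 200]:
    sender.on_ack(ack)
print(sender.resent)  # [200]
```

The first ACK for 200 advances SendBase; the next three are duplicates, and the third duplicate triggers the early resend of the segment starting at byte 200.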

TCP: GBN or Selective Repeat?
Basic TCP looks a lot like GBN
Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  This looks a lot like Selective Repeat
TCP is a hybrid
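A rough Python sketch of the hybrid behaviour: the receiver buffers out-of-order segments (as in Selective Repeat) but still answers with a single cumulative ACK (as in GBN). The class and its fields are hypothetical, and the sketch assumes segments never partially overlap.

```python
class BufferingReceiver:
    """Toy receiver: buffers out-of-order data, replies with cumulative ACKs."""

    def __init__(self, expected_seq=0):
        self.expected_seq = expected_seq  # next in-order byte expected
        self.out_of_order = {}            # seq -> payload waiting for a gap to fill
        self.delivered = b""              # bytes handed to the application

    def on_segment(self, seq, payload):
        if seq >= self.expected_seq:
            self.out_of_order[seq] = payload
        # Deliver everything that is now contiguous.
        while self.expected_seq in self.out_of_order:
            data = self.out_of_order.pop(self.expected_seq)
            self.delivered += data
            self.expected_seq += len(data)
        # Cumulative ACK: sequence number of the next expected byte.
        return self.expected_seq


rcv = BufferingReceiver()
acks = [
    rcv.on_segment(0, b"aaaa"),  # in order
    rcv.on_segment(8, b"cccc"),  # arrives early, leaves a gap at byte 4
    rcv.on_segment(4, b"bbbb"),  # fills the gap
]
print(acks)  # [4, 4, 12]
```

The second ACK repeats 4 (a duplicate ACK), and the third jumps straight to 12 because the buffered segment is acknowledged as soon as the gap is filled.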

TCP Flow Control
Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose is to handle congestion in the network
  (But both congestion control and flow control work by slowing down data transmission)

TCP Flow Control
flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure
Header rows (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)
Notes on fields:
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)

TCP Flow control: how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
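A small Python sketch of the two sides of this rule. The function names are hypothetical, and the sender-side variables `last_byte_sent` and `last_byte_acked` are assumed here (they follow the usual textbook naming but do not appear on the slide).

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window


# Receiver: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)  # 2096

# Sender: 1000 bytes already in flight.
print(sender_may_send(5000, 4000, window, 1000))  # True  (2000 <= 2096)
print(sender_may_send(5000, 4000, window, 1100))  # False (2100 >  2096)
```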

Technical Issue
Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK!
Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender
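The probing idea can be illustrated with a toy Python simulation. Everything here is a simplification made for the example: the function name is hypothetical, one probe is sent per step, the receiving application drains a given number of bytes before each probe arrives, and the probe byte itself is ignored when the window is reported.

```python
def probe_until_open(rcv_buffer, buffered, app_reads):
    """Send one-byte probes until an ACK advertises RcvWindow > 0.

    app_reads[i] is the number of bytes the application drains from the
    receive buffer before probe i arrives. Returns (probes_sent, window).
    """
    probes = 0
    for drained in app_reads:
        buffered -= min(drained, buffered)
        probes += 1                      # sender transmits a one-byte probe
        window = rcv_buffer - buffered   # value carried back in the ACK
        if window > 0:
            return probes, window        # sender may resume normal sending
    return probes, 0                     # window still closed


# Buffer starts full; the application only reads before the third probe.
result = probe_until_open(rcv_buffer=1024, buffered=1024, app_reads=[0, 0, 512])
print(result)  # (3, 512)
```

Without the probes, the sender would never learn that 512 bytes had been freed, which is exactly the deadlock described above.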

Note on UDP
UDP has no flow control
UDP appends packets to receiving socket's buffer
If buffer is full, then packets are lost
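As a minimal illustration of this drop behaviour, the Python sketch below models the socket buffer as a bounded queue. The function name is hypothetical, and capacity is counted in datagrams rather than bytes to keep the example short.

```python
from collections import deque


def deliver(datagrams, capacity):
    """Append datagrams to a bounded buffer; anything that does not fit is lost."""
    buffer = deque()
    lost = []
    for datagram in datagrams:
        if len(buffer) < capacity:
            buffer.append(datagram)
        else:
            lost.append(datagram)  # no flow control: the sender is never told
    return list(buffer), lost


queued, lost = deliver(["d1", "d2", "d3", "d4"], capacity=2)
print(queued, lost)  # ['d1', 'd2'] ['d3', 'd4']
```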

TCP Connection Management
Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();
Three way handshake:
Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont)
Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals synchronization
Segments exchanged:
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
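The sequence and acknowledgement numbers of the three handshake segments can be traced with a short Python sketch. The function and the dictionary layout are hypothetical; only the SYN, seq and ack values follow the steps described in the slides.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (direction, fields) pairs."""
    # Step 1: client picks its initial sequence number.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: server ACKs the SYN and announces its own initial sequence number.
    synack = {"SYN": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # Step 3: client ACKs the server's SYN; SYN is now 0.
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [
        ("client->server", syn),
        ("server->client", synack),
        ("client->server", ack),
    ]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
for direction, fields in segments:
    print(direction, fields)
```

With client_isn = 1000 and server_isn = 5000, the server acknowledges 1001 and the client acknowledges 5001.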

TCP Connection Management (cont)
Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
Segments exchanged:
  client -> server: FIN (client closes)
  server -> client: ACK
  server -> client: FIN (server closes)
  client -> server: ACK
  client: timed wait, then closed

TCP Connection Management (cont)
Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed-wait
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
Figure: FIN/ACK exchange between client and server; both sides are "closing", the client then enters "timed wait", and finally both are "closed".
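The client side of the four closing steps can be written as a small state table in Python. This is a sketch under assumptions: the state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) are the conventional TCP names rather than the labels used in the slide figure, and the event strings and `run` helper are hypothetical.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "send FIN"),
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "send ACK"),
    # If our ACK was lost the server resends its FIN; ACK it again.
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "send ACK"),
    ("TIME_WAIT", "timer expires"): ("CLOSED", None),
}


def run(events, state="ESTABLISHED"):
    """Feed events through the table; return the final state and segments sent."""
    sent = []
    for event in events:
        state, action = CLIENT_TRANSITIONS[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent


final_state, sent = run(
    ["close", "recv ACK", "recv FIN", "recv FIN", "timer expires"]
)
print(final_state, sent)  # CLOSED ['send FIN', 'send ACK', 'send ACK']
```

The repeated "recv FIN" event shows why the timed wait exists: the client can still re-ACK a retransmitted FIN before it finally closes.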

TCP Connection Management (cont)
Example: TCP server lifecycle (state diagram)
Example: TCP client lifecycle (state diagram)

A few special cases
We have not discussed what happens if both client and server decide to close down the connection at the same time
It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment

Principles of Congestion Control
Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queueing in router buffers)
a top-10 problem!

Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2
large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2
one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$
Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.
"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt
Figure: three throughput plots, panels (a), (b), (c).

Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit
Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3
Another "cost" of congestion:
when packet dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control
Two broad approaches towards congestion control:
End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP
Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control
ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate
RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
  congested switch may lower ER value in cell
  sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by a congested switch to signal congestion
  if the data cell preceding an RM cell has EFCI = 1, the destination sets CI bit = 1 before returning the RM cell to the source
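The handling of one RM cell along a path can be sketched as follows. This is only an illustration of the rules on the two ABR slides, not an ATM implementation: the names RMCell, forward_through_switch and destination_return, and the rate values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate field (the unit is an assumption of this sketch)
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indicator bit (severe congestion)


def forward_through_switch(cell, supportable_rate,
                           mildly_congested=False, severely_congested=False):
    """A switch may lower ER and may set NI or CI; it never raises ER."""
    cell.er = min(cell.er, supportable_rate)
    cell.ni = cell.ni or mildly_congested
    cell.ci = cell.ci or severely_congested
    return cell


def destination_return(cell, last_data_cell_efci):
    """The destination sets CI if the preceding data cell carried EFCI = 1."""
    if last_data_cell_efci:
        cell.ci = True
    return cell


# Hypothetical path with three switches, none of which sets NI or CI itself.
cell = RMCell(er=1000.0)
for rate in (800.0, 350.0, 600.0):
    forward_through_switch(cell, rate)
destination_return(cell, last_data_cell_efci=True)
print(cell)
```

The ER value that comes back is the minimum over the switches on the path, which is why the sender's rate ends up as the minimum supportable rate.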




TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin is dynamically modified to reflect perceived congestion
with a window of w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT  bytes/sec
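As a quick check of the throughput relation, here is a minimal sketch. The function name and the example values are assumptions chosen for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_sec):
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    return w_segments * mss_bytes / rtt_sec


# Illustrative values only: 10 segments of 500 bytes per 200 ms RTT.
rate = window_throughput(w_segments=10, mss_bytes=500, rtt_sec=0.2)
print(f"{rate:.0f} bytes/sec = {rate * 8 / 1000:.0f} kbps")
```

Doubling w doubles the rate for a fixed RTT, which is why adjusting CongWin is enough to control the sending rate.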


To simplify the presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: the sender limits transmission using

  LastByteSent - LastByteAcked <= CongWin

How does the sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after a loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative behavior after timeout events
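The window limit and the loss-event definition can be written as two small predicates. This is a sketch under assumptions: the function names are hypothetical, and checking the limit before sending one more segment is one possible reading of the inequality above.

```python
def can_send(last_byte_sent, last_byte_acked, congwin_bytes, segment_bytes):
    """True if one more segment keeps unacknowledged data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + segment_bytes <= congwin_bytes


def is_loss_event(timed_out, dup_ack_count):
    """A loss event is a timeout or 3 duplicate ACKs."""
    return timed_out or dup_ack_count >= 3


print(can_send(3000, 1000, 4000, 1000))   # 2000 bytes in flight, room for 1000 more
print(can_send(5000, 1000, 4000, 1000))   # window already full
print(is_loss_event(False, 3))
```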


TCP AIMD

multiplicative decrease: cut CongWin in half after a loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window of a long-lived TCP connection versus time, with the vertical axis marked at 8, 16, and 24 Kbytes]
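The sawtooth can be reproduced with a few lines. This is a minimal sketch with a deliberately simple loss model that is an assumption of the example: a loss event occurs whenever CongWin reaches a fixed level loss_at_mss. Window sizes are counted in MSS.

```python
def aimd_trace(start_mss, loss_at_mss, rtts):
    """CongWin (in MSS) per RTT: +1 MSS per RTT, halved at loss_at_mss."""
    congwin = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(congwin)
        if congwin >= loss_at_mss:
            congwin = max(1, congwin // 2)   # multiplicative decrease
        else:
            congwin += 1                     # additive increase
    return trace


trace = aimd_trace(start_mss=8, loss_at_mss=16, rtts=20)
print(trace)
```

The window climbs linearly from 8 to 16 MSS, drops back to 8, and repeats.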


TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = MSS/RTT = 20 kbps
Available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to a respectable rate
When connection begins, increase rate exponentially fast until first loss event


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received
Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment in the first RTT, two segments in the second, and four segments in the third]
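Why a per-ACK increment doubles the window each RTT can be seen in a short simulation. This is a sketch under assumptions: every segment is acknowledged individually, all ACKs for a window arrive within one RTT, and there is no loss. The function name is hypothetical.

```python
def slow_start_windows(rounds, mss=1):
    """CongWin at the start of each RTT when every ACK adds 1 MSS."""
    congwin = mss
    per_round = []
    for _ in range(rounds):
        per_round.append(congwin)
        segments_sent = congwin // mss
        for _ in range(segments_sent):   # one ACK comes back per segment sent
            congwin += mss
    return per_round


print(slow_start_windows(4))            # window in segments
print(slow_start_windows(4, mss=500))   # window in bytes, MSS = 500 bytes
```

A window of n segments produces n ACKs, each adding 1 MSS, so the next window is 2n segments.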


So far:
  Slow Start ramps up exponentially
  followed by AIMD sawtooth pattern

Reality (TCP Reno):
  introduce a new variable, threshold
  threshold is initially very large
  Slow Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
  two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    timeout: set threshold = CongWin/2 and CongWin = 1 MSS, and switch to Slow Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same way and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
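The difference between the two variants comes down to one branch. The following is a sketch of the rules as stated on these slides only; the function name and the string labels are hypothetical, and real TCP stacks implement more detail than this.

```python
def react_to_loss(variant, event, congwin, mss=1):
    """Return (new_congwin, new_threshold, next_phase) after a loss event.

    variant: "reno" or "tahoe"; event: "3dupack" or "timeout".
    congwin and mss are in the same unit (segments by default).
    """
    if variant not in ("reno", "tahoe") or event not in ("3dupack", "timeout"):
        raise ValueError("unknown variant or event")
    threshold = congwin / 2
    if variant == "reno" and event == "3dupack":
        # Fast recovery: halve the window and stay in congestion avoidance.
        return max(threshold, mss), threshold, "congestion avoidance"
    # Timeout (both variants), or any loss event in Tahoe: back to slow start.
    return mss, threshold, "slow start"


print(react_to_loss("reno", "3dupack", 16))
print(react_to_loss("reno", "timeout", 16))
print(react_to_loss("tahoe", "3dupack", 16))
```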


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                      The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
  State: Slow Start (SS)
  TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
  Commentary: results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
  State: Congestion Avoidance (CA)
  TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
  Commentary: additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
  State: SS or CA
  TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
  Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
  State: SS or CA
  TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
  Commentary: enter slow start

Event: duplicate ACK
  State: SS or CA
  TCP sender action: increment duplicate ACK count for segment being acked
  Commentary: CongWin and Threshold not changed
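The event table maps naturally onto a small state machine. The class below is a sketch of that table and nothing more: the class name, the method names, the default values, and the choice to reset the duplicate-ACK counter on every new ACK are assumptions of this example.

```python
class RenoSender:
    """Congestion-control state of a sender, following the event table."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss               # bytes
        self.congwin = mss           # start with 1 MSS
        self.threshold = threshold   # "initially very large"
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += self.mss * self.mss / self.congwin

    def on_dup_ack(self):
        """Duplicate ACK; the third one in a row is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, self.mss)
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.congwin)
```

With a threshold of 4000 bytes, the fourth ACK pushes CongWin to 5000 bytes, above the threshold, so the sender moves to congestion avoidance.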


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2 RTT)
  Average throughput: 0.75 W/RTT
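The 0.75 factor is the midpoint between W/2 and W. A numeric check, assuming the window grows by one segment per RTT from W/2 up to W and that W is even (the function name is hypothetical):

```python
def average_throughput(w_loss, rtt):
    """Mean of window/RTT over one sawtooth cycle from W/2 up to W (segments/sec)."""
    windows = range(w_loss // 2, w_loss + 1)
    return sum(w / rtt for w in windows) / len(windows)


avg = average_throughput(w_loss=100, rtt=1.0)
print(avg, avg / 100)   # average rate, and the same value as a fraction of W/RTT
```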


TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

For this example, L = 2 * 10^-10. Wow!
New versions of TCP for high-speed networks needed
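Both numbers in this example can be recomputed directly. This is a sketch: the function names are hypothetical, and MSS is converted from bytes to bits so that it matches a throughput given in bits per second.

```python
def required_window(throughput_bps, rtt_sec, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_sec / (8 * mss_bytes)


def required_loss_rate(throughput_bps, rtt_sec, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * 8 * mss_bytes / (rtt_sec * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(f"W = {w:.0f} segments, L = {loss:.1e}")
```

The loss rate comes out at roughly 2 in 10 billion segments, which is the motivation for the high-speed TCP variants mentioned on the slide.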


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  additive increase gives slope of 1 as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes marked at R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
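The convergence toward the equal-share line can be checked with a toy model. The assumptions are strong and belong to this sketch, not to the slides: both connections have the same RTT, both see a loss in the same round whenever their combined rate exceeds R, and rates are in arbitrary units.

```python
def share_bottleneck(x1, x2, capacity, rounds):
    """Two AIMD flows on one link; returns their rates after the given rounds."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # loss: both halve, so the gap halves too
        else:
            x1, x2 = x1 + 1, x2 + 1    # additive increase keeps the gap constant
    return x1, x2


x1, x2 = share_bottleneck(1.0, 80.0, capacity=100.0, rounds=500)
print(round(x1, 3), round(x2, 3))
```

Additive increase leaves the difference between the two rates unchanged, while each multiplicative decrease halves it, so the rates approach each other.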


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents an app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
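The arithmetic of the parallel-connection example, assuming the link is split equally per connection (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    return Fraction(new_connections, existing_connections + new_connections)


print(app_share(9, 1))    # one new connection among ten
print(app_share(9, 11))   # eleven new connections among twenty
```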


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Notation, assumptions:
  Assume one link between client and server, of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data is sent

   Latency = 2 RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data is sent

   Latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
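The two cases can be combined into one function. This is a sketch under assumptions: K is taken as the number of windows of W segments that cover the object, rounded up; the boundary case WS/R = RTT + S/R, which the slide leaves open, is treated as case 1; and the numbers in the usage lines are chosen only for easy arithmetic.

```python
import math


def fixed_window_latency(obj_bits, seg_bits, rate_bps, rtt_sec, window_segs):
    """Latency for an object of O bits with a fixed window of W segments."""
    transmit = obj_bits / rate_bps          # O/R
    seg_time = seg_bits / rate_bps          # S/R
    window_time = window_segs * seg_time    # WS/R
    if window_time >= rtt_sec + seg_time:
        # Case 1: ACKs return before the window is exhausted, no stalls.
        return 2 * rtt_sec + transmit
    # Case 2: the server stalls after each window except the last.
    k = math.ceil(obj_bits / (window_segs * seg_bits))
    stall = seg_time + rtt_sec - window_time
    return 2 * rtt_sec + transmit + (k - 1) * stall


# O = 8000 bits, S = 1000 bits, R = 1000 bps, RTT = 4 s.
print(fixed_window_latency(8000, 1000, 1000, 4.0, window_segs=8))   # case 1
print(fixed_window_latency(8000, 1000, 1000, 4.0, window_segs=2))   # case 2
```

With W = 2 the object needs K = 4 windows and the server stalls 3 seconds after each of the first three, which adds 9 seconds to the case 1 latency.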


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data is sent.

  latency = 2 RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

  latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
  2 RTT for connection establishment and request
  O/R to transmit object
  time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
  O/S = 15 segments
  K = 4 windows
  Q = 2
  P = min{K-1, Q} = 2
  Server idles P = 2 times

[Figure: timeline with time at client and time at server; the client initiates the TCP connection and requests the object (one RTT each); the server sends a first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and the object is delivered]
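The quantities K, Q and P can be found by counting rather than by closed forms. The sketch below assumes windows of 1, 2, 4, ... segments and uses the rule that the server idles after the kth window when S/R + RTT exceeds 2^(k-1) S/R. The function name and the link parameters in the usage line are hypothetical, chosen so that O/S = 15 and Q = 2 as in the example.

```python
def slow_start_latency(obj_bits, seg_bits, rate_bps, rtt_sec):
    """Return (K, Q, P, latency) for slow start with no threshold and no loss."""
    seg_time = seg_bits / rate_bps            # S/R

    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    k, covered_segs = 0, 0
    while covered_segs * seg_bits < obj_bits:
        covered_segs += 2 ** k
        k += 1

    # Q: number of windows after which the server would idle if the object
    # were infinite; window q+1 takes 2^q * S/R to transmit.
    q = 0
    while seg_time + rtt_sec - (2 ** q) * seg_time > 0:
        q += 1

    p = min(q, k - 1)
    latency = (2 * rtt_sec + obj_bits / rate_bps
               + p * (rtt_sec + seg_time) - (2 ** p - 1) * seg_time)
    return k, q, p, latency


# O = 15000 bits, S = 1000 bits, R = 1000 bps (S/R = 1 s), RTT = 2 s.
print(slow_start_latency(15000, 1000, 1000, 2.0))
```

For these values the server idles 2 s after the first window and 1 s after the second, so the latency is 2 RTT + O/R + 3 s.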


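TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$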

                                                                                                                      RTT

                                                                                                                      initiate TCPconnection

                                                                                                                      requestobject

                                                                                                                      first window= SR

                                                                                                                      second window= 2SR

                                                                                                                      third window= 4SR

                                                                                                                      fourth window= 8SR

                                                                                                                      completetransmissionobject

                                                                                                                      delivered

                                                                                                                      time atclient

                                                                                                                      time atserver
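The closed form of the slow-start delay can be checked numerically against the sum of per-window idle times. The following is a minimal sketch, not part of the original slides: the function names and the numeric values are hypothetical, and P (the number of times the server idles) is taken as an input rather than computed.

```python
def idle_time(k, S, R, RTT):
    """Idle time after the k-th window: [S/R + RTT - 2^(k-1) * S/R]^+."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def slow_start_delay(O, S, R, RTT, P):
    """Closed form: O/R + 2*RTT + P*(RTT + S/R) - (2^P - 1) * S/R."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical values: S and O in bits, R in bits/sec, RTT in seconds.
S, R, RTT, O, P = 1000, 100_000, 0.1, 15_000, 3

# Delay as a sum of idle times (second line of the derivation).
summed = O / R + 2 * RTT + sum(idle_time(k, S, R, RTT) for k in range(1, P + 1))

# Delay from the closed form (last line of the derivation).
closed = slow_start_delay(O, S, R, RTT, P)

print(f"summed = {summed:.3f} s, closed form = {closed:.3f} s")
```

The two values agree only when every one of the first P idle times is positive, which is what the closed form assumes.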


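TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2(O/S + 1) \right\rceil
\end{aligned}
\]

Calculation of Q, the number of idles for an infinite-size object, is similar.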
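As a rough illustration of the derivation of K, the sketch below finds the smallest k with (2^k - 1) S >= O by direct search and compares it with the logarithmic form. The function name and the sample sizes are hypothetical; integer arithmetic is used in the search so that exact powers of two are not affected by floating-point rounding.

```python
import math


def num_windows(O, S):
    """Smallest k such that 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k = 0
    while (2 ** k - 1) * S < O:
        k += 1
    return k


# Hypothetical sizes in bits: an object of 15 segments of 1000 bits each.
O, S = 15_000, 1000
K = num_windows(O, S)

# The closed form from the last line of the derivation.
K_closed = math.ceil(math.log2(O / S + 1))

print(K, K_closed)
```

With these values the windows hold 1, 2, 4 and 8 segments, so four windows cover the 15-segment object.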


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
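The three response-time formulas can be compared side by side with a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, the "sum of idle times" term is deliberately left out (so the results are lower bounds), and 5 Kbytes is taken to be 40,000 bits.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times (seconds) for the three HTTP models, omitting idle times.

    O: object size in bits, R: link rate in bits/sec, RTT: round-trip time in
    seconds, M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# Assumed example values: RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, R = 1 Mbps.
times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for model, seconds in times.items():
    print(f"{model}: {seconds:.2f} s")
```

Leaving out the idle times keeps the comparison simple but understates the delay wherever slow start stalls the server.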


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3: Summary


Even More TCP Details

A TCP connection between client and server creates, in both client and server:
  (i) buffers
  (ii) variables, and
  (iii) a socket connection to the process

TCP only exists in the two end machines.
No buffers or variables are allocated to the connection in any of the network elements between the host and server.

TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]

  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
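
As a rough illustration of the layout above, the fixed 20-byte part of the header can be unpacked with Python's standard struct module. This is only a sketch: the function name parse_tcp_header, the dictionary keys, and the sample values are made up for this example, and the Options field is ignored.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / not used / flags" word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte TCP header (hypothetical helper; options ignored)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes")
    # "!" = network byte order; H = 16-bit field, I = 32-bit field.
    (src_port, dst_port, seq, ack,
     len_flags, rcv_window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (len_flags >> 12) * 4,  # head len counts 32-bit words
        "flags": {name for name, bit in FLAG_BITS.items() if len_flags & bit},
        "rcv_window": rcv_window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }

# Sample header (invented values): Seq=42, ACK=79, head len = 5 words, ACK and PSH set.
sample = struct.pack("!HHIIHHHH", 1234, 23, 42, 79, (5 << 12) | 0x18, 4096, 0, 0)
header = parse_tcp_header(sample)
print(header["seq"], header["ack"], header["header_len"], sorted(header["flags"]))
```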

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how does the receiver handle out-of-order segments?
A: TCP spec doesn't say; up to implementer

[Figure: simple telnet scenario between Host A and Host B; time runs downward]

  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
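
The numbers in the telnet scenario follow directly from the two rules above (Seq = number of first data byte, ACK = next byte expected). A minimal sketch that reproduces them; the function name telnet_echo and the tuple layout are invented for illustration.

```python
def telnet_echo(a_seq: int, b_seq: int, char: str = "C"):
    """Return the three segments of the echo exchange as (direction, seq, ack, data)."""
    n = len(char.encode("ascii"))  # bytes of data carried by the segment
    return [
        ("A->B", a_seq, b_seq, char),        # user types the character
        ("B->A", b_seq, a_seq + n, char),    # B ACKs it and echoes it back
        ("A->B", a_seq + n, b_seq + n, ""),  # A ACKs the echo, no data
    ]

for direction, seq, ack, data in telnet_echo(42, 79):
    print(f"{direction}  Seq={seq} ACK={ack} data={data!r}")
```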

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary; want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

                                                                                                                        TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125
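
A small sketch of the moving average, to make the "decreases exponentially fast" remark concrete. Seeding the estimate with the first sample is an assumption of this example (the slide does not say how EstimatedRTT is initialised), and the function names are invented.

```python
def ewma_rtt(samples, alpha=0.125):
    """Return the running EstimatedRTT after each SampleRTT."""
    estimated = samples[0]  # assumption: seed with the first measurement
    history = [estimated]
    for sample in samples[1:]:
        estimated = (1 - alpha) * estimated + alpha * sample
        history.append(estimated)
    return history

def sample_weight(age, alpha=0.125):
    """Weight carried by a sample that is `age` updates old: alpha * (1 - alpha)^age."""
    return alpha * (1 - alpha) ** age

print(ewma_rtt([100.0, 200.0, 100.0]))
print([sample_weight(k) for k in range(4)])
```

Each older sample is scaled by another factor of (1 - α), so a single outlier is damped quickly.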

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and Estimated RTT]

                                                                                                                        TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

  then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
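
The three formulas can be combined into one small estimator. This is a sketch under stated assumptions: the class name RttEstimator is invented, DevRTT starts at 0, EstimatedRTT starts at the first sample, and DevRTT is updated before EstimatedRTT. None of these choices is fixed by the slides.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval (illustrative only)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample  # assumption: seed with first SampleRTT
        self.dev_rtt = 0.0                 # assumption: no deviation observed yet

    def update(self, sample_rtt):
        # Assumption: deviation is measured against the old EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt

est = RttEstimator(100.0)
est.update(200.0)  # one slow sample widens the safety margin
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval())
```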

TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
  Pipelined segments
  Cumulative acks
  TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ack'ed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
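
The simplified sender can be turned into a small runnable model. The following is a sketch, not a real TCP: the class and attribute names are invented, the timer is reduced to a boolean, "pass segment to IP" is modelled by appending to a list, and ACK values are assumed to fall on segment boundaries.

```python
class SimplifiedSender:
    """Event-driven model of the simplified TCP sender (illustrative names)."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}          # seq # -> data not yet acknowledged
        self.timer_running = False
        self.wire = []             # (seq #, data) handed to IP, in order

    def on_app_data(self, data: bytes):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.wire.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)    # oldest not-yet-acknowledged segment
        self.wire.append((seq, self.unacked[seq]))
        self.timer_running = True  # restart timer

    def on_ack(self, y):
        if y > self.send_base:     # ACK covers new data
            self.send_base = y
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in done:
                del self.unacked[s]
            # Timer keeps running only while segments are outstanding.
            self.timer_running = bool(self.unacked)

# Two segments sent; the ACK for the first is lost, ACK=120 covers both.
sender = SimplifiedSender(92)
sender.on_app_data(b"a" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"b" * 20)   # Seq=100, 20 bytes
sender.on_ack(120)
print(sender.send_base, sender.timer_running, len(sender.wire))
```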

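TCP retransmission scenarios

[Figure: two timelines, Host A (sender) and Host B (receiver); time runs downward]

Lost ACK scenario:
  A -> B: Seq=92, 8 bytes data
  B -> A: ACK=100 (lost, X)
  timeout at A
  A -> B: Seq=92, 8 bytes data (retransmission)
  B -> A: ACK=100
  SendBase = 100

Premature timeout scenario:
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100
  B -> A: ACK=120
  Seq=92 timeout at A, before the ACKs arrive
  A -> B: Seq=92, 8 bytes data (retransmission)
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  B -> A: ACK=120 (in response to the retransmitted segment)
  SendBase = 120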

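TCP retransmission scenarios (more)

[Figure: timeline, Host A (sender) and Host B (receiver); time runs downward]

Cumulative ACK scenario:
  A -> B: Seq=92, 8 bytes data
  A -> B: Seq=100, 20 bytes data
  B -> A: ACK=100 (lost, X)
  B -> A: ACK=120 (arrives before the timeout)
  SendBase = 120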

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP Receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at Receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP Receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at Receiver: Arrival of segment that partially or completely fills gap.
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
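
The four receiver rules can be sketched as a small decision procedure. Everything named here (AckPolicy, the action strings) is invented for illustration. The sketch assumes segments never overlap, treats an already-received segment like an out-of-order one (it re-sends the current ACK), and does not model the 500 ms timer itself.

```python
class AckPolicy:
    """Decides which ACK action a receiver takes for each arriving segment."""

    def __init__(self, expected_seq: int):
        self.expected = expected_seq  # next in-order byte the receiver wants
        self.ack_pending = False      # True while a delayed ACK is being held
        self.buffered = {}            # out-of-order segments: seq # -> length

    def on_segment(self, seq: int, length: int):
        if seq != self.expected:
            if seq > self.expected:   # gap detected: keep the data for later
                self.buffered[seq] = length
            self.ack_pending = False  # the duplicate ACK also covers any held ACK
            return ("duplicate ACK", self.expected)
        gap_existed = bool(self.buffered)
        self.expected += length
        # Absorb buffered segments that are now contiguous.
        while self.expected in self.buffered:
            self.expected += self.buffered.pop(self.expected)
        if gap_existed:               # segment starts at lower end of gap
            self.ack_pending = False
            return ("immediate ACK", self.expected)
        if self.ack_pending:          # second in-order segment: ACK both at once
            self.ack_pending = False
            return ("immediate cumulative ACK", self.expected)
        self.ack_pending = True       # first in-order segment: hold the ACK
        return ("delayed ACK", self.expected)

rcvr = AckPolicy(92)
for seq, length in [(92, 8), (100, 20), (140, 20), (120, 20)]:
    print(seq, rcvr.on_segment(seq, length))
```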

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If timeout occurs, then after retransmission, Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When Timer is restarted because of (i) new data from above or (ii) ACK received, Timeout Interval is reset as described previously using EstimatedRTT and DevRTT
  Limited form of Congestion Control
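
A short sketch of how the interval grows under consecutive timeouts. The function name and the starting value of 1.0 are assumptions for the example; in practice the base value would come from EstimatedRTT and DevRTT.

```python
def backoff_intervals(base_timeout, consecutive_timeouts):
    """Timeout interval in force before any timeout and after each consecutive one."""
    return [base_timeout * 2 ** k for k in range(consecutive_timeouts + 1)]

print(backoff_intervals(1.0, 4))
```

Once new data arrives from above or an ACK is received, the sender drops back to the freshly computed base interval rather than continuing from the doubled value.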

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {                                         /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3)
      resend segment with sequence number y      /* fast retransmit */
  }
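
A runnable sketch of the ACK handler with duplicate-ACK counting. The class name, the single dup_acks counter (reset whenever new data is ACKed), and the list used to record resends are all choices made for this example; timers are left out, and ACK values are assumed to fall on segment boundaries.

```python
class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK (illustrative)."""

    def __init__(self, send_base, segments):
        self.send_base = send_base
        self.unacked = dict(segments)  # seq # -> data still outstanding
        self.dup_acks = 0              # duplicate ACKs seen for current send_base
        self.resent = []               # seq #s that were fast-retransmitted

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            self.dup_acks = 0
            # Drop everything the cumulative ACK covers.
            self.unacked = {s: d for s, d in self.unacked.items() if s >= y}
        else:
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.resent.append(y)  # resend before the timer expires

# Segment 92 is lost; the three later segments each trigger ACK=92.
sender = FastRetransmitSender(92, {92: b"a" * 8, 100: b"b" * 20,
                                   120: b"c" * 20, 140: b"d" * 20})
for ack in (92, 92, 92):
    sender.on_ack(ack)
print(sender.resent)
sender.on_ack(160)  # retransmission fills the gap, everything is ACKed
print(sender.send_base, sender.unacked)
```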

TCP: GBN or Selective Repeat?

                                                                                                                        Basic TCP looks a lot like GBN

                                                                                                                        Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                        This looks a lot like Selective Repeat

                                                                                                                        TCP is a hybrid

TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose was to handle congestion in network
(But both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

Flow control: sender won't overflow receiver's buffer by transmitting too much, too fast.

Receive side of TCP connection has a receive buffer.
App process may be slow at reading from buffer.
Speed-matching service: matching the send rate to the receiving app's drain rate.


TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Fields in order:]
source port, dest port
sequence number
acknowledgement number
head len, not used, flag bits U A P R S F, Receive window
checksum, Urg data pointer
Options (variable length)
application data (variable length)

Notes on the fields:
sequence number, acknowledgement number: counting by bytes of data (not segments)
URG: urgent data (generally not used)
ACK: ACK number valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection establishment (setup, teardown commands)
Receive window: number of bytes rcvr is willing to accept
checksum: Internet checksum (as in UDP)
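The field layout above can be sketched as a small decoder. This is only an illustration, assuming the standard 20-byte fixed header with no options; the names `TcpHeader`, `FLAG_BITS` and `parse_tcp_header` are made up for this example and are not part of the course material.

```python
import struct
from typing import NamedTuple


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int          # sequence number (counts bytes, not segments)
    ack: int          # acknowledgement number
    header_len: int   # header length in bytes
    flags: dict       # flag name -> bool
    rcv_window: int   # bytes the receiver is willing to accept
    checksum: int
    urg_ptr: int


# Bit positions of the six flags within the flags byte.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(data: bytes) -> TcpHeader:
    """Decode the 20 fixed bytes of a TCP header (options are ignored)."""
    src, dst, seq, ack, off, flags, win, csum, urg = struct.unpack(
        "!HHIIBBHHH", data[:20])
    return TcpHeader(src, dst, seq, ack,
                     (off >> 4) * 4,   # head len is stored in 32-bit words
                     {name: bool(flags & bit) for name, bit in FLAG_BITS.items()},
                     win, csum, urg)


# A hand-built SYN segment: ports 1234 -> 80, seq=1000, 20-byte header.
syn = struct.pack("!HHIIBBHHH", 1234, 80, 1000, 0, 5 << 4, 0x02, 4096, 0, 0)
hdr = parse_tcp_header(syn)
```

Here `hdr.flags["SYN"]` is true and `hdr.rcv_window` carries the advertised window used by flow control.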


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments.)

Spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments.
Sender limits unACKed data to RcvWindow: guarantees receive buffer doesn't overflow.
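The RcvWindow formula can be traced with a few counters. This is a minimal sketch under the slide's assumption that out-of-order segments are discarded; the class `ReceiveBuffer` and its method names are hypothetical.

```python
class ReceiveBuffer:
    """Tracks the byte counters used in the RcvWindow formula."""

    def __init__(self, rcv_buffer: int):
        self.rcv_buffer = rcv_buffer   # RcvBuffer: total buffer size in bytes
        self.last_byte_rcvd = 0        # LastByteRcvd
        self.last_byte_read = 0        # LastByteRead

    def rcv_window(self) -> int:
        # RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
        return self.rcv_buffer - (self.last_byte_rcvd - self.last_byte_read)

    def receive(self, nbytes: int) -> None:
        # A sender that respects the advertised window never exceeds it.
        if nbytes > self.rcv_window():
            raise ValueError("sender exceeded advertised RcvWindow")
        self.last_byte_rcvd += nbytes

    def app_read(self, nbytes: int) -> int:
        # The application can drain at most what is currently buffered.
        n = min(nbytes, self.last_byte_rcvd - self.last_byte_read)
        self.last_byte_read += n
        return n


buf = ReceiveBuffer(rcv_buffer=4096)
buf.receive(3000)
window_after_receive = buf.rcv_window()   # 4096 - 3000
buf.app_read(1000)
window_after_read = buf.rcv_window()      # 4096 - (3000 - 1000)
```

The window shrinks as data arrives and grows again only when the application reads, which is what makes this a speed-matching service.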


                                                                                                                        Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
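The deadlock and its fix can be seen in a toy simulation. This is a sketch, assuming one probe per tick and an application that drains the whole buffer at a single tick; the function name and its parameters are invented for illustration.

```python
def ticks_until_window_opens(rcv_buffer, app_reads_at_tick, max_ticks=10, probe=True):
    """Return the tick at which the sender learns RcvWindow > 0, or None.

    The receiver starts with a full buffer (RcvWindow = 0) and reports its
    window only in ACKs, that is, only in response to an arriving segment.
    """
    buffered = rcv_buffer
    for tick in range(1, max_ticks + 1):
        if tick == app_reads_at_tick:
            buffered = 0              # application drains the buffer
        if not probe:
            continue                  # nothing arrives, so no ACK is ever sent
        # The one-byte probe triggers an ACK carrying the current window.
        window = rcv_buffer - buffered
        if window > 0:
            return tick
    return None


with_probe = ticks_until_window_opens(4096, app_reads_at_tick=3)
without_probe = ticks_until_window_opens(4096, app_reads_at_tick=3, probe=False)
```

With probing the sender hears about the open window as soon as the application reads; without it the sender waits forever, even though buffer space is available.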


                                                                                                                        Note on UDP

                                                                                                                        UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
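For contrast with TCP, the UDP behaviour amounts to a bounded queue with tail drop. This is a rough sketch: real socket buffers are sized in bytes, while this hypothetical `UdpSocketBuffer` counts datagrams to keep the example short.

```python
from collections import deque


class UdpSocketBuffer:
    """Bounded receive queue: arrivals are dropped when it is full."""

    def __init__(self, capacity: int):
        self.capacity = capacity      # assumption: capacity counted in datagrams
        self.queue = deque()
        self.dropped = 0

    def deliver(self, datagram: bytes) -> bool:
        if len(self.queue) >= self.capacity:
            self.dropped += 1         # no feedback to the sender: packet is lost
            return False
        self.queue.append(datagram)
        return True


sock = UdpSocketBuffer(capacity=3)
results = [sock.deliver(b"x") for _ in range(5)]
```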


                                                                                                                        TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
They initialize TCP variables:
  sequence numbers
  buffers, flow control info (e.g. RcvWindow)

Client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);

Server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends TCP SYN control segment to server.
  Specifies client_isn, the initial sequence number.
  No application data.

Step 2: server end system receives SYN, replies with SYNACK control segment.
  ACKs received SYN.
  Allocates buffers.
  Replies with client_isn+1 in ACK field to signal synchronization.
  Specifies server_isn.
  No application data.


                                                                                                                        TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and server_isn+1 in the ACK field.
  Allocates buffers.
  Can include application data.
  SYN=0 signals that the connection is established.
  server_isn+1 signals that the client is synchronized.

[Figure: three-way handshake]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
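The numbering in the three steps can be checked with a short sketch. It only models the SYN bit and the seq/ack fields; the `Segment` class and the example ISN values are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None   # None means the ACK field is not in use


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    # Step 1: client -> server, SYN carrying the client's initial seq number.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server -> client, SYNACK acknowledging client_isn + 1.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client -> server, SYN=0, acknowledging server_isn + 1.
    ack = Segment(syn=0, seq=synack.ack, ack=synack.seq + 1)
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
```

Each side acknowledges the other's ISN plus one, which is how both ends confirm they are synchronized.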


                                                                                                                        TCP Connection Management (cont)

                                                                                                                        Closing a connection

Client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server.

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close timeline. Client: close, sends FIN. Server: replies with ACK, then close, sends FIN. Client: replies with ACK, timed wait, closed.]


                                                                                                                        TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost).
  Closes down after timed wait.

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

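[Figure: connection close timeline, continued. Client: closing, sends FIN. Server: replies with ACK, then closing, sends FIN. Client: replies with ACK, timed wait, closed. Server: closed.]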


                                                                                                                        TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                        A few special cases

                                                                                                                        Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                        It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment
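The four-step close, and the special case where the server's ACK and FIN share one segment, can be sketched as a sequence of segments and states. The state names follow the usual TCP state diagram and do not appear in the slide text above; the function itself is hypothetical.

```python
def close_connection(piggyback_ack_on_fin: bool = False):
    """Return (segments, client_states, server_states) for a client-initiated close."""
    segments = [("client", "FIN")]                 # Step 1
    client = ["ESTABLISHED", "FIN_WAIT_1"]
    server = ["ESTABLISHED", "CLOSE_WAIT"]
    if piggyback_ack_on_fin:
        # Special case: the server's ACK and FIN travel in the same segment.
        segments.append(("server", "FIN+ACK"))
    else:
        segments += [("server", "ACK"), ("server", "FIN")]   # Step 2
        client.append("FIN_WAIT_2")
    server.append("LAST_ACK")
    segments.append(("client", "ACK"))             # Step 3
    client += ["TIME_WAIT", "CLOSED"]              # timed wait, then closed
    server.append("CLOSED")                        # Step 4
    return segments, client, server


segs, client_states, server_states = close_connection()
combined_segs, _, _ = close_connection(piggyback_ack_on_fin=True)
```

In the combined case only three segments cross the network, and the client goes from FIN_WAIT_1 straight to the timed wait.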


                                                                                                                        Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(λin: original data; λ'in: original data plus retransmitted data; λout: goodput)

(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "Perfect" retransmission, only when loss: λ'in > λout
(c) Retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"Costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three throughput plots, panels (a), (b) and (c)]
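The "more work for given goodput" cost can be made concrete with a back-of-the-envelope calculation. The copy counts used below are hypothetical numbers chosen only to illustrate cases (a), (b) and (c); they are not taken from the slides.

```python
def offered_load(goodput: float, copies_per_delivered_packet: float) -> float:
    """Offered load (λ'in) needed to sustain a given goodput (λout) when each
    delivered packet is transmitted this many times on average."""
    if copies_per_delivered_packet < 1:
        raise ValueError("every delivered packet is sent at least once")
    return goodput * copies_per_delivered_packet


case_a = offered_load(100.0, 1.0)    # (a) no retransmissions: λ'in = λout
case_b = offered_load(100.0, 1.25)   # (b) hypothetical: one in four packets resent after loss
case_c = offered_load(100.0, 2.0)    # (c) hypothetical: every packet is sent twice
```

For the same goodput the link has to carry more traffic in (b), and more still in (c) where some of the copies were never needed.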


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when a packet is dropped, any upstream transmission capacity used for that packet was wasted!


                                                                                                                        Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    (small exception: see next page)


Case study: ATM ABR congestion control

Two-byte ER (explicit rate) field in RM cell:
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
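How the ER field and the NI/CI bits accumulate along a path can be sketched as follows. This is a simplified model with assumed names (`RmCell`, `Switch`, `send_rm_cell`) and made-up rates; real ABR switches apply more detailed rules.

```python
from dataclasses import dataclass


@dataclass
class RmCell:
    er: float          # explicit rate requested by the sender
    ci: bool = False   # congestion indication
    ni: bool = False   # no increase


@dataclass
class Switch:
    supportable_rate: float
    mild_congestion: bool = False
    severe_congestion: bool = False

    def process(self, cell: RmCell) -> None:
        # A switch may lower ER but never raises it.
        cell.er = min(cell.er, self.supportable_rate)
        cell.ni = cell.ni or self.mild_congestion
        cell.ci = cell.ci or self.severe_congestion


def send_rm_cell(requested_rate, path, efci_seen_at_destination=False):
    """Pass one RM cell through every switch, then let the destination return it."""
    cell = RmCell(er=requested_rate)
    for switch in path:
        switch.process(cell)
    # Destination sets CI if the preceding data cell carried EFCI=1.
    cell.ci = cell.ci or efci_seen_at_destination
    return cell


path = [Switch(10.0), Switch(4.0, mild_congestion=True), Switch(8.0)]
returned = send_rm_cell(requested_rate=12.0, path=path)
returned_efci = send_rm_cell(12.0, path, efci_seen_at_destination=True)
```

The returned ER equals the minimum supportable rate on the path, which is the rate the sender should then use.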


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec
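The throughput formula translates directly into code. The window size and timing values below are illustrative assumptions.

```python
def tcp_throughput(w_segments: int, mss_bytes: int, rtt_sec: float) -> float:
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    return w_segments * mss_bytes / rtt_sec


# Assumed example: window of 10 segments, MSS = 500 bytes, RTT = 200 msec.
rate = tcp_throughput(w_segments=10, mss_bytes=500, rtt_sec=0.2)
```

Doubling the window doubles the rate, while doubling the RTT halves it.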


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
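The sender-side limit is a one-line check. This sketch follows the slide's simplification that the receive buffer never overflows, so only CongWin constrains the sender; the function name is hypothetical.

```python
def bytes_sender_may_send(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """How many more bytes fit under LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked   # unACKed data
    return max(0, cong_win - in_flight)


room = bytes_sender_may_send(last_byte_sent=5000, last_byte_acked=3000, cong_win=4000)
blocked = bytes_sender_may_send(last_byte_sent=5000, last_byte_acked=1000, cong_win=4000)
```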


                                                                                                                        TCP AIMDmultiplicative decrease additive increase increase

                                                                                                                        CongWin by 1 MSS every RTT in the absence of loss events probing also known ascongestion avoidance

                                                                                                                        cut CongWin in half after loss event

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
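The AIMD behavior can be sketched with a short simulation. This is only an illustration under a strong simplifying assumption: a loss event occurs exactly when the window reaches a fixed size `loss_at_mss` (a hypothetical parameter, not something from the slides). The window is counted in MSS units, one entry per RTT.

```python
def aimd_trace(start_mss: int, loss_at_mss: int, rounds: int) -> list[int]:
    """Return the congestion window (in MSS) at the start of each RTT.

    Assumption: a loss event happens whenever the window reaches loss_at_mss.
    """
    cong_win = start_mss
    trace = []
    for _ in range(rounds):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            cong_win = max(1, cong_win // 2)  # multiplicative decrease
        else:
            cong_win += 1                     # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(4, 8, 10))  # [4, 5, 6, 7, 8, 4, 5, 6, 7, 8]
```

The repeating climb-and-halve pattern is the sawtooth that the figure shows for a long-lived connection.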


TCP Slow Start

When the connection begins, CongWin = 1 MSS
- Example: MSS = 500 bytes and RTT = 200 msec, so the initial rate = 20 kbps
- available bandwidth may be >> MSS/RTT
- desirable to quickly ramp up to a respectable rate

When the connection begins, increase the rate exponentially fast until the first loss event
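The 20 kbps figure in the example follows from sending one MSS per RTT. A small sketch of the arithmetic (the function name is hypothetical; RTT is passed in milliseconds to keep the numbers exact):

```python
def initial_rate_bps(mss_bytes: int, rtt_ms: int) -> float:
    """Rate in bits/sec when exactly one MSS is sent per RTT."""
    return mss_bytes * 8 * 1000 / rtt_ms


print(initial_rate_bps(500, 200))  # 20000.0 bits/sec, i.e. 20 kbps
```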


TCP Slow Start (more)

When the connection begins, increase the rate exponentially until the first loss event:
- double CongWin every RTT
- done by incrementing CongWin by 1 MSS for every ACK received

Summary: the initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment in the first RTT, two segments in the second, and four segments in the third]
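Why a per-ACK increment doubles the window each RTT can be seen in a short sketch. It assumes no loss, one ACK per segment, and that all ACKs for a window arrive within the same RTT; the function name is hypothetical.

```python
def slow_start_rounds(rounds: int) -> list[int]:
    """Segments sent in each RTT when CongWin grows by 1 MSS per ACK."""
    cong_win = 1  # in MSS
    sent_per_round = []
    for _ in range(rounds):
        sent_per_round.append(cong_win)
        acks = cong_win          # one ACK comes back per segment sent
        for _ in range(acks):
            cong_win += 1        # +1 MSS for every ACK received
    return sent_per_round


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```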


So far
- Slow Start ramps up exponentially
- followed by AIMD (sawtooth pattern)

Reality (TCP Reno)
- Introduce a new variable, threshold
- threshold is initially very large
- Slow Start's exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
- Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
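The difference between Tahoe and Reno on a loss event can be sketched as one function. The names `on_loss`, `"triple_dup_ack"` and `"timeout"` are hypothetical labels chosen for illustration, the window is counted in MSS, and the floor of 1 MSS on the threshold is an added assumption.

```python
def on_loss(cong_win: int, event: str, variant: str = "reno"):
    """Return (new CongWin, new threshold, next phase) after a loss event.

    event is "timeout" or "triple_dup_ack"; variant is "reno" or "tahoe".
    """
    threshold = max(cong_win // 2, 1)  # assumption: never below 1 MSS
    if event == "timeout" or variant == "tahoe":
        # Tahoe always does this; Reno only after a timeout.
        return 1, threshold, "slow_start"
    # Reno, 3 dup ACKs: halve the window and skip slow start.
    return threshold, threshold, "congestion_avoidance"


print(on_loss(16, "triple_dup_ack", "reno"))   # (8, 8, 'congestion_avoidance')
print(on_loss(16, "triple_dup_ack", "tahoe"))  # (1, 8, 'slow_start')
```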


Summary: TCP Congestion Control

- When CongWin is below Threshold, the sender is in the slow-start phase; the window grows exponentially.
- When CongWin is above Threshold, the sender is in the congestion-avoidance phase; the window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
- When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                        The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
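The event table above maps naturally onto a small state machine. The following is a sketch only, not a real TCP stack: the class name `RenoSender`, the MSS value of 1000 bytes, the initial threshold, and the 1-MSS floor applied to Threshold are all illustrative assumptions.

```python
from dataclasses import dataclass

MSS = 1000  # bytes; illustrative value, not from the slides


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64 * MSS   # "initially very large" (assumed value)
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS                      # doubles CongWin per RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)  # about +1 MSS per RTT

    def on_dup_ack(self) -> None:
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, MSS)  # floor is an assumption
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self) -> None:
        self.threshold = max(self.cong_win / 2, MSS)      # floor is an assumption
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(threshold=4 * MSS)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win)  # CA 5000
```

Note that the first and second duplicate ACKs only bump the counter, matching the last row of the table.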


TCP throughput

What is the average throughput of TCP as a function of window size and RTT?

- Ignore slow start
- Let W be the window size when loss occurs
- When the window is W, throughput is W/RTT
- Just after a loss, the window drops to W/2 and throughput to W/(2 RTT)
- Average throughput: 0.75 W/RTT
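The 0.75 factor comes from the window climbing linearly from W/2 to W between losses, so its average is the midpoint. A quick numerical check, assuming an even W and one window value per RTT (the function name is hypothetical):

```python
def average_window(w_max: int) -> float:
    """Mean window (in segments) over one sawtooth cycle from W/2 up to W."""
    windows = range(w_max // 2, w_max + 1)  # W/2, W/2 + 1, ..., W
    return sum(windows) / len(windows)


print(average_window(100))  # 75.0, i.e. 0.75 * W segments per RTT
```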


TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83333 in-flight segments
- Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

- This requires L = 2 * 10^-10. Wow!
- New versions of TCP are needed for high-speed networks
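Both numbers in this example can be reproduced directly. The sketch below inverts the loss-rate formula, throughput = 1.22 MSS / (RTT sqrt(L)), to solve for L; the function names are hypothetical.

```python
def required_window(rate_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """In-flight segments needed to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Loss rate L that yields rate_bps under throughput = 1.22*MSS/(RTT*sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


print(required_window(10e9, 0.1, 1500))     # roughly 83333 segments
print(required_loss_rate(10e9, 0.1, 1500))  # roughly 2e-10
```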


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
- Additive increase gives a slope of 1 as throughput increases
- Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes marked R, with the equal-bandwidth-share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
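The convergence toward the equal-share line can be sketched numerically. This toy model assumes both connections have the same RTT and both see a loss whenever their combined rate exceeds the link capacity; those synchronization assumptions, and the function name, are for illustration only.

```python
def two_flow_aimd(x1: float, x2: float, capacity: float, steps: int):
    """Evolve two AIMD flows sharing one link; return their final rates."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # loss: both halve (gap between them halves)
        else:
            x1, x2 = x1 + 1, x2 + 1   # additive increase (gap unchanged)
    return x1, x2


a, b = two_flow_aimd(10, 80, 100, 500)
print(abs(a - b) < 1)  # True: the rates have nearly equalized
```

Additive increase moves both flows along a slope-1 line and leaves their difference unchanged, while each multiplicative decrease halves the difference, so the gap shrinks toward zero.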


Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
- Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP connection, gets rate R/10
  - new app asks for 11 TCP connections, gets about R/2 (11R/20)
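The parallel-connection example is a per-connection equal split of the link. A minimal sketch, assuming every connection gets the same share (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(existing: int, opened: int) -> Fraction:
    """Fraction of link rate R obtained by an app that opens `opened`
    connections when `existing` connections already share the link."""
    return Fraction(opened, existing + opened)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20, slightly more than half of R
```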


TCP Latency Modeling

Notation, assumptions
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size
- First assume a fixed congestion window of W segments
- Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent
   Latency = 2 RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent
   Latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2 RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
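The two fixed-window cases can be combined in one function. This is a sketch under the model's assumptions (single link, no loss); taking K as the number of windows covering the object, rounded up, is an assumption about how K is computed in the fixed-window case, and the function name is hypothetical.

```python
import math


def fixed_window_latency(O: float, S: float, R: float,
                         RTT: float, W: int) -> float:
    """Latency to fetch an object of O bits with a fixed window of W segments.

    S = segment size (bits), R = link rate (bits/sec), RTT in seconds.
    """
    K = math.ceil(O / (W * S))           # assumed: windows needed to cover O
    stall = S / R + RTT - W * S / R      # idle time per window, if positive
    if stall <= 0:
        return 2 * RTT + O / R           # case 1: the server never idles
    return 2 * RTT + O / R + (K - 1) * stall  # case 2: idle after each window but the last


# S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(8000, 1000, 1000, 2, W=4))  # 12.0 (case 1)
print(fixed_window_latency(8000, 1000, 1000, 2, W=2))  # 15.0 (case 2)
```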


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

We will show that the delay for one object is

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. The client initiates the TCP connection and requests the object; the server sends a first window = S/R, second window = 2S/R, third window = 4S/R and fourth window = 8S/R, after which transmission is complete and the object is delivered. Axes: time at client, time at server.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

The server idles P = 2 times.

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit the object
- time the server idles due to slow start

The server idles P = min{K-1, Q} times.
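The example values can be reproduced with a short sketch of the slow-start latency model. K and Q are computed by direct counting from their definitions rather than from closed forms; the function names and the specific values of S, R and RTT used below are illustrative assumptions chosen so that O/S = 15 and Q = 2.

```python
def windows_to_cover(O: int, S: int) -> int:
    """K: smallest k such that windows of 1, 2, ..., 2^(k-1) segments cover O."""
    k = 1
    while (2**k - 1) * S < O:
        k += 1
    return k


def idle_count_infinite(S: float, R: float, RTT: float) -> int:
    """Q: number of windows after which the server would idle
    if the object were infinitely large."""
    q = 0
    # Idle time after window q+1 is S/R + RTT - 2^q * S/R.
    while S / R + RTT - 2**q * S / R > 0:
        q += 1
    return q


def slow_start_latency(O: int, S: int, R: float, RTT: float):
    """Return (latency, K, Q, P) for the slow-start latency model."""
    K = windows_to_cover(O, S)
    Q = idle_count_infinite(S, R, RTT)
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2**P - 1) * S / R
    return latency, K, Q, P


# O/S = 15 segments, S/R = 1 s, RTT = 2 s.
print(slow_start_latency(15000, 1000, 1000, 2))  # (22.0, 4, 2, 2)
```

With these inputs the idle times after the first two windows are 2 s and 1 s, so the total is 2 RTT + O/R + 3 = 22 s, matching the closed form.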


                                                                                                                        TCP Latency Modeling (3)

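$$\frac{S}{R} + RTT = \text{time from when the server starts to send a segment until the server receives an acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$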



TCP Latency Modeling (4)

Recall K = number of windows that cover the object

How do we calculate K?

                                                                                                                        ⎥⎥⎤

                                                                                                                        ⎢⎢⎡ +=

                                                                                                                        +ge=

                                                                                                                        geminus=

                                                                                                                        ge+++=

                                                                                                                        ge+++=minus

                                                                                                                        minus

                                                                                                                        )1(log

                                                                                                                        )1(logmin

                                                                                                                        12min

                                                                                                                        222min222min

                                                                                                                        2

                                                                                                                        2

                                                                                                                        110

                                                                                                                        110

                                                                                                                        SO

                                                                                                                        SOkk

                                                                                                                        SOk

                                                                                                                        SOkOSSSkK

                                                                                                                        k

                                                                                                                        k

                                                                                                                        k

                                                                                                                        L

                                                                                                                        L

The calculation of Q, the number of idle periods for an infinite-size object, is similar.
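The value of K can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and it assumes O and S are given in the same unit (bits). It counts doubling windows until the object is covered and compares the count with the closed form ⌈log2(O/S + 1)⌉.

```python
import math


def windows_by_counting(object_bits: int, segment_bits: int) -> int:
    """Count slow-start windows (S, 2S, 4S, ...) until they cover the object."""
    k = 0
    covered = 0
    window = segment_bits
    while covered < object_bits:
        covered += window   # this window's worth of data is sent
        window *= 2         # slow start doubles the window each round
        k += 1
    return k


def windows_closed_form(object_bits: int, segment_bits: int) -> int:
    """Closed form K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


if __name__ == "__main__":
    S = 536 * 8  # an assumed segment size in bits, chosen only for the example
    for segments in (1, 15, 16, 100):
        O = segments * S
        print(segments, windows_by_counting(O, S), windows_closed_form(O, S))
```

An object of exactly 15 segments fits in four windows (1 + 2 + 4 + 8), while 16 segments need a fifth.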


HTTP Modeling

Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
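The three formulas can be compared side by side with a small calculation. This is a rough sketch, not part of the original slides: the function name is hypothetical, the "sum of idle times" term is left out for simplicity, and 1 Kbyte is assumed to be 1000 bytes.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int) -> dict:
    """Evaluate the three response-time formulas, ignoring slow-start idle times.

    O: object size in bits, R: link rate in bits/sec, RTT: seconds,
    M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }


if __name__ == "__main__":
    O = 5 * 1000 * 8  # 5 Kbytes in bits (assuming 1 Kbyte = 1000 bytes)
    for rate in (28e3, 100e3, 1e6, 10e6):
        print(rate, http_response_times(O, rate, RTT=0.1, M=10, X=5))
```

Because the idle-time term is dropped, the printed values are lower bounds on what the full model gives.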


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections only give a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3 Summary

Principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

Instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"



TCP segment structure

Segment layout (each row is 32 bits wide):
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flag bits U A P R S F, receive window
  checksum, urgent data pointer
  options (variable length)
  application data (variable length)

Field notes:
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  receive window: # bytes receiver is willing to accept
  checksum: Internet checksum (as in UDP)
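The layout above maps directly onto a fixed 20-byte header followed by optional fields. The sketch below is an illustration under stated assumptions, not a full implementation: the function name and dictionary keys are hypothetical, options are skipped rather than decoded, and only the six flag bits named on the slide are extracted.

```python
import struct

# Flag bit positions within the low 6 bits of the 16-bit offset/flags word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_segment(segment: bytes) -> dict:
    """Split a raw TCP segment into its fixed header fields and payload."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    # Network byte order: two 16-bit ports, two 32-bit numbers, four 16-bit fields.
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4  # head len is counted in 32-bit words
    flags = {name: bool(offset_flags & bit) for name, bit in FLAG_BITS.items()}
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "recv_window": window, "checksum": checksum, "urgent_ptr": urgent,
        "data": segment[header_len:],  # whatever follows the header and options
    }


if __name__ == "__main__":
    # Hand-built example: header length 5 words, ACK flag set, one byte of data.
    raw = struct.pack("!HHIIHHHH", 1234, 23, 42, 79, (5 << 12) | 0x10, 4096, 0, 0) + b"C"
    print(parse_tcp_segment(raw))
```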


                                                                                                                          TCP seq rsquos and ACKsSeq rsquos

                                                                                                                          byte stream ldquonumberrdquo of first byte in segmentrsquos data

                                                                                                                          ACKsseq of next byte expected from other sidecumulative ACK

                                                                                                                          Q how receiver handles out-of-order segments

                                                                                                                          A TCP spec doesnrsquot say - up to implementer

[Figure: simple telnet scenario; time runs downward between Host A and Host B]

Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
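The numbers in the telnet scenario follow from two rules: a segment's seq # is the next byte the sender has to offer, and its ACK # is the next byte it expects from the other side. The sketch below recomputes the exchange under those rules. The `Segment` class and `reply_to` helper are hypothetical names introduced only for illustration, and real TCP details (handshake, wraparound, window) are left out.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int            # byte-stream number of the first data byte
    ack: int            # next byte expected from the other side (cumulative)
    data: bytes = b""


def reply_to(received: Segment, data: bytes = b"") -> Segment:
    """Build the segment a host sends in response to `received`."""
    return Segment(
        seq=received.ack,                        # peer told us which byte it expects next
        ack=received.seq + len(received.data),   # we now expect the byte after what arrived
        data=data,
    )


first = Segment(seq=42, ack=79, data=b"C")   # Host A: user types 'C'
echo = reply_to(first, data=b"C")            # Host B: ACKs 'C' and echoes it back
final = reply_to(echo)                       # Host A: ACKs the echoed 'C', no data

print(echo.seq, echo.ack)    # 79 43
print(final.seq, final.ack)  # 43 80
```

Because the final segment carries no data, Host A's next data byte would still be sent with Seq=43.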


TCP Round Trip Time and Timeout

Q: How to estimate RTT?
- SampleRTT: measured time from segment transmission until ACK receipt
  - ignore retransmissions
- SampleRTT will vary; want estimated RTT "smoother"
  - average several recent measurements, not just the current SampleRTT

Q: How to set the TCP timeout value?
- longer than RTT
  - but RTT varies
- too short: premature timeout
  - unnecessary retransmissions
- too long: slow reaction to segment loss


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

- Exponential weighted moving average
- influence of a past sample decreases exponentially fast
- typical value: α = 0.125
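As a rough illustration of the moving average, the sketch below applies the update to a few made-up samples. The function name, the starting estimate and the sample values are assumptions chosen for the example; only the formula and α = 0.125 come from the slide.

```python
ALPHA = 0.125  # typical weight given to the newest sample


def update_estimated_rtt(estimated_rtt: float, sample_rtt: float,
                         alpha: float = ALPHA) -> float:
    """One EWMA step: blend the old estimate with the new sample."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


estimate = 100.0                       # hypothetical starting estimate, in ms
for sample in [100.0, 180.0, 100.0]:   # hypothetical SampleRTT values, in ms
    estimate = update_estimated_rtt(estimate, sample)

print(estimate)  # 108.75
```

The single 180 ms spike moves the estimate by only 10 ms, and its weight shrinks by a factor of (1 - α) with every later sample, which is why the estimated curve is smoother than the raw samples.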


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT]


TCP Round Trip Time and Timeout

Setting the timeout
- EstimatedRTT plus a "safety margin"
  - large variation in EstimatedRTT -> larger safety margin
- first, estimate how much SampleRTT deviates from EstimatedRTT:

  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

  (typically, β = 0.25)

Then set the timeout interval:

  TimeoutInterval = EstimatedRTT + 4 * DevRTT
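The two updates and the timeout formula can be combined into one small estimator, sketched below. Two choices here are assumptions rather than something the slides state: the deviation is computed against the estimate from before the current sample is folded in, and the initial deviation is set to half of the first sample (both follow the convention of RFC 6298). The class name and the sample values are hypothetical.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval from them."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2   # assumption: initial deviation, as in RFC 6298

    def update(self, sample_rtt: float) -> None:
        # Assumption: deviation is measured against the previous estimate,
        # so DevRTT is updated before EstimatedRTT.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # values in ms, chosen for illustration
est.update(120.0)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval)  # 102.5 42.5 272.5
```

Samples measured on retransmitted segments should not be passed to `update`, in line with the earlier "ignore retransmissions" rule.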


TCP reliable data transfer

- TCP creates an rdt service on top of IP's unreliable service
- Pipelined segments
- Cumulative ACKs
- TCP uses a single retransmission timer
- Retransmissions are triggered by:
  - timeout events
  - duplicate ACKs
- Initially consider a simplified TCP sender:
  - ignore duplicate ACKs
  - ignore flow control, congestion control


TCP sender events

data rcvd from app:
- create segment with seq #
- seq # is the byte-stream number of the first data byte in the segment
- start timer if not already running (think of the timer as for the oldest unACKed segment)
- expiration interval: TimeOutInterval

timeout:
- retransmit the segment that caused the timeout
- restart timer

ACK rcvd:
- if it acknowledges previously unACKed segments:
  - update what is known to be ACKed
  - start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch (event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
    } /* end of loop forever */

Comment:
- SendBase-1: last cumulatively ACKed byte

Example:
- SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
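The event loop of the simplified sender can be mirrored in a few lines of Python, sketched below. This is a toy model under stated assumptions: the timer is just a boolean flag, "passing a segment to IP" means appending it to a list, and sequence-number wraparound is ignored. The class and attribute names are hypothetical.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified sender: no dup-ACK handling, no flow/congestion control."""

    def __init__(self, initial_seq_num: int):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq -> data for segments not yet acknowledged
        self.timer_running = False   # stand-in for the single retransmission timer
        self.sent = []               # every (seq, data) handed to "IP", in order

    def on_app_data(self, data: bytes) -> None:
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self) -> None:
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)      # not-yet-acknowledged segment with smallest seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True    # restart timer

    def on_ack(self, y: int) -> None:
        if y > self.send_base:       # ACK covers new data
            self.send_base = y
            # A cumulative ACK of y acknowledges every byte below y.
            self.unacked = {s: d for s, d in self.unacked.items() if s + len(d) > y}
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"x" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)    # Seq=100, 20 bytes
sender.on_ack(120)               # ACK=100 never arrived, ACK=120 covers both segments
print(sender.send_base, len(sender.sent), sender.timer_running)  # 120 2 False
```

In this run the single cumulative ACK moves SendBase to 120 and stops the timer, so neither segment is retransmitted.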


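TCP retransmission scenarios

[Figure: two timelines between Host A and Host B; time runs downward]

Lost ACK scenario:
1. Host A sends Seq=92, 8 bytes data
2. Host B replies ACK=100; the ACK is lost (X)
3. The timer for Seq=92 times out; Host A retransmits Seq=92, 8 bytes data
4. Host B replies ACK=100; SendBase = 100

Premature timeout scenario:
1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
2. Host B replies ACK=100, then ACK=120
3. The timer for Seq=92 times out before ACK=100 reaches Host A; Host A retransmits Seq=92, 8 bytes data
4. ACK=100 arrives (SendBase = 100), then ACK=120 arrives (SendBase = 120)
5. Host B answers the retransmission with ACK=120; SendBase stays at 120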


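TCP retransmission scenarios (more)

[Figure: timeline between Host A and Host B; time runs downward]

Cumulative ACK scenario:
1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
2. Host B replies ACK=100, which is lost (X), then ACK=120
3. ACK=120 reaches Host A before the timeout; SendBase = 120, so nothing is retransmitted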


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.
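The four event/action pairs above amount to a small decision table, which the sketch below encodes as one function. The function name, its parameters and the returned labels are hypothetical. The last branch (a segment below the expected seq #) is not one of the four events; re-ACKing it is an assumption made so the function is total.

```python
def ack_action(seg_seq: int, expected_seq: int,
               ack_pending: bool, gap_exists: bool) -> str:
    """Pick the receiver's ACK behaviour for one arriving segment."""
    if seg_seq > expected_seq:
        # Out-of-order arrival: a gap is detected.
        return "duplicate_ack_now"
    if seg_seq == expected_seq:
        if gap_exists:
            # Segment starts at the lower end of the gap.
            return "ack_now"
        if ack_pending:
            # One earlier in-order segment is still waiting for its ACK.
            return "cumulative_ack_now"
        # Everything before this segment is already ACKed.
        return "delayed_ack"
    # Assumption: an old, already-received segment is simply re-ACKed.
    return "duplicate_ack_now"


print(ack_action(100, 100, ack_pending=False, gap_exists=False))  # delayed_ack
print(ack_action(120, 100, ack_pending=False, gap_exists=False))  # duplicate_ack_now
```

A caller that gets `delayed_ack` would start a timer of at most 500 ms and send the ACK when it fires, unless another in-order segment arrives first.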


More on Sender Policies

Doubling the Timeout Interval
- Used by most TCP implementations
- If a timeout occurs, then after retransmission the Timeout Interval is doubled
- Intervals grow exponentially with each consecutive timeout
- When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
- Limited form of congestion control
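A minimal sketch of the doubling policy follows, assuming the base interval is EstimatedRTT + 4 * DevRTT as defined earlier. The class name, method names and the numeric values are hypothetical, and real implementations also cap the interval at some maximum, which is omitted here.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout."""

    def __init__(self, estimated_rtt: float, dev_rtt: float):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self._base_interval()

    def _base_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self) -> float:
        """Called after the retransmission: double the interval."""
        self.interval *= 2
        return self.interval

    def on_restart(self) -> float:
        """Called when new data or an ACK restarts the timer: reset to the base value."""
        self.interval = self._base_interval()
        return self.interval


timer = BackoffTimer(estimated_rtt=100.0, dev_rtt=10.0)   # ms, illustrative values
print([timer.on_timeout() for _ in range(3)])  # [280.0, 560.0, 1120.0]
print(timer.on_restart())                      # 140.0
```

Backing off in this way makes a sender that keeps timing out retransmit less and less often, which is the sense in which it acts as a limited form of congestion control.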


Fast Retransmit

- Time-out period is often relatively long:
  - long delay before resending a lost packet
- Detect lost segments via duplicate ACKs
  - Sender often sends many segments back-to-back
  - If a segment is lost, there will likely be many duplicate ACKs
- If the sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  - fast retransmit: resend the segment before the timer expires


Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {    /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y    /* fast retransmit */
            }
        }
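The duplicate-ACK branch can be sketched in Python as below. This is an illustration only: the timer is left out, stale ACKs below SendBase are ignored (an assumption), and the class name, attributes and segment sizes are hypothetical.

```python
class FastRetransmitSender:
    """ACK handling with a duplicate-ACK counter; timer handling omitted."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base: int, segments: dict):
        self.send_base = send_base
        self.unacked = dict(segments)   # seq -> data, not yet acknowledged
        self.dup_acks = 0               # duplicate ACKs seen for the current send_base
        self.retransmitted = []         # seq #s resent by fast retransmit

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            # New data acknowledged: advance and reset the duplicate counter.
            self.send_base = y
            self.dup_acks = 0
            self.unacked = {s: d for s, d in self.unacked.items() if s + len(d) > y}
        elif y == self.send_base:
            # Duplicate ACK: the receiver is still waiting for byte y.
            self.dup_acks += 1
            if self.dup_acks == self.DUP_ACK_THRESHOLD and y in self.unacked:
                self.retransmitted.append(y)   # fast retransmit of segment y
        # Assumption: ACKs with y < send_base are stale and ignored.


segments = {92: b"a" * 8, 100: b"b" * 20, 120: b"c" * 20,
            140: b"d" * 20, 160: b"e" * 20}
sender = FastRetransmitSender(send_base=92, segments=segments)
for ack in [100, 100, 100, 100]:   # segment 100 was lost; 120, 140, 160 each trigger ACK=100
    sender.on_ack(ack)
print(sender.retransmitted)  # [100]
```

The first ACK=100 is an ordinary ACK for the segment starting at 92; the next three are duplicates, and the third duplicate triggers the resend.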


TCP: GBN or Selective Repeat?

- Basic TCP looks a lot like GBN
- Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
- This looks a lot like Selective Repeat
- TCP is a hybrid


TCP Flow Control

- Sender should not overwhelm the receiver's capacity to receive data
- If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
- Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

- receive side of TCP connection has a receive buffer
- speed-matching service: matching the send rate to the receiving app's drain rate
- app process may be slow at reading from buffer


TCP segment structure

[Figure: TCP segment layout, 32 bits wide; rows from top to bottom]

source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | Receive window
checksum | Urg data pnter
Options (variable length)
application data (variable length)

Field notes:
- sequence and acknowledgement numbers: counting by bytes of data (not segments)
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)


                                                                                                                          TCP Flow control how it works

                                                                                                                          (Suppose TCP receiver discards out-of-order segments)spare room in buffer

                                                                                                                          = RcvWindow= RcvBuffer-[LastByteRcvd -

                                                                                                                          LastByteRead]

                                                                                                                          Rcvr advertises spare room by including value of RcvWindow in segmentsSender limits unACKeddata to RcvWindow

                                                                                                                          guarantees receive buffer doesnrsquot overflow
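The window arithmetic above can be sketched in a few lines of Java. This is only an illustration: the class name `FlowControlSketch` and the method names are hypothetical, and the variable names simply mirror the slide.

```java
// Illustrative sketch only; class and method names are hypothetical.
final class FlowControlSketch {
    // Spare room in the receive buffer: RcvBuffer - (LastByteRcvd - LastByteRead).
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        long buffered = lastByteRcvd - lastByteRead; // received but not yet read by the app
        return rcvBuffer - buffered;
    }

    // Sender-side check: can `len` more bytes be sent while keeping
    // unACKed data within the advertised window?
    static boolean maySend(long lastByteSent, long lastByteAcked, long len, long rcvWindow) {
        long unacked = lastByteSent - lastByteAcked;
        return unacked + len <= rcvWindow;
    }

    public static void main(String[] args) {
        long win = rcvWindow(4096, 3000, 1000); // 2000 bytes are sitting in the buffer
        System.out.println("RcvWindow = " + win);
        System.out.println("may send 500 more bytes: " + maySend(5000, 4000, 500, win));
    }
}
```

The buffer size and byte counters in `main` are made-up values chosen only to exercise the formula.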

Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer.
- Sender does not transmit new packets until it hears RcvWindow > 0.
- Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
- DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
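A toy simulation may help show why the one-byte probes break the deadlock. Everything here is an assumption made for illustration: the `Receiver` class, the tick-based timing, and the rule that the application reads just before each probe arrives are not taken from the TCP specification.

```java
// Toy model of zero-window probing; all names and the timing model are hypothetical.
final class ZeroWindowProbeSketch {
    static final class Receiver {
        final int rcvBuffer; // total buffer size in bytes
        int buffered;        // bytes currently held for the application

        Receiver(int rcvBuffer, int buffered) {
            this.rcvBuffer = rcvBuffer;
            this.buffered = buffered;
        }

        // The application removes up to n bytes from the buffer.
        void appRead(int n) {
            buffered = Math.max(0, buffered - n);
        }

        // Stores as many of the n arriving bytes as fit and returns the
        // RcvWindow value that would be carried back in the ACK.
        int receive(int n) {
            int accepted = Math.min(n, rcvBuffer - buffered);
            buffered += accepted;
            return rcvBuffer - buffered;
        }
    }

    // Sends one 1-byte probe per tick. Returns how many probes were sent before
    // an ACK advertised RcvWindow > 0, or -1 if the window never opened.
    static int probesUntilWindowOpens(Receiver r, int[] appReadsPerTick) {
        int probes = 0;
        for (int read : appReadsPerTick) {
            r.appRead(read);
            probes++;
            if (r.receive(1) > 0) {
                return probes;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Receiver r = new Receiver(4, 4); // buffer starts full, so RcvWindow = 0
        System.out.println(probesUntilWindowOpens(r, new int[] {0, 0, 3}));
    }
}
```

Without the probes, the sender in this model would never learn that the application freed space at the third tick.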

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
- initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
- server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
- specifies client_isn, the initial seq #
- no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
- ACKs received SYN
- allocates buffers
- replies with client_isn+1 in ACK field to signal synchronization
- specifies server_isn
- no application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and ack=server_isn+1
- allocates buffers
- can include application data
- SYN=0 signals that the connection is established
- server_isn+1 signals that the client is synchronized

[Figure: three-way handshake timeline]
- client -> server: connection request (SYN=1, seq=client_isn)
- server -> client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
- client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
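The sequence and acknowledgement numbers exchanged in the three steps can be traced with a small Java sketch. The `Segment` class and the method names are hypothetical. The sketch models only the SYN flag, the ACK flag, and the two number fields, and it ignores 32-bit sequence-number wraparound.

```java
// Illustrative sketch of the handshake bookkeeping; not a real TCP implementation.
final class HandshakeSketch {
    // Only the header fields the handshake uses.
    static final class Segment {
        final boolean syn;
        final boolean ack;
        final long seq;
        final long ackNum;

        Segment(boolean syn, boolean ack, long seq, long ackNum) {
            this.syn = syn;
            this.ack = ack;
            this.seq = seq;
            this.ackNum = ackNum;
        }
    }

    // Step 1: client sends SYN carrying client_isn.
    static Segment clientSyn(long clientIsn) {
        return new Segment(true, false, clientIsn, 0);
    }

    // Step 2: server replies with SYNACK: its own ISN plus ack = client_isn + 1.
    static Segment serverSynAck(Segment syn, long serverIsn) {
        if (!syn.syn) {
            throw new IllegalArgumentException("expected a SYN segment");
        }
        return new Segment(true, true, serverIsn, syn.seq + 1);
    }

    // Step 3: client replies with SYN=0, seq = client_isn + 1, ack = server_isn + 1.
    static Segment clientAck(Segment syn, Segment synAck) {
        if (!synAck.syn || !synAck.ack || synAck.ackNum != syn.seq + 1) {
            throw new IllegalStateException("unexpected SYNACK");
        }
        return new Segment(false, true, syn.seq + 1, synAck.seq + 1);
    }

    public static void main(String[] args) {
        Segment syn = clientSyn(100);             // ISNs here are arbitrary example values
        Segment synAck = serverSynAck(syn, 300);
        Segment ack = clientAck(syn, synAck);
        System.out.println("final ACK: seq=" + ack.seq + " ack=" + ack.ackNum);
    }
}
```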

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server closing timeline. Client: close, sends FIN. Server: sends ACK, close, sends FIN. Client: sends ACK, timed wait, closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
- Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
- Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server closing timeline. Client: closing, sends FIN. Server: sends ACK, closing, sends FIN. Client: sends ACK, timed wait, closed. Server: closed.]
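The client side of the four closing steps can be read as a small state machine. The sketch below is one possible encoding, not the slides' own. The state names are borrowed from the standard TCP state diagram rather than from the surviving slide text, the class and event names are hypothetical, and simultaneous close is deliberately left out.

```java
// Illustrative client-side close sequence; simultaneous FINs are not modelled.
final class ClientCloseSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }
    enum Event { APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRES }

    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED:
                if (e == Event.APP_CLOSE) return State.FIN_WAIT_1; // Step 1: send FIN
                break;
            case FIN_WAIT_1:
                if (e == Event.RCV_ACK) return State.FIN_WAIT_2;   // server ACKed our FIN
                break;
            case FIN_WAIT_2:
                if (e == Event.RCV_FIN) return State.TIME_WAIT;    // Step 3: send ACK
                break;
            case TIME_WAIT:
                if (e == Event.RCV_FIN) return State.TIME_WAIT;    // re-ACK a retransmitted FIN
                if (e == Event.TIMED_WAIT_EXPIRES) return State.CLOSED;
                break;
            default:
                break;
        }
        throw new IllegalStateException("event " + e + " not handled in state " + s);
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] events = {
            Event.APP_CLOSE, Event.RCV_ACK, Event.RCV_FIN, Event.TIMED_WAIT_EXPIRES
        };
        for (Event e : events) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```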

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


Principles of Congestion Control

Congestion:
- informally: "too many sources sending too much data too fast for network to handle"
- different from flow control!
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queuing in router buffers)
- a top-10 problem!

Causes/costs of congestion: scenario 1
- two senders, two receivers
- one router, infinite buffers
- no retransmission
- send rate: 0 to C/2

- large delays when congested
- maximum achievable throughput

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet

- (a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
- (a) Magic transmission: only send when there's space in buffer
- (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
- (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the original data rate and $\lambda'_{in}$ is the offered load (original data plus retransmissions).

"costs" of congestion:
- (b) and (c): more work (retransmissions) for given "goodput"
- (c): unneeded retransmissions: link carries multiple copies of a packet

[Figure: throughput plots for cases (a), (b), (c)]

Causes/costs of congestion: scenario 3
- four senders
- multihop paths
- timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

[Figure: scenario 3 network diagram]

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
- when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
- "elastic service"
- if sender's path "underloaded": sender should use available bandwidth
- if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: severe congestion indicator
- RM cells returned to sender by receiver, with bits intact
  - small exception: see next page

Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus minimum supportable rate on path
- EFCI bit in data cells: set to 1 by congested switch
  - signals congestion
  - if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
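Because each switch may only lower the ER value, the value that comes back to the sender is the minimum rate supportable along the path. A minimal Java sketch of that idea follows. The class name, the method name, and the rate units are all hypothetical, and real ATM switches encode ER in a two-byte field rather than a plain integer.

```java
// Illustrative sketch: how the ER field ends up holding the path minimum.
final class ExplicitRateSketch {
    // The RM cell starts with the sender's requested rate. Every switch on the
    // path may lower the ER value to what it can support, but never raises it.
    static int erAfterPath(int requestedRate, int[] supportableRates) {
        int er = requestedRate;
        for (int rate : supportableRates) {
            er = Math.min(er, rate);
        }
        return er;
    }

    public static void main(String[] args) {
        // Made-up rates for three switches on the path.
        System.out.println(erAfterPath(1000, new int[] {800, 300, 500}));
    }
}
```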


TCP Congestion Control
- end-end control (no network assistance)
- transmission rate limited by congestion window size, Congwin, over segments
- Congwin dynamically modified to reflect perceived congestion

[Figure: Congwin spanning a window of segments]

w segments, each with MSS bytes, sent in one RTT:

$$\text{throughput} = \frac{w \cdot \text{MSS}}{\text{RTT}} \ \text{bytes/sec}$$
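As a quick numeric check of the throughput formula, here is a small Java sketch. The class and method names are hypothetical. The values in `main` (MSS = 500 bytes, RTT = 200 msec) are the ones used in the slow-start example later in the chapter, with a window of one segment.

```java
// Illustrative arithmetic for throughput = w * MSS / RTT.
final class WindowThroughputSketch {
    // Bytes per second when w segments of mssBytes each are sent per RTT.
    static double bytesPerSecond(int w, int mssBytes, double rttSeconds) {
        return (double) w * mssBytes / rttSeconds;
    }

    public static void main(String[] args) {
        double bytesPerSec = bytesPerSecond(1, 500, 0.2);
        double kbps = bytesPerSec * 8 / 1000.0; // convert bytes/sec to kilobits/sec
        System.out.println(bytesPerSec + " bytes/sec, about " + kbps + " kbps");
    }
}
```

With one 500-byte segment per 200 msec RTT this works out to 2500 bytes/sec, which matches the 20 kbps initial rate quoted for slow start.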

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

$$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$$

How does sender perceive congestion?
- loss event = timeout or 3 duplicate ACKs
- TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
- AIMD = Additive Increase, Multiplicative Decrease
- slow start = CongWin set to 1 and then grows exponentially
- conservative after timeout events

TCP AIMD

- multiplicative decrease: cut CongWin in half after loss event
- additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
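The sawtooth produced by AIMD can be reproduced with a short Java sketch that applies one update per RTT. The class and method names are hypothetical, and the floor of one MSS after a decrease is an assumption added here so the window never reaches zero.

```java
// Illustrative AIMD update, applied once per RTT.
final class AimdSketch {
    // Multiplicative decrease on a loss event, additive increase otherwise.
    static int nextCongWin(int congWin, int mss, boolean lossEvent) {
        if (lossEvent) {
            return Math.max(mss, congWin / 2); // assumed floor of 1 MSS
        }
        return congWin + mss;
    }

    // CongWin at the start and after each RTT, given which RTTs saw a loss event.
    static int[] trace(int start, int mss, boolean[] lossPerRtt) {
        int[] out = new int[lossPerRtt.length + 1];
        out[0] = start;
        for (int i = 0; i < lossPerRtt.length; i++) {
            out[i + 1] = nextCongWin(out[i], mss, lossPerRtt[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] losses = {false, false, true, false}; // made-up loss pattern
        System.out.println(java.util.Arrays.toString(trace(8000, 1000, losses)));
    }
}
```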

TCP Slow Start

When connection begins, CongWin = 1 MSS
- Example: MSS = 500 bytes & RTT = 200 msec
- initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
- desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event.

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
- double CongWin every RTT
- done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast.

[Figure: Host A / Host B timeline. Host A sends one segment in the first RTT, two segments in the second, four segments in the third.]
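The link between "increment CongWin for every ACK" and "double CongWin every RTT" can be checked with a small Java sketch. It assumes, for illustration only, that every segment sent in an RTT is ACKed individually within that RTT and that no loss occurs. The class and method names are hypothetical, and CongWin is counted in units of MSS.

```java
// Illustrative slow-start growth, with CongWin measured in MSS.
final class SlowStartSketch {
    // CongWin after `rtts` loss-free round trips, starting from 1 MSS.
    static long congWinAfter(int rtts) {
        long congWin = 1;
        for (int r = 0; r < rtts; r++) {
            long acks = congWin;          // one ACK per segment sent in this RTT
            for (long a = 0; a < acks; a++) {
                congWin += 1;             // +1 MSS for every ACK received
            }
        }
        return congWin;
    }

    public static void main(String[] args) {
        for (int rtt = 0; rtt <= 3; rtt++) {
            System.out.println("after " + rtt + " RTTs: " + congWinAfter(rtt) + " MSS");
        }
    }
}
```

Per-ACK increments of one MSS add up to a doubling per RTT, which gives the one, two, four segment pattern in the timeline.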


So far:
  - Slow-Start ramps up exponentially
  - Followed by AIMD sawtooth pattern

Reality (TCP Reno):
  - Introduce new variable threshold
  - threshold initially very large
  - Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
  - Two different types of loss events:
      - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
      - Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
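The four rules in this summary can be sketched at the granularity of one RTT. This is a simplified illustration, not a protocol implementation: the function name `next_window`, the `variant` and `event` labels, and the decision to cap slow-start growth at Threshold are all assumptions made for the example. Windows are measured in MSS.

```python
def next_window(cong_win, threshold, event=None, variant="reno"):
    """Return (cong_win, threshold) in MSS after one RTT or one loss event.

    event is None (no loss during this RTT), "3dup" or "timeout".
    """
    if event is None:
        if cong_win < threshold:
            # Slow start: double per RTT (capped at threshold; an assumption)
            return min(2 * cong_win, threshold), threshold
        # Congestion avoidance: grow by 1 MSS per RTT
        return cong_win + 1, threshold
    # Any loss event: Threshold is set to CongWin/2 (at least 1 MSS)
    threshold = max(cong_win / 2, 1)
    if event == "3dup" and variant == "reno":
        return threshold, threshold      # Reno: continue from Threshold
    return 1, threshold                  # timeout, or any loss in Tahoe


# Trace eight loss-free RTTs starting from CongWin = 1, Threshold = 8
w, th = 1, 8
trace = []
for _ in range(8):
    trace.append(w)
    w, th = next_window(w, th)
print(trace)  # [1, 2, 4, 8, 9, 10, 11, 12]
```

Calling `next_window(12, 8, "3dup", "reno")` returns a window of 6 MSS, whereas the same event with `variant="tahoe"` returns 1 MSS, which is the difference between the two protocols noted above.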


                                                                                                                          The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
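The event table above maps naturally onto a small state machine. The sketch below is one possible reading of it, not a real TCP stack: the class name `RenoSender`, the method names, the MSS value of 1460 bytes and the 64 KB initial Threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed value for illustration


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64 * 1024   # "initially very large"; value assumed
    state: str = "SS"              # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Roughly +1 MSS per RTT, spread over the ACKs of one window
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(threshold=4 * MSS)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win // MSS)  # CA 5
```

The first two duplicate ACKs leave CongWin and Threshold untouched, matching the last row of the table; only the third triggers the multiplicative decrease.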


                                                                                                                          TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  - Ignore slow start
  - Let W be the window size when loss occurs
  - When window is W, throughput is W/RTT
  - Just after loss, window drops to W/2, throughput to W/(2 RTT)
  - Average throughput: 0.75 W/RTT
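The 0.75 factor follows from the window ramping linearly from W/2 back up to W between losses, so the average sits at the midpoint. A minimal numeric sketch, assuming an idealized sawtooth with a constant RTT (the function name is illustrative):

```python
def average_sawtooth_throughput(w, rtt):
    """Average throughput (segments per second) of an idealized AIMD sawtooth."""
    low = (w / 2) / rtt     # throughput just after a loss
    high = w / rtt          # throughput just before the next loss
    return (low + high) / 2 # mean of a linear ramp is its midpoint


w, rtt = 100, 0.1
ratio = average_sawtooth_throughput(w, rtt) / (w / rtt)
print(round(ratio, 2))  # 0.75
```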


                                                                                                                          TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
  - Requires window size W = 83,333 in-flight segments
  - Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

  - This requires L = 2 · 10^-10. Wow!
  - New versions of TCP for high-speed networks are needed
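The two numbers in this example can be reproduced directly. The sketch below assumes the window relation throughput = W · MSS / RTT and inverts the loss-rate formula for L; the variable names are illustrative.

```python
MSS_BITS = 1500 * 8      # 1500-byte segments
RTT = 0.100              # seconds
TARGET = 10e9            # 10 Gbps

# Window needed to sustain the target: throughput = W * MSS / RTT
window = TARGET * RTT / MSS_BITS

# Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to get the loss rate L
loss_rate = (1.22 * MSS_BITS / (RTT * TARGET)) ** 2

print(round(window))        # 83333
print(f"{loss_rate:.1e}")   # 2.1e-10
```

A loss rate of about 2 · 10^-10 means roughly one lost segment in five billion, which is why the slide calls for new TCP versions on high-speed paths.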


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  - Additive increase gives slope of 1 as throughput increases
  - Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
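The convergence toward the equal-share line can be illustrated with a toy simulation. This sketch assumes two flows with the same RTT that both detect loss in the same step whenever their combined rate exceeds the link capacity; the function name and the unit increase per step are assumptions for the example.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Iterate two synchronized AIMD flows sharing one bottleneck link."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # loss: both halve (multiplicative decrease)
        else:
            x1, x2 = x1 + 1, x2 + 1   # no loss: both add 1 (additive increase)
    return x1, x2


a, b = aimd_two_flows(1, 80, capacity=100, steps=1000)
print(abs(a - b) < 1e-3)  # True
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the gap shrinks geometrically and the rates approach an equal share.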


Fairness (more)

Fairness and UDP
  - Multimedia apps often do not use TCP
      - do not want rate throttled by congestion control
  - Instead use UDP:
      - pump audio/video at constant rate, tolerate packet loss
  - Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  - Nothing prevents an app from opening parallel connections between 2 hosts
  - Web browsers do this
  - Example: link of rate R supporting 9 connections
      - new app asks for 1 TCP, gets rate R/10
      - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
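The arithmetic behind this example, assuming the bottleneck is divided equally among all TCP connections regardless of which application owns them (the function name `app_share` is illustrative):

```python
from fractions import Fraction


def app_share(existing, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    new_connections while `existing` connections are already active."""
    return Fraction(new_connections, existing + new_connections)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```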


TCP Latency Modeling

Notation, assumptions:
  - Assume one link between client and server of rate R
  - S: MSS (bits)
  - O: object size (bits)
  - no retransmissions (no loss, no corruption)

Window size:
  - First assume: fixed congestion window, W segments
  - Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  - TCP connection establishment
  - data transmission delay
  - slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object
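Both cases can be folded into one function by checking the sign of the per-window stall time. This is a sketch under the slide's assumptions (one link, no loss); taking K = ceil(O / (W·S)) is an assumption consistent with "number of windows that cover the object", and the function name and sample values are illustrative.

```python
import math


def fixed_window_latency(O, S, R, W, rtt):
    """Latency in seconds to fetch an O-bit object with a fixed window of W segments.

    O and S are in bits, R in bits per second, rtt in seconds.
    """
    stall = S / R + rtt - W * S / R     # server idle time per window, if positive
    if stall <= 0:
        return 2 * rtt + O / R          # case 1: ACKs return before the window is sent
    K = math.ceil(O / (W * S))          # windows that cover the object (assumed form)
    return 2 * rtt + O / R + (K - 1) * stall


# 16,000-bit object, 1,000-bit segments, 1 Mbps link, 100 ms RTT
small = fixed_window_latency(16_000, 1_000, 1e6, W=4, rtt=0.1)    # case 2
large = fixed_window_latency(16_000, 1_000, 1e6, W=200, rtt=0.1)  # case 1
print(round(small, 3), round(large, 3))  # 0.507 0.216
```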


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                                          TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


                                                                                                                          TCP Latency Modeling Slow Start (2)

[Figure: client/server timeline. The client initiates the TCP connection (one RTT) and requests the object; the server then sends a first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R), at which point transmission is complete and the object is delivered. Axes: time at client, time at server.]

Example:
  - O/S = 15 segments
  - K = 4 windows
  - Q = 2
  - P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
  - 2 RTT for connection establishment and request
  - O/R to transmit object
  - time server idles due to slow start

Server idles P = min{K-1, Q} times
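The three delay components can be added up for the example above. The sketch takes K and Q as given inputs and uses the closed-form idle time P[RTT + S/R] - (2^P - 1)S/R from the slow-start latency formula; the function name and the concrete values of S, R and RTT are assumptions, chosen so that S/R < RTT <= 3S/R and hence Q = 2 as in the example.

```python
def slow_start_latency(O, S, R, rtt, K, Q):
    """Latency in seconds under slow start with no threshold and no loss."""
    P = min(K - 1, Q)                       # number of times the server idles
    setup = 2 * rtt                         # connection establishment + request
    transmit = O / R                        # time to transmit the object
    idle = P * (rtt + S / R) - (2 ** P - 1) * S / R
    return setup + transmit + idle


# O/S = 15 segments, K = 4, Q = 2; S/R = 1 ms and RTT = 2 ms are assumed values
latency = slow_start_latency(O=15_000, S=1_000, R=1e6, rtt=0.002, K=4, Q=2)
print(round(latency * 1000, 3), "ms")  # 22.0 ms
```

With these values the latency is 4 ms of setup, 15 ms of transmission and 3 ms of idling (2 ms after the first window, 1 ms after the second).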


                                                                                                                          TCP Latency Modeling (3)

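The time from when the server starts to send a segment until the server receives the acknowledgement is

$$\frac{S}{R} + RTT$$

The time to transmit the kth window is

$$2^{k-1}\frac{S}{R}$$

The idle time after the kth window is

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$$

where $[x]^{+} = \max(x, 0)$. Adding up the delay components:

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timeline showing connection initiation (one RTT), the object request, and the first (S/R), second (2S/R), third (4S/R) and fourth (8S/R) windows until the object is delivered. Axes: time at client, time at server.]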
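The last step of the derivation relies on the geometric sum 1 + 2 + ... + 2^(P-1) = 2^P - 1. A quick numeric check of that step, with illustrative function names and sample values that are not from the slides:

```python
def idle_sum(S, R, rtt, P):
    """Sum of the idle times after windows 1..P, term by term."""
    return sum(S / R + rtt - 2 ** (k - 1) * S / R for k in range(1, P + 1))


def idle_closed_form(S, R, rtt, P):
    """Closed form: P[RTT + S/R] - (2^P - 1) S/R."""
    return P * (rtt + S / R) - (2 ** P - 1) * S / R


S, R, rtt = 1_000, 1e6, 0.1
for P in range(6):
    assert abs(idle_sum(S, R, rtt, P) - idle_closed_form(S, R, rtt, P)) < 1e-12
print("closed form matches the term-by-term sum")
```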


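TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.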
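Both quantities can be computed by direct search and, for K, compared with the closed form. The slides do not spell out Q; the sketch below assumes Q counts the windows k whose idle time S/R + RTT - 2^(k-1) S/R is strictly positive. All function names are illustrative.

```python
import math


def windows_to_cover(O, S):
    """K by direct search: smallest k with S * (2^0 + ... + 2^(k-1)) >= O."""
    k, sent = 0, 0
    while sent < O:
        sent += 2 ** k * S
        k += 1
    return k


def windows_closed_form(O, S):
    """K from the closed form ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


def idle_periods_infinite(S, R, rtt):
    """Q (assumed definition): number of windows followed by a positive idle time."""
    q = 0
    while S / R + rtt - 2 ** q * S / R > 0:
        q += 1
    return q


print(windows_to_cover(15_000, 1_000), windows_closed_form(15_000, 1_000))  # 4 4
print(idle_periods_infinite(1_000, 1e6, 0.002))                             # 2
```

The direct search avoids floating-point edge cases in the logarithm, so it is a useful cross-check when O/S + 1 is close to a power of two.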


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
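The three response-time formulas can be evaluated side by side. This is a rough sketch rather than a full model: the function name is hypothetical, O/R is read as object size over link rate, and the slow-start idle times are passed in as a single number (default 0) instead of being derived.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times in seconds for the three HTTP variants.

    O: object size in bits, R: link rate in bits/sec, RTT: round-trip
    time in seconds, M: number of images, X: parallel connections.
    idle: sum of slow-start idle times (assumed supplied by the caller).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }


if __name__ == "__main__":
    # 5 Kbytes taken as 40,000 bits (assuming 1 Kbyte = 1000 bytes).
    for rate in (28_000, 100_000, 1_000_000, 10_000_000):
        print(rate, http_response_times(40_000, rate, 0.1, M=10, X=5))
```

With idle times left at zero, the three variants differ only in the RTT term, so the gap between them is fixed while the shared transmission term shrinks as R grows.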

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 20 seconds) versus link bandwidth (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 70 seconds) versus link bandwidth (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer, last revised 16/03/05
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat dilemma
• Chapter 3 outline
• TCP Overview: RFCs 793, 1122, 1323, 2018, 2581
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary

TCP seq. #'s and ACKs

Seq. #'s:
  byte stream "number" of first byte in segment's data

ACKs:
  seq # of next byte expected from other side
  cumulative ACK

Q: how receiver handles out-of-order segments
  A: TCP spec doesn't say - up to implementer

[Figure: simple telnet scenario between Host A and Host B, time running downward]
  Host A -> Host B: Seq=42, ACK=79, data = 'C'   (user types 'C')
  Host B -> Host A: Seq=79, ACK=43, data = 'C'   (host ACKs receipt of 'C', echoes back 'C')
  Host A -> Host B: Seq=43, ACK=80               (host ACKs receipt of echoed 'C')
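The numbering in the telnet scenario follows from two rules: a reply's sequence number is the byte the peer said it expects next, and its ACK number is the received sequence number plus the number of data bytes received. The sketch below is a simplified illustration. `Segment` and `reply_to` are hypothetical names, and it assumes in-order delivery with nothing else in flight.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int           # byte-stream number of the first data byte
    ack: int           # next byte expected from the other side
    data: bytes = b""


def reply_to(received, data=b""):
    """Build the reply to an in-order segment (no other data in flight)."""
    return Segment(
        seq=received.ack,                      # send what the peer expects next
        ack=received.seq + len(received.data), # cumulative ACK of bytes received
        data=data,
    )


if __name__ == "__main__":
    typed = Segment(seq=42, ack=79, data=b"C")  # Host A: user types 'C'
    echo = reply_to(typed, data=b"C")           # Host B: ACKs and echoes 'C'
    final = reply_to(echo)                      # Host A: ACKs the echoed 'C'
    print(typed, echo, final, sep="\n")
```

Running it reproduces the three segments of the scenario: Seq=42/ACK=79, Seq=79/ACK=43, Seq=43/ACK=80.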

TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value: α = 0.125
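A small sketch can show how the moving average damps a spike in SampleRTT. The function names are illustrative, and seeding the estimate with the first sample is an assumption made here, not something the slide specifies.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """One EWMA step: EstimatedRTT = (1 - alpha)*EstimatedRTT + alpha*SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def smooth(samples, alpha=0.125):
    """Run the EWMA over a list of SampleRTT values.

    Assumption: the estimate is seeded with the first sample.
    """
    estimate = samples[0]
    history = [estimate]
    for sample in samples[1:]:
        estimate = update_estimated_rtt(estimate, sample, alpha)
        history.append(estimate)
    return history


if __name__ == "__main__":
    # Hypothetical samples in milliseconds with one spike at 180 ms.
    print(smooth([100.0, 180.0, 100.0, 100.0]))
```

With α = 0.125, the 80 ms spike moves the estimate by only 10 ms, and each later sample shrinks the spike's remaining influence by a factor of 0.875.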

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; SampleRTT and EstimatedRTT in milliseconds (100 to 350) plotted against time in seconds (1 to 106)]

TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin

First, estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4*DevRTT
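The two update rules above can be traced numerically with a small sketch. The EstimatedRTT update with weight α = 0.125 is assumed here from the standard TCP estimator (RFC 6298); the class name, the seeding of the first values, and the sample values are illustrative, not part of the slides.

```python
class RttEstimator:
    """Illustrative timeout calculator based on two running averages."""

    def __init__(self, first_sample_ms, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption: seed the estimate with the first sample and the
        # deviation with half of it, as RFC 6298 suggests.
        self.estimated_rtt = first_sample_ms
        self.dev_rtt = first_sample_ms / 2

    def update(self, sample_rtt_ms):
        # DevRTT = (1 - beta) * DevRTT + beta * |SampleRTT - EstimatedRTT|
        # Assumption: the deviation uses the estimate from before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt_ms - self.estimated_rtt))
        # EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt_ms)
        return self.timeout_interval()

    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample_ms=100.0)
steady = est.update(100.0)    # sample equal to the estimate
jittery = est.update(180.0)   # one unusually slow sample
```

A single slow sample raises the timeout through both terms, but mostly through the 4*DevRTT safety margin.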


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
    timeout events
    duplicate acks

Initially consider simplified TCP sender:
    ignore duplicate acks
    ignore flow control, congestion control


TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

Ack rcvd:
    If acknowledges previously unacked segments
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch(event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }
    } /* end of loop forever */

Comment:
    SendBase-1: last cumulatively ACKed byte
Example:
    SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
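The three sender events can be exercised with a minimal sketch. It assumes an in-memory model: the timer is just a flag, "passing to IP" appends to a log, and the class and attribute names are illustrative rather than taken from any real TCP stack.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified sender: no dup ACKs, no flow/congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq # -> payload not yet acknowledged
        self.timer_running = False  # stands in for a real retransmission timer
        self.sent = []              # log of (seq #, length) handed to "IP"

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, len(data)))
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        # Retransmit the not-yet-acknowledged segment with the smallest seq #.
        seq = min(self.unacked)
        self.sent.append((seq, len(self.unacked[seq])))
        self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: every segment that ends at or before y is covered.
            covered = [s for s in self.unacked if s + len(self.unacked[s]) <= y]
            for seq in covered:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


# Cumulative ACK: one ACK=120 covers both outstanding segments.
sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"a" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"b" * 20)    # Seq=100, 20 bytes
sender.on_ack(120)

# Lost ACK: the timeout resends Seq=92 before ACK=100 finally arrives.
lost = SimplifiedTcpSender(initial_seq_num=92)
lost.on_app_data(b"a" * 8)
lost.on_timeout()
lost.on_ack(100)
```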


TCP retransmission scenarios

[Figure: two Host A / Host B timing diagrams.
Lost ACK scenario: A sends Seq=92, 8 bytes data. B replies ACK=100, which is lost (X). The Seq=92 timeout expires and A resends Seq=92, 8 bytes data. B replies ACK=100 again; SendBase = 100.
Premature timeout: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B replies ACK=100 and ACK=120. The Seq=92 timeout expires before the ACKs arrive, so A resends Seq=92, 8 bytes data. B replies ACK=120. SendBase moves from 100 to 120.]


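TCP retransmission scenarios (more)

[Figure: Host A / Host B timing diagram, cumulative ACK scenario. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B's ACK=100 is lost (X), but ACK=120 reaches A before the timeout, so SendBase = 120 and nothing is retransmitted.]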


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap.
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.
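The four event/action pairs can be sketched as a small decision routine. This is an illustrative model only: the class name and return strings are made up, the 500 ms delayed-ACK timer is not simulated, and the handling of an old segment below the expected seq # is an assumption not covered by the table.

```python
class AckPolicy:
    """Toy receiver that picks an ACK action for each arriving segment."""

    def __init__(self, expected_seq):
        self.expected_seq = expected_seq  # next in-order byte expected
        self.ack_pending = False          # a delayed ACK is waiting to be sent
        self.buffered = {}                # out-of-order segments: seq # -> length

    def on_segment(self, seq, length):
        if seq > self.expected_seq:
            # Higher-than-expected seq #: gap detected.
            self.buffered[seq] = length
            return ("duplicate ACK", self.expected_seq)
        if seq < self.expected_seq:
            # Assumption: already-received data is simply re-ACKed.
            return ("duplicate ACK", self.expected_seq)

        # Segment starts exactly at the expected seq #.
        had_gap = bool(self.buffered)
        self.expected_seq += length
        while self.expected_seq in self.buffered:
            self.expected_seq += self.buffered.pop(self.expected_seq)

        if had_gap:
            # Segment starts at the lower end of the gap.
            self.ack_pending = False
            return ("immediate ACK", self.expected_seq)
        if self.ack_pending:
            self.ack_pending = False
            return ("immediate cumulative ACK", self.expected_seq)
        self.ack_pending = True
        return ("delayed ACK", self.expected_seq)


rcvr = AckPolicy(expected_seq=92)
actions = [
    rcvr.on_segment(92, 8),    # in order, nothing pending
    rcvr.on_segment(100, 20),  # in order, one ACK pending
    rcvr.on_segment(140, 20),  # out of order, gap at 120
    rcvr.on_segment(120, 20),  # fills the gap
]
```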


More on Sender Policies

Doubling the Timeout Interval
    Used by most TCP implementations
    If timeout occurs, then, after retransmission, Timeout Interval is doubled
    Intervals grow exponentially with each consecutive timeout
    When Timer is restarted because of (i) new data from above or (ii) ACK received, then Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    Limited form of Congestion Control
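The doubling rule is easy to trace with a short sketch. The function name, the event labels, and the RTT figures below are hypothetical values chosen for illustration.

```python
def next_timeout_interval(current, event, estimated_rtt, dev_rtt):
    """Return the timeout interval to use after the given timer event."""
    if event == "timeout":
        # Consecutive timeouts double the interval (exponential growth).
        return 2 * current
    if event in ("new_data", "ack"):
        # Timer restarted for another reason: fall back to the estimate.
        return estimated_rtt + 4 * dev_rtt
    raise ValueError(f"unknown event: {event!r}")


estimated_rtt, dev_rtt = 100.0, 10.0       # milliseconds, illustrative
interval = estimated_rtt + 4 * dev_rtt     # starting interval
history = []
for event in ["timeout", "timeout", "timeout", "ack"]:
    interval = next_timeout_interval(interval, event, estimated_rtt, dev_rtt)
    history.append(interval)
```

Three back-to-back timeouts stretch the interval to eight times its starting value; the first ACK brings it back to the estimate-based value.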


Fast Retransmit

Time-out period often relatively long:
    long delay before resending lost packet

Detect lost segments via duplicate ACKs
    Sender often sends many segments back-to-back
    If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that segment after ACKed data was lost:
    fast retransmit: resend segment before timer expires


Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {    /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y == 3)
                resend segment with sequence number y    /* fast retransmit */
        }
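A minimal sketch of the ACK handler above, tracking only SendBase and the duplicate-ACK counter. Timer handling is omitted, resetting the counter when new data is ACKed is an assumption, and the class and attribute names are illustrative.

```python
class FastRetransmitSender:
    """Toy ACK handler that triggers fast retransmit on the third duplicate ACK."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0          # duplicate ACKs seen for the current send_base
        self.retransmitted = []    # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged.
            self.send_base = y
            self.dup_acks = 0      # assumption: the count restarts for the new y
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                # Fast retransmit: resend the segment starting at y.
                self.retransmitted.append(y)


s = FastRetransmitSender(send_base=92)
for ack in [100, 100, 100]:
    s.on_ack(ack)                  # one new ACK, then two duplicates
after_two_dups = list(s.retransmitted)
s.on_ack(100)                      # third duplicate triggers the resend
s.on_ack(100)                      # a fourth duplicate does not resend again
```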


TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
    This looks a lot like Selective Repeat

TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose was to handle congestion in network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide.
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)]

URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence number, acknowledgement number: counting by bytes of data (not segments)


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
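The window arithmetic can be checked with a short sketch. The sender-side names last_byte_sent and last_byte_acked do not appear on the slide and are assumed here to describe the unACKed data; all byte counts are illustrative.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window):
    """Bytes the sender may still transmit while keeping unACKed data within the window."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, window - in_flight)


# Receiver: 4096-byte buffer, 3000 bytes received so far, app has read 1000.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender: 500 bytes are in flight (sent but not yet ACKed).
allowance = sender_may_send(last_byte_sent=3500, last_byte_acked=3000, window=window)

# Once the unread data fills the buffer, the advertised window is zero.
full = rcv_window(rcv_buffer=4096, last_byte_rcvd=5096, last_byte_read=1000)
```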


Technical Issue

Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0 since it has no new ACKs to send to Sender
DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender.
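The probing idea can be illustrated with a toy model. It assumes that each one-byte probe is answered by an ACK carrying the current RcvWindow, that the probe byte itself is discarded while the buffer is full, and that the receiving application reads a given number of bytes before each answer; the function and parameter names are hypothetical.

```python
def probe_zero_window(rcv_buffer, buffered, app_reads):
    """Send one-byte probes until an ACK reports RcvWindow > 0.

    app_reads lists how many bytes the receiving app reads before
    the ACK for each successive probe is generated.
    Returns (number of probes sent, window finally reported).
    """
    probes = 0
    for bytes_read in app_reads:
        probes += 1                                # sender transmits a probe
        buffered = max(0, buffered - bytes_read)   # app drains the buffer
        window = rcv_buffer - buffered             # value carried by the ACK
        if window > 0:
            return probes, window
    return probes, 0


# The buffer is full; the app reads nothing for two probe rounds, then 400 bytes.
result = probe_zero_window(rcv_buffer=1000, buffered=1000, app_reads=[0, 0, 400])
```

Without the probes, the sender would never learn that the third round freed 400 bytes.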


Note on UDP

UDP has no flow control
UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

Message sequence:
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
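The three handshake messages can be traced with a small sketch. This is only an illustration of how the seq and ack fields relate to each other; the class, the method names, and the ISN values 1000 and 5000 are made up for the example and are not part of any real TCP API.

```java
// Illustrative sketch of the seq/ack values in the three-way handshake.
// All names here are hypothetical; real TCP stacks do this in the kernel.
final class HandshakeSketch {
    // TCP sequence numbers are 32-bit values and wrap around.
    static final long SEQ_MODULUS = 1L << 32;

    /** Control segment reduced to the fields the handshake uses. */
    static final class Segment {
        final boolean syn;      // SYN flag
        final long seq;         // sequence number
        final boolean ackValid; // whether the ack field carries a value
        final long ack;         // acknowledgement number

        Segment(boolean syn, long seq, boolean ackValid, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ackValid = ackValid;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " seq=" + seq
                    + (ackValid ? " ack=" + ack : "");
        }
    }

    static long next(long n) {
        return (n + 1) % SEQ_MODULUS;
    }

    // Step 1: client sends SYN=1, seq=client_isn.
    static Segment connectionRequest(long clientIsn) {
        return new Segment(true, clientIsn, false, 0);
    }

    // Step 2: server replies SYN=1, seq=server_isn, ack=client_isn+1.
    static Segment connectionGranted(Segment request, long serverIsn) {
        return new Segment(true, serverIsn, true, next(request.seq));
    }

    // Step 3: client replies SYN=0, seq=client_isn+1, ack=server_isn+1.
    static Segment finalAck(Segment granted) {
        return new Segment(false, granted.ack, true, next(granted.seq));
    }

    public static void main(String[] args) {
        Segment request = connectionRequest(1000); // assumed client_isn
        Segment granted = connectionGranted(request, 5000); // assumed server_isn
        Segment ack = finalAck(granted);
        System.out.println(request);
        System.out.println(granted);
        System.out.println(ack);
    }
}
```

Each reply acknowledges the other side's ISN plus one, which is how both ends confirm they agree on the starting sequence numbers.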

TCP Connection Management (cont.)

Closing a connection

Client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server.

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline for closing a connection. Client sends FIN; server replies with ACK, then sends its own FIN; client replies with ACK and enters timed wait before closing.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with an ACK to any received FIN (one might arrive if the ACK gets lost).
  Closes down after timed wait.

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

[Figure: the same close timeline, with both sides shown as closing, the client in timed wait, and both ending closed.]
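The client side of the four close steps can be sketched as a small state machine. This is a simplified illustration: the state names FIN_WAIT_1 and FIN_WAIT_2 are borrowed from the TCP specification rather than from these slides, the event names are invented for the example, and simultaneous close and a combined ACK+FIN segment are deliberately left out.

```java
// Hypothetical sketch of the client-side close sequence described above.
final class CloseSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIMED_WAIT, CLOSED }

    enum Event { APP_CLOSE, RECV_ACK, RECV_FIN, TIMED_WAIT_EXPIRES }

    /** Returns the next client state; events that do not apply are ignored. */
    static State next(State state, Event event) {
        switch (state) {
            case ESTABLISHED:
                // Step 1: application calls close(), client sends FIN.
                if (event == Event.APP_CLOSE) return State.FIN_WAIT_1;
                break;
            case FIN_WAIT_1:
                // Step 2 (first half): server's ACK of our FIN arrives.
                if (event == Event.RECV_ACK) return State.FIN_WAIT_2;
                break;
            case FIN_WAIT_2:
                // Step 3: server's FIN arrives, client sends ACK.
                if (event == Event.RECV_FIN) return State.TIMED_WAIT;
                break;
            case TIMED_WAIT:
                // A retransmitted FIN is ACKed again; the state is unchanged.
                if (event == Event.RECV_FIN) return State.TIMED_WAIT;
                if (event == Event.TIMED_WAIT_EXPIRES) return State.CLOSED;
                break;
            default:
                break;
        }
        return state;
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] trace = {
            Event.APP_CLOSE, Event.RECV_ACK, Event.RECV_FIN,
            Event.RECV_FIN, Event.TIMED_WAIT_EXPIRES
        };
        for (Event e : trace) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```

The second RECV_FIN in the trace models the case where the client's ACK was lost and the server retransmits its FIN, which is the reason for the timed wait.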

TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for the network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

Results:
  large delays when congested
  maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) and (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
($\lambda_{in}$: rate of original data; $\lambda'_{in}$: offered load, i.e. original data plus retransmissions)

(a) Magic transmission: only send when there's space in the buffer
(b) "Perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) Retransmission of delayed (not lost) packets makes $\lambda'_{in}$ larger (than the perfect case) for the same $\lambda_{out}$

"Costs" of congestion:
  (b) and (c): more work (retransmissions) for a given "goodput"
  (c): unneeded retransmissions, so the link carries multiple copies of a packet

[Figure: panels (a), (b), (c), one per case above]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path is "underloaded": sender should use available bandwidth
  if sender's path is congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
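How the ER field and the CI bit end up carrying path-wide information can be sketched as follows. This is a toy model, not ATM code: the class, the method names, and the rate values are assumptions made for the example, and rates are in arbitrary units.

```java
// Hypothetical model of the feedback carried back in an ABR RM cell.
final class AbrSketch {
    /**
     * Each switch on the path may lower the ER field to the rate it can
     * support, so the returned value is the minimum along the path.
     */
    static double explicitRateSeenBySender(double initialEr, double[] switchRates) {
        double er = initialEr;
        for (double supported : switchRates) {
            er = Math.min(er, supported);
        }
        return er;
    }

    /**
     * The destination sets CI=1 if the data cell preceding the RM cell
     * had EFCI=1; a CI bit already set by a switch is kept.
     */
    static boolean ciReturnedToSource(boolean ciSetBySwitch, boolean precedingCellEfci) {
        return ciSetBySwitch || precedingCellEfci;
    }

    public static void main(String[] args) {
        double er = explicitRateSeenBySender(10.0, new double[] {8.0, 3.5, 6.0});
        System.out.println("ER returned to sender: " + er);
        System.out.println("CI returned to source: " + ciReturnedToSource(false, true));
    }
}
```

Because every switch can only lower the field, the sender never learns a rate higher than the tightest hop can sustain.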

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments in flight]

w segments, each with MSS bytes, sent in one RTT:

  throughput = (w * MSS) / RTT  bytes/sec
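The formula is easy to check numerically. The sketch below is a direct transcription; the class and method names are made up, and the sample values (one segment of 500 bytes, RTT of 200 ms) are illustrative.

```java
// Direct transcription of throughput = (w * MSS) / RTT.
final class WindowThroughput {
    /** Throughput in bytes per second for w segments of mssBytes per RTT. */
    static double bytesPerSecond(int w, int mssBytes, double rttSeconds) {
        return w * (double) mssBytes / rttSeconds;
    }

    /** Same quantity expressed in bits per second. */
    static double bitsPerSecond(int w, int mssBytes, double rttSeconds) {
        return 8.0 * bytesPerSecond(w, mssBytes, rttSeconds);
    }

    public static void main(String[] args) {
        // One 500-byte segment per 200 ms RTT.
        System.out.println(bytesPerSecond(1, 500, 0.2) + " bytes/sec");
        System.out.println(bitsPerSecond(1, 500, 0.2) / 1000.0 + " kbps");
    }
}
```

With these sample values the rate works out to about 2500 bytes/sec, i.e. about 20 kbps; doubling w doubles the rate for the same RTT.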

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked <= CongWin

How does the sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after a loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
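The sender-side limit can be written as a simple check before each transmission. This is a minimal sketch under the same simplifying assumption as the text (RcvBuffer never overflows, so only CongWin matters); the class, method names, and byte counts are hypothetical.

```java
// Hypothetical check of LastByteSent - LastByteAcked <= CongWin.
final class SendWindow {
    /** Bytes sent but not yet acknowledged. */
    static long bytesInFlight(long lastByteSent, long lastByteAcked) {
        return lastByteSent - lastByteAcked;
    }

    /** True if sending segmentBytes more would keep the sender within CongWin. */
    static boolean maySend(long lastByteSent, long lastByteAcked,
                           long congWin, long segmentBytes) {
        return bytesInFlight(lastByteSent, lastByteAcked) + segmentBytes <= congWin;
    }

    public static void main(String[] args) {
        // 2000 bytes in flight, window of 4000: one more 1000-byte segment fits.
        System.out.println(maySend(3000, 1000, 4000, 1000));
        // Same bytes in flight, window cut to 2500: the segment must wait.
        System.out.println(maySend(3000, 1000, 2500, 1000));
    }
}
```

Shrinking CongWin after a loss event therefore throttles the sender without any explicit signal from the network.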

TCP AIMD

multiplicative decrease: cut CongWin in half after a loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]

TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline. Host A sends one segment in the first RTT, two segments in the second, four segments in the third.]
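Why a per-ACK increment doubles the window each RTT can be seen in a short simulation. The sketch assumes no losses and one ACK per segment (no delayed ACKs), and counts CongWin in units of MSS; the class and method names are invented for the example.

```java
// Hypothetical loss-free slow-start simulation, CongWin in units of MSS.
final class SlowStartSketch {
    /** CongWin after the given number of RTTs, starting from 1 MSS. */
    static int congWinAfter(int rtts) {
        int congWin = 1;
        for (int r = 0; r < rtts; r++) {
            // All segments sent in this RTT are ACKed, one ACK each.
            int acksThisRtt = congWin;
            for (int a = 0; a < acksThisRtt; a++) {
                congWin += 1; // +1 MSS per ACK received
            }
        }
        return congWin;
    }

    public static void main(String[] args) {
        for (int r = 0; r <= 3; r++) {
            System.out.println("after " + r + " RTT(s): " + congWinAfter(r) + " MSS");
        }
    }
}
```

A window of n segments produces n ACKs, and each ACK adds one MSS, so the window goes from n to 2n within one RTT.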

So far:
  Slow-Start ramps up exponentially
  followed by AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce new variable, threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout occurring before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                            The Big Picture

                                                                                                                            3 Transport Layer 112Comp 361 Spring 2005

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
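The event table above can be read as a small state machine. The following is a minimal Python sketch of it, not a real TCP implementation: the class name `TcpSender`, the MSS value, the initial Threshold, and the choice to reset the duplicate ACK counter on every new ACK are all assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed value, not given in the table


@dataclass
class TcpSender:
    cong_win: float = MSS      # CongWin, in bytes
    threshold: float = 65535   # Threshold, in bytes; assumed initial value
    state: str = "SS"          # "SS" (Slow Start) or "CA" (Congestion Avoidance)
    dup_acks: int = 0          # duplicate ACK count for the segment being acked

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: the count restarts on a new ACK
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self) -> None:
        """Duplicate ACK: only the counter changes, until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self._on_triple_duplicate_ack()

    def _on_triple_duplicate_ack(self) -> None:
        """Loss event detected by triple duplicate ACK."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self) -> None:
        """Timeout: fall back to slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Driving the object with a sequence of `on_new_ack()`, `on_duplicate_ack()` and `on_timeout()` calls reproduces the rows of the table one event at a time.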


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2 RTT).
The window then grows linearly from W/2 back to W, so the average lies midway between the two.
Average throughput: 0.75 W/RTT
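The 0.75 W/RTT figure can be checked numerically. The sketch below assumes an idealized sawtooth in which the window grows by exactly one segment per RTT from W/2 up to W; the function name is hypothetical.

```python
def average_sawtooth_throughput(w_max: int, rtt: float) -> float:
    """Average throughput (segments per second) over one cycle in which
    the window grows by one segment per RTT from w_max/2 up to w_max."""
    windows = range(w_max // 2, w_max + 1)  # one window value per RTT
    mean_window = sum(windows) / len(windows)
    return mean_window / rtt
```

For an even `w_max`, the mean window is the midpoint of W/2 and W, which is where the factor 0.75 comes from.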


TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

➜ L = 2 · 10^-10. Wow!
New versions of TCP for high-speed needed!
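The two numbers in this example follow from simple arithmetic, sketched below. The function names are hypothetical, and the loss rate is obtained by solving the throughput formula above for L.

```python
def required_window(throughput_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """In-flight segments needed to sustain the throughput: W = throughput * RTT / MSS."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(throughput_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.100, 1500)       # about 83,333 segments
loss = required_loss_rate(10e9, 0.100, 1500)  # about 2e-10
```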


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 pass through a bottleneck router of capacity R.]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
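The convergence toward the equal-share line can be illustrated with a toy simulation. This is a sketch under strong assumptions that are not stated on the slide: both sessions have the same RTT, both add one unit per round, and both see a loss in the same round whenever their combined rate exceeds the capacity. The function name is hypothetical.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int):
    """Simulate two AIMD flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves both rates,
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase keeps the gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


x1, x2 = aimd_two_flows(80.0, 10.0, capacity=100.0, rounds=500)
```

Because every loss halves the difference `x1 - x2` while additive increase leaves it untouched, the two rates end up nearly equal regardless of the starting point.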


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
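The arithmetic behind the parallel-connection example is sketched below, assuming every TCP connection on the link ends up with an equal share of R. The function name is hypothetical.

```python
from fractions import Fraction


def bandwidth_share(existing: int, opened: int) -> Fraction:
    """Fraction of the link rate R obtained by an app that opens `opened`
    connections alongside `existing` ones, assuming equal per-connection shares."""
    return Fraction(opened, existing + opened)


one = bandwidth_share(9, 1)      # 1/10 of R
eleven = bandwidth_share(9, 11)  # 11/20 of R, slightly more than R/2
```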


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Here K is the number of windows that cover the object, i.e. O/(WS) rounded up.


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
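The two fixed-window cases can be combined into one function, sketched here in Python. The function name is hypothetical, and K is taken to be the number of windows that cover the object, computed as O/(WS) rounded up; the slides on this case do not spell that formula out, so treat it as an assumption.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for an object of O bits with a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    stall = S / R + RTT - W * S / R  # idle time after each window, if positive
    if stall <= 0:
        # Case 1: the ACK returns before the window is exhausted; the server never idles.
        return 2 * RTT + O / R
    # Case 2: the server idles after each of the first K-1 windows.
    K = math.ceil(O / (W * S))
    return 2 * RTT + O / R + (K - 1) * stall
```

For instance, with S/R = 1 s, RTT = 2 s and an 8-segment object, W = 2 falls in the second case while W = 4 falls in the first.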


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline ("time at client", "time at server"): initiate TCP connection, request object (one RTT each), first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
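The values K = 4, Q = 2 and P = 2 in the example can be reproduced by counting directly from the definitions. The sketch below uses hypothetical function names, and it assumes RTT = 2 S/R, a ratio the slide does not state but which is consistent with Q = 2.

```python
def windows_to_cover(num_segments: int) -> int:
    """K: number of slow-start windows (1, 2, 4, ... segments) needed to cover the object."""
    k, sent = 0, 0
    while sent < num_segments:
        sent += 2 ** k
        k += 1
    return k


def idles_for_infinite_object(S: float, R: float, RTT: float) -> int:
    """Q: number of windows after which the server would idle if the object never ended.

    The server idles after the k-th window while S/R + RTT - 2^(k-1) * S/R > 0.
    """
    q = 0
    while S / R + RTT - (2 ** q) * S / R > 0:
        q += 1
    return q


K = windows_to_cover(15)
Q = idles_for_infinite_object(S=1000, R=1000, RTT=2.0)  # assumed RTT = 2 * S/R
P = min(K - 1, Q)
```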


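TCP Latency Modeling (3)

\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: client/server timeline ("time at client", "time at server"): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]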
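The last step of the derivation replaces the sum of idle times by a closed form, using the fact that 1 + 2 + ... + 2^(P-1) = 2^P - 1. A minimal numerical check is sketched below; the function names are hypothetical and P is passed in as a given.

```python
def idle_time_sum(S: float, R: float, RTT: float, P: int) -> float:
    """Sum of the idle times after windows 1..P, term by term."""
    return sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))


def slow_start_latency(O: float, S: float, R: float, RTT: float, P: int) -> float:
    """Closed-form delay: 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
```

With S/R = 1 s, RTT = 2 s, a 15-segment object and P = 2, the idle times are 2 s and 1 s, and both forms give the same total delay.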


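TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, number of idles for infinite-size object, is similar.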
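As a sanity check on the closed form for K, the sketch below compares it with a direct search over the first line of the derivation. The function names are hypothetical, and the check relies on floating-point `log2`, which is adequate for the small object sizes used here.

```python
import math


def k_closed_form(O: float, S: float) -> int:
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


def k_by_search(O: float, S: float) -> int:
    """Smallest k such that 2^0 S + 2^1 S + ... + 2^(k-1) S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S
        k += 1
    return k


# Compare both methods for objects of 1 to 200 segments of 1000 bits each.
agree = all(k_closed_form(n * 1000, 1000) == k_by_search(n * 1000, 1000)
            for n in range(1, 201))
```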


HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
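The three response-time formulas can be compared side by side with a few lines of Python. This is only a sketch: the function names are hypothetical, and the "sum of idle times" term is left as a caller-supplied parameter (defaulting to zero) because it depends on the slow-start behaviour of each connection.

```python
def non_persistent(O, R, RTT, M, idle=0.0):
    """M+1 TCP connections in series."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(O, R, RTT, M, idle=0.0):
    """2 RTT for the base file, 1 RTT for the M images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(O, R, RTT, M, X, idle=0.0):
    """1 connection for the base file, then M/X sets of X parallel connections."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


# Illustrative inputs: O = 5 Kbytes taken as 40,000 bits, 28 Kbps link, RTT = 100 msec.
params = dict(O=40_000, R=28_000, RTT=0.1, M=10)
times = (non_persistent(**params), persistent(**params),
         parallel_non_persistent(X=5, **params))
```

With idle times ignored, the transmission term (M+1)O/R is identical in all three cases, so the differences come entirely from the number of RTTs spent on connection setup and requests.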


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (axis from 0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (axis from 0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary


TCP Round Trip Time and Timeout

Q: how to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions
  SampleRTT will vary, want estimated RTT "smoother"
    average several recent measurements, not just current SampleRTT

Q: how to set TCP timeout value?
  longer than RTT
    but RTT varies
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss


TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

Exponential weighted moving average
  influence of past sample decreases exponentially fast
  typical value: α = 0.125
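The EWMA update can be sketched in a few lines of Python. This is a minimal illustration, not TCP's actual implementation: the function name, the sample values, and the choice to seed the estimate with the first sample are all assumptions made for the example.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """Return the new EstimatedRTT after folding in one SampleRTT (EWMA)."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


# Hypothetical SampleRTT measurements in milliseconds (not from the slides).
samples = [100.0, 120.0, 300.0, 110.0]

# Assumption: seed the estimate with the first sample.
estimated = samples[0]
for sample in samples[1:]:
    estimated = update_estimated_rtt(estimated, sample)
    print(f"SampleRTT={sample:.1f} ms  EstimatedRTT={estimated:.4f} ms")
```

The 300 ms outlier moves the estimate from 102.5 ms to only about 127.2 ms, which is the "smoother" behaviour the slide asks for.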


Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; curves: SampleRTT and EstimatedRTT.]


TCP Round Trip Time and Timeout

Setting the timeout
  EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
  first estimate of how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4 * DevRTT
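The three formulas (EstimatedRTT, DevRTT, TimeoutInterval) fit together as in the Python sketch below. The class name `RttEstimator` and the sample values are hypothetical. The slides do not say how to initialise the estimates or in which order to apply the two updates, so the sketch assumes the RFC 6298 conventions: seed EstimatedRTT with the first sample and DevRTT with half of it, and update DevRTT using the EstimatedRTT from before the current sample.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption (RFC 6298 convention): seed from the first measurement.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        # Assumption: DevRTT uses the EstimatedRTT from before this sample.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        # EstimatedRTT plus a safety margin of four deviations.
        return self.estimated_rtt + 4 * self.dev_rtt


# Hypothetical SampleRTT values in milliseconds (not from the slides).
est = RttEstimator(first_sample=100.0)
for sample in [120.0, 90.0]:
    est.update(sample)
    print(f"EstimatedRTT={est.estimated_rtt:.4f}  DevRTT={est.dev_rtt:.4f}  "
          f"TimeoutInterval={est.timeout_interval:.4f}")
```

With these inputs the timeout falls from 272.5 ms to about 240.9 ms as the samples settle, showing how a shrinking DevRTT tightens the safety margin.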


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control


TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }

} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
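The event loop above can be turned into a small runnable Python sketch. Everything named here (`SimplifiedTcpSender`, the `sent` log, the boolean `timer_running` flag) is hypothetical scaffolding for illustration. The sketch assumes that a real timer and the IP layer are replaced by simple bookkeeping, that the timer stops once nothing is outstanding, and that sequence-number wraparound can be ignored.

```python
class SimplifiedTcpSender:
    """Event handlers for the simplified sender: no duplicate-ACK handling,
    no flow control, no congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq number -> data of segments in flight
        self.timer_running = False   # stand-in for a real retransmission timer
        self.sent = []               # log of (seq, data) "passed to IP"

    def on_app_data(self, data):
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True      # start timer
        self.sent.append((seq, data))      # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout."""
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)            # smallest unacked sequence number
        self.sent.append((seq, self.unacked[seq]))  # retransmit
        self.timer_running = True          # (re)start timer

    def on_ack(self, y):
        """Event: ACK received, with ACK field value of y."""
        if y > self.send_base:
            self.send_base = y
            # Keep only segments not fully covered by the cumulative ACK.
            self.unacked = {s: d for s, d in self.unacked.items()
                            if s + len(d) > y}
            # Timer keeps running only if segments are still outstanding.
            self.timer_running = bool(self.unacked)


# Two segments, a timeout on the oldest one, then two cumulative ACKs.
sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"A" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"B" * 20)   # Seq=100, 20 bytes
sender.on_timeout()             # retransmits Seq=92 only
sender.on_ack(100)
sender.on_ack(120)
print([(seq, len(data)) for seq, data in sender.sent], sender.send_base)
```

Only the oldest segment is retransmitted on timeout, and an ACK that does not advance SendBase is ignored, matching the two simplifications stated for this sender.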


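TCP retransmission scenarios

[Figure: timing diagram between Host A and Host B, premature timeout scenario. Host A sends Seq=92 (8 bytes data) and Seq=100 (20 bytes data); Host B replies with ACK=100 and ACK=120. The Seq=92 timeout fires at Host A, which retransmits Seq=92 (8 bytes data); Host B replies with ACK=120.]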

                                                                                                                              Host A

                                                                                                                              Seq=92 8 bytes data

                                                                                                                              ACK=100

                                                                                                                              loss

                                                                                                                              tim

                                                                                                                              eout

                                                                                                                              lost ACK scenario

                                                                                                                              Host B

                                                                                                                              X

                                                                                                                              Seq=92 8 bytes data

                                                                                                                              ACK=100

                                                                                                                              time

                                                                                                                              SendBase= 120

                                                                                                                              SendBase= 120

                                                                                                                              Sendbase= 100

                                                                                                                              Seq=

                                                                                                                              92 t

                                                                                                                              imeo

                                                                                                                              utSendBase

                                                                                                                              = 100


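TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline]

Cumulative ACK scenario:
  1. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data.
  2. B replies ACK=100; this ACK is lost (X).
  3. B replies ACK=120, which reaches A before the timeout.
  4. The cumulative ACK=120 covers both segments, so nothing is retransmitted; SendBase = 120.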


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

1. Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
   -> Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK.

2. Arrival of in-order segment with expected seq #; one other segment has ACK pending
   -> Immediately send single cumulative ACK, ACKing both in-order segments.

3. Arrival of out-of-order segment with higher-than-expected seq #; gap detected
   -> Immediately send duplicate ACK, indicating seq # of next expected byte.

4. Arrival of segment that partially or completely fills gap
   -> Immediately send ACK, provided that segment starts at lower end of gap.
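The four rows can be read as a decision function on the arriving segment. The Java sketch below is one possible rendering; the class, enum and parameter names are hypothetical. The case of a segment below the expected sequence number is not in the table, and re-ACKing it immediately is an assumption.

```java
// Sketch of the receiver's ACK policy table. All names are hypothetical.
class AckPolicy {
    enum Action { DELAYED_ACK, CUMULATIVE_ACK_NOW, DUPLICATE_ACK_NOW, ACK_NOW }

    /**
     * segSeq      sequence number of the arriving segment
     * expectedSeq next in-order byte the receiver expects
     * ackPending  true if an earlier in-order segment still awaits its delayed ACK
     * gapExists   true if out-of-order data is already buffered above expectedSeq
     */
    static Action onSegment(int segSeq, int expectedSeq, boolean ackPending, boolean gapExists) {
        if (segSeq > expectedSeq) return Action.DUPLICATE_ACK_NOW;  // row 3: gap detected
        if (segSeq < expectedSeq) return Action.DUPLICATE_ACK_NOW;  // assumption: re-ACK old data
        if (gapExists) return Action.ACK_NOW;                       // row 4: starts at lower end of gap
        if (ackPending) return Action.CUMULATIVE_ACK_NOW;           // row 2: ACK both segments
        return Action.DELAYED_ACK;                                  // row 1: wait up to 500 ms
    }

    public static void main(String[] args) {
        System.out.println(onSegment(100, 100, false, false));  // in order, nothing pending
        System.out.println(onSegment(120, 100, false, false));  // arrives above a gap
    }
}
```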


                                                                                                                              More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after the retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of congestion control
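A minimal Java sketch of this policy follows. The class and method names are hypothetical, and the reset value uses the standard formula TimeoutInterval = EstimatedRTT + 4 * DevRTT, which is assumed here to be the one "described previously".

```java
// Sketch of timeout doubling. Names are hypothetical; times are in milliseconds.
class RetransmitTimer {
    private final double estimatedRtt;
    private final double devRtt;
    double timeoutInterval;

    RetransmitTimer(double estimatedRtt, double devRtt) {
        this.estimatedRtt = estimatedRtt;
        this.devRtt = devRtt;
        this.timeoutInterval = baseInterval();
    }

    // Assumed reset formula: EstimatedRTT + 4 * DevRTT
    double baseInterval() {
        return estimatedRtt + 4 * devRtt;
    }

    // After a timeout and retransmission, the interval is doubled
    void onTimeout() {
        timeoutInterval *= 2;
    }

    // Timer restarted by new data from above or by a received ACK: interval is reset
    void onNewDataOrAck() {
        timeoutInterval = baseInterval();
    }

    public static void main(String[] args) {
        RetransmitTimer t = new RetransmitTimer(100, 25);
        for (int i = 0; i < 3; i++) {
            t.onTimeout();
            System.out.println("after timeout " + (i + 1) + ": " + t.timeoutInterval + " ms");
        }
        t.onNewDataOrAck();
        System.out.println("after reset: " + t.timeoutInterval + " ms");
    }
}
```

Starting from 200 ms, three consecutive timeouts give 400, 800 and 1600 ms, which is the exponential growth the slide refers to.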


                                                                                                                              Fast Retransmit

• Time-out period is often relatively long:
    • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
    • Sender often sends many segments back-to-back
    • If a segment is lost, there will likely be many duplicate ACKs
• If the sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    • fast retransmit: resend segment before timer expires


                                                                                                                              Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase)
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    else    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3)
            resend segment with sequence number y    /* fast retransmit */
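The duplicate-ACK branch can be sketched in Java as below. The names are hypothetical; keeping a single counter for the current SendBase and ignoring ACKs below SendBase are simplifying assumptions, and timer handling is left out.

```java
// Sketch of fast retransmit on the third duplicate ACK. All names are hypothetical.
class FastRetransmitSender {
    int sendBase;     // oldest not-yet-acknowledged byte
    int dupAckCount;  // duplicate ACKs seen for the current sendBase

    FastRetransmitSender(int sendBase) {
        this.sendBase = sendBase;
    }

    // Handles an ACK with field value y.
    // Returns the sequence number to resend, or -1 if nothing is resent.
    int onAck(int y) {
        if (y > sendBase) {   // new data is acknowledged
            sendBase = y;
            dupAckCount = 0;
            return -1;
        }
        if (y < sendBase) {   // assumption: stale ACKs are ignored
            return -1;
        }
        dupAckCount++;        // a duplicate ACK for already ACKed data
        return dupAckCount == 3 ? y : -1;  // fast retransmit on the third duplicate
    }

    public static void main(String[] args) {
        FastRetransmitSender s = new FastRetransmitSender(100);
        for (int i = 1; i <= 3; i++) {
            System.out.println("dup ACK " + i + " -> resend " + s.onAck(100));
        }
    }
}
```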


TCP: GBN or Selective Repeat?

                                                                                                                              Basic TCP looks a lot like GBN

                                                                                                                              Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                              This looks a lot like Selective Repeat

                                                                                                                              TCP is a hybrid


                                                                                                                              TCP Flow Control

• Sender should not overwhelm the receiver's capacity to receive data
• If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate
• Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom]
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Annotations:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
• sequence and acknowledgement numbers: counting by bytes of data (not segments)


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
    • guarantees receive buffer doesn't overflow
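The window arithmetic can be checked with a small Java sketch. The class and method names are hypothetical, and the sender-side counters `lastByteSent` and `lastByteAcked` are assumed names for "unACKed data" that do not appear on the slide.

```java
// Sketch of the receive-window computation and the sender-side limit. Names are hypothetical.
class FlowControl {
    // Receiver side: RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
    static int rcvWindow(int rcvBuffer, int lastByteRcvd, int lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    // Sender side: sending n more bytes is allowed only if unACKed data stays within RcvWindow
    static boolean maySend(int lastByteSent, int lastByteAcked, int n, int rcvWindow) {
        int unAcked = lastByteSent - lastByteAcked;
        return unAcked + n <= rcvWindow;
    }

    public static void main(String[] args) {
        // 4096-byte buffer; 3000 bytes received, of which the app has read 1000
        int window = rcvWindow(4096, 3000, 1000);
        System.out.println("RcvWindow = " + window);
        System.out.println("may send 1000 more: " + maySend(3000, 2000, 1000, window));
        System.out.println("may send 1500 more: " + maySend(3000, 2000, 1500, window));
    }
}
```

Here 2000 bytes are still sitting in the buffer, so the advertised window is 2096 bytes; with 1000 bytes already unACKed, 1000 more fit but 1500 do not.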


                                                                                                                              Technical Issue

• Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
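A toy Java simulation shows how the one-byte probes break the deadlock. It is only a sketch under simplifying assumptions: the names and parameters are hypothetical, the buffer starts full, each probe is answered by an ACK that carries the current window, and the application starts draining the buffer at a chosen probe round.

```java
// Toy simulation of zero-window probing. Names and parameters are hypothetical.
class ZeroWindowProbe {
    // Returns the number of one-byte probes sent before the sender learns RcvWindow > 0,
    // or -1 if the window is still closed after maxProbes probes.
    static int probesUntilOpen(int rcvBuffer, int appStartsReadingAtProbe,
                               int bytesReadPerProbe, int maxProbes) {
        int buffered = rcvBuffer;  // buffer starts full, so RcvWindow = 0
        for (int probe = 1; probe <= maxProbes; probe++) {
            if (probe >= appStartsReadingAtProbe) {
                buffered = Math.max(0, buffered - bytesReadPerProbe);  // app drains the buffer
            }
            if (buffered < rcvBuffer) {
                buffered++;  // the probe's single data byte fits and is stored
            }
            int rcvWindow = rcvBuffer - buffered;  // advertised in the ACK of this probe
            if (rcvWindow > 0) {
                return probe;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("window opens at probe " + probesUntilOpen(4096, 3, 1000, 10));
    }
}
```

Without the probes there would be no segment for the receiver to answer, so the new window would never reach the sender.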


                                                                                                                              Note on UDP

                                                                                                                              UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


                                                                                                                              TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
• initialize TCP variables:
    • seq #s
    • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
    • specifies client_isn, the initial seq #
    • no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    • ACKs received SYN
    • allocates buffers
    • replies with client_isn+1 in ACK field to signal synchronization
    • specifies server_isn
    • no application data


                                                                                                                              TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
    • allocates buffers
    • can include application data
    • SYN=0 signals that connection is established
    • server_isn+1 signals synchronization

[Figure: three-way handshake between client and server]
  1. client -> server: Connection request (SYN=1, seq=client_isn)
  2. server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  3. client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
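The sequence and acknowledgement numbers of the three segments can be traced with a short Java sketch. The `Handshake` and `Segment` classes are hypothetical, only the SYN flag and the two number fields are modelled, and -1 marks an unused ACK field. Real sequence numbers are 32-bit values that wrap around, which is ignored here.

```java
// Sketch of the numbers exchanged in the three-way handshake. All names are hypothetical.
class Handshake {
    // Step 1: client sends SYN carrying client_isn
    static Segment syn(int clientIsn) {
        return new Segment(1, clientIsn, -1);
    }

    // Step 2: server replies with SYNACK: its own server_isn, and ack = client_isn + 1
    static Segment synAck(Segment request, int serverIsn) {
        return new Segment(1, serverIsn, request.seq + 1);
    }

    // Step 3: client replies with SYN=0, seq = client_isn + 1, ack = server_isn + 1
    static Segment ack(Segment reply) {
        return new Segment(0, reply.ack, reply.seq + 1);
    }

    public static void main(String[] args) {
        Segment s1 = syn(1000);
        Segment s2 = synAck(s1, 5000);
        Segment s3 = ack(s2);
        System.out.println(s1 + "\n" + s2 + "\n" + s3);
    }
}

class Segment {
    final int syn;  // SYN flag: 1 or 0
    final int seq;  // sequence number
    final int ack;  // acknowledgement number, -1 if the ACK field is not used

    Segment(int syn, int seq, int ack) {
        this.syn = syn;
        this.seq = seq;
        this.ack = ack;
    }

    @Override
    public String toString() {
        return "SYN=" + syn + " seq=" + seq + " ack=" + ack;
    }
}
```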


                                                                                                                              TCP Connection Management (cont)

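Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close timeline]
  1. client -> server: FIN (client: close)
  2. server -> client: ACK
  3. server -> client: FIN (server: close)
  4. client -> server: ACK; client enters timed wait, then closed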


                                                                                                                              TCP Connection Management (cont)

                                                                                                                              Step 3 client receives FIN replies with ACK

                                                                                                                              Enters ldquotimed waitrdquo ndashduring which will respond with ACK to received FINs (that might arrive if ACK gets lost)

                                                                                                                              Closes down after timed-wait

                                                                                                                              Step 4 server receives ACK Connection closed

                                                                                                                              Note with small modification can handle simultaneous FINs

[Figure: client-server timeline for closing a TCP connection, showing the FIN and ACK segments exchanged and the closing, timed wait, and closed states.]
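The client side of Steps 3 and 4 can be sketched as a tiny state machine. This is a minimal illustration, not a real TCP implementation: the state names follow the labels in the figure (closing, timed wait, closed), while the event strings "FIN" and "TIMER_EXPIRED" and the function name client_step are hypothetical names chosen for this sketch.

```python
from enum import Enum, auto


class ClientState(Enum):
    CLOSING = auto()     # client has sent its FIN and awaits the server's FIN
    TIMED_WAIT = auto()  # server's FIN was ACKed; linger in case the ACK is lost
    CLOSED = auto()


def client_step(state, event):
    """Return (next_state, segment_to_send); None means nothing is sent."""
    if event == "FIN" and state in (ClientState.CLOSING, ClientState.TIMED_WAIT):
        # Step 3: ACK the server's FIN. A repeated FIN during timed wait means
        # the earlier ACK was lost, so the ACK is sent again.
        return ClientState.TIMED_WAIT, "ACK"
    if event == "TIMER_EXPIRED" and state is ClientState.TIMED_WAIT:
        return ClientState.CLOSED, None
    return state, None  # every other combination is ignored in this sketch
```

The point of the sketch is the second FIN: the client stays in timed wait and answers again rather than closing immediately after its first ACK.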


                                                                                                                              TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                              A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.


                                                                                                                              Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!


Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput


Causes/costs of congestion: scenario 2

  one router, finite buffers
  sender retransmission of lost packet


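In cases (a), (b), and (c), always: $\lambda_{in} = \lambda_{out}$ (goodput). Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load (original data plus retransmissions).
  (a) "magic" transmission: only send when there's space in the buffer
  (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
  (c) retransmission of delayed (not lost) packets makes $\lambda'_{in}$ larger (than in the perfect case) for the same $\lambda_{out}$

"costs" of congestion:
  (b) and (c): more work (retransmissions) for a given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of a packet

[Figure: three plots, labeled (a), (b), and (c), one per case above.]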


Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                              Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see the next slide


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate is thus the minimum of the rates supportable by the switches on the path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI = 1, destination sets CI bit = 1 before returning RM cell to source
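The ER and EFCI rules above can be illustrated with a short sketch. It is a simplification under stated assumptions: rates are plain numbers in arbitrary units, each switch is reduced to a single "supportable rate", and the function names er_after_path and ci_at_destination are hypothetical. Real ATM switches compute their rates with more elaborate algorithms.

```python
def er_after_path(initial_er, supportable_rates):
    """Carry an RM cell's ER value across the switches on a path.

    supportable_rates[i] is the rate switch i can currently support; a switch
    lowers ER only when the cell's ER exceeds that rate.
    """
    er = initial_er
    for rate in supportable_rates:
        er = min(er, rate)
    return er


def ci_at_destination(ci_bit, efci_of_preceding_data_cell):
    """Destination sets CI = 1 if the data cell before the RM cell had EFCI = 1."""
    return 1 if (ci_bit == 1 or efci_of_preceding_data_cell == 1) else 0
```

For instance, an RM cell that starts with ER = 100 and crosses switches able to support 80, 50, and 120 comes back with ER = 50, which caps the sender's rate.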


TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: window of segments labeled CongWin.]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
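As a quick numeric check of the window-based throughput relation, the following sketch evaluates it for one set of values. The numbers (10 segments, 1460-byte MSS, 100 ms RTT) are illustrative assumptions, not figures from the slides, and window_throughput is a name made up for this example.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical numbers: 10 segments of 1460 bytes per 100 ms round trip.
rate = window_throughput(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec")  # prints "146000 bytes/sec"
```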


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
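The sending limit above amounts to a simple check before each transmission. A minimal sketch, assuming all quantities are byte counts and ignoring the receive window (as the slide does); the function names are hypothetical:

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes the sender may put in flight right now."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


def may_send(last_byte_sent, last_byte_acked, cong_win, segment_bytes):
    """True if sending segment_bytes keeps LastByteSent - LastByteAcked <= CongWin."""
    return segment_bytes <= bytes_allowed(last_byte_sent, last_byte_acked, cong_win)
```

With 2000 bytes in flight and CongWin = 4000, a 2000-byte segment may still be sent, but a 2001-byte segment may not.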


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

[Figure: congestion window (axis marks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection.]
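The two AIMD rules can be traced RTT by RTT. This is a minimal sketch under simplifying assumptions: time advances in whole RTTs, the MSS value of 1000 bytes is arbitrary, the caller supplies the RTTs in which a loss occurs, and the window is kept at or above 1 MSS. The names aimd_step and aimd_trace are made up for illustration.

```python
MSS = 1000  # bytes; an arbitrary illustrative value


def aimd_step(cong_win, loss):
    """Advance CongWin by one RTT: halve on a loss event, else add 1 MSS."""
    if loss:
        return max(MSS, cong_win // 2)  # keep the window at 1 MSS or more
    return cong_win + MSS


def aimd_trace(start, loss_rtts, n_rtts):
    """CongWin after each RTT; loss_rtts is the set of RTT indices with a loss."""
    trace = [start]
    for rtt in range(n_rtts):
        trace.append(aimd_step(trace[-1], rtt in loss_rtts))
    return trace
```

Calling aimd_trace(8000, {3}, 5) climbs by 1000 bytes per RTT, halves once at the loss, and then resumes the linear climb.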


                                                                                                                              TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


                                                                                                                              TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. Host A sends one segment in the first RTT, two segments in the second, and four segments in the third.]
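The doubling can be reproduced by applying the per-ACK increment literally. The sketch below assumes no loss, one ACK per segment, and a whole window sent per RTT; slow_start_rates is a hypothetical helper name.

```python
def slow_start_rates(mss_bytes, rtt_seconds, n_rtts):
    """Sending rate in bits/sec during each of the first n_rtts round trips."""
    cong_win = mss_bytes  # connection begins with CongWin = 1 MSS
    rates = []
    for _rtt in range(n_rtts):
        rates.append(cong_win * 8 / rtt_seconds)
        segments_in_flight = cong_win // mss_bytes
        # Every segment is ACKed (no loss), and each ACK adds 1 MSS to CongWin.
        for _ack in range(segments_in_flight):
            cong_win += mss_bytes
    return rates
```

With MSS = 500 bytes and RTT = 200 msec, the first entry is 20 kbps and each later entry is twice the one before it.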


So far:
  Slow-Start ramps up exponentially
  followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
  introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
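The difference between Tahoe and Reno comes down to one branch. In this sketch the window is measured in segments (so 1 MSS = 1), and the function name on_loss and the string labels are assumptions made for illustration:

```python
MSS = 1  # measure the window in segments to keep the numbers small


def on_loss(variant, loss_kind, cong_win):
    """Return (threshold, cong_win, next_phase) after a loss event.

    variant is "tahoe" or "reno"; loss_kind is "timeout" or "3dupack".
    """
    threshold = cong_win / 2
    if variant == "reno" and loss_kind == "3dupack":
        # Fast recovery: skip slow start and continue in AIMD from the threshold.
        return threshold, threshold, "congestion avoidance"
    # Timeout in either variant, or any loss event in Tahoe.
    return threshold, MSS, "slow start"
```

Starting from a window of 16 segments, Reno continues at 8 after a triple duplicate ACK, while Tahoe (and Reno after a timeout) restarts from 1.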


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                              The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
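The sender table above maps naturally onto a small class with one method per event. This is a sketch under assumptions, not a faithful TCP Reno implementation: windows are in bytes, the class and method names are invented here, the third duplicate ACK is what triggers the triple-duplicate-ACK row, and resetting the duplicate count afterwards is a choice made for this example.

```python
class RenoSender:
    """Sketch of the sender event table; windows are in bytes."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss         # connection begins with CongWin = 1 MSS
        self.threshold = threshold  # "initially very large" in practice
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # not below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        """Timeout: fall back to slow start from 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

Driving the object with a sequence of on_new_ack, on_dup_ack, and on_timeout calls reproduces the slow-start ramp, the switch to congestion avoidance, and the two kinds of window reduction.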


                                                                                                                              TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
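The 0.75 factor follows if, as a simplifying assumption, the window is taken to grow linearly from W/2 back to W between loss events, so that the average throughput is the midpoint of the two extremes:

```latex
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,\mathrm{RTT}} + \frac{W}{\mathrm{RTT}}\right)
  = \frac{3}{4}\cdot\frac{W}{\mathrm{RTT}}
  = 0.75\,\frac{W}{\mathrm{RTT}}
```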


                                                                                                                              TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$\text{throughput} = \frac{1.22 \cdot \text{MSS}}{\text{RTT} \sqrt{L}}$

So $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
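Both numbers in this example can be checked directly. The sketch below assumes the loss-rate relation throughput = 1.22 * MSS / (RTT * sqrt(L)), with MSS converted to bits so that the units match a throughput in bits per second; the function names are hypothetical.

```python
def required_window(throughput_bps, rtt_seconds, mss_bytes):
    """Segments that must be in flight to sustain throughput_bps."""
    return throughput_bps * rtt_seconds / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_seconds, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_seconds * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(f"W = {w:.0f} segments, L = {loss:.1e}")  # W = 83333 segments, L = 2.1e-10
```

The computed loss rate of roughly 2.1e-10 matches the slide's rounded figure: about one lost segment in every five billion.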


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
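The convergence toward the equal-share line can be sketched with a small simulation. This is only an idealized model, not TCP itself: it assumes both sessions have the same RTT, see every loss at the same time, and that loss happens exactly when the combined throughput exceeds the link capacity. The function name and parameters are hypothetical.

```python
def simulate_aimd(x1, x2, capacity, rounds, increase=1.0):
    """Simulate two synchronized AIMD flows sharing one bottleneck link.

    Assumption (not from the slides): both flows detect loss in the same
    round, which happens whenever their combined rate exceeds `capacity`.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease, each flow halves its rate.
            x1, x2 = x1 / 2.0, x2 / 2.0
        else:
            # Congestion avoidance: additive increase (slope 1 in the plot).
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


# Start far from the fair share: flow 1 has 5 units, flow 2 has 80.
a, b = simulate_aimd(x1=5.0, x2=80.0, capacity=100.0, rounds=500)
print(f"flow 1 = {a:.3f}, flow 2 = {b:.3f}, gap = {abs(a - b):.6f}")
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts it in half, so the gap shrinks toward zero over repeated loss events.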


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
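The arithmetic behind the parallel-connection example can be checked with a few lines of Python. The sketch assumes the idealized case in which every TCP connection on the link gets an equal share of R; the function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing, opened):
    """Fraction of the link rate R obtained by an app that opens `opened`
    connections on a link already carrying `existing` connections.

    Assumption: every connection receives an equal share of the link.
    """
    return Fraction(opened, existing + opened)


print(app_share(9, 1))   # share when the new app opens 1 connection
print(app_share(9, 11))  # share when the new app opens 11 connections
```

With 9 existing connections, one new connection yields 1/10 of R, while 11 new connections yield 11/20 of R, slightly more than half.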


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   (K = number of windows that cover the object)
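The two cases can be folded into one function, sketched below. The slides do not spell out how K is computed for a fixed window, so the code assumes K = ceil(O / (W·S)), that is, the number of W-segment windows needed to cover the object. The function name is hypothetical, and O, S and W are expected to be integers.

```python
def fixed_window_latency(O, S, R, RTT, W):
    """Latency of one object under a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec),
    RTT: round-trip time (sec), W: window size (segments).
    Assumption: K = ceil(O / (W * S)) windows cover the object.
    """
    K = -(-O // (W * S))              # integer ceiling of O / (W*S)
    stall = S / R + RTT - W * S / R   # server idle time after each window
    latency = 2 * RTT + O / R         # connection setup + request + transmission
    if stall > 0:
        # Case 2: the ACK returns after the window is sent, so the server
        # idles once after each of the first K-1 windows.
        latency += (K - 1) * stall
    return latency


# S/R = 1 sec, RTT = 2 sec, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2.0, W=4))  # case 1
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2.0, W=2))  # case 2
```

With W = 4 the window takes longer to send than the ACK takes to return, so there is no stall; with W = 2 the server idles between windows and the extra term appears.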


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[
\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\]

where P is the number of times TCP idles at server:

\[
P = \min\{Q,\ K-1\}
\]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes. One RTT to initiate the TCP connection, one RTT to request the object, then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, ending with "complete transmission" and "object delivered".]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.
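The slow-start model can be sketched in Python by counting windows and idle periods directly instead of using closed forms. The function name is hypothetical, and the numeric values in the usage lines (S/R = 1 sec, RTT = 2 sec) are assumptions chosen only so that the result matches the example with O/S = 15, K = 4 and Q = 2; the slides do not give them.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for one object under idealized slow start
    (no threshold, no loss), with window k carrying 2^(k-1) segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec),
    RTT: round-trip time (sec).
    """
    # K: smallest k such that the first k windows cover the object,
    # i.e. (2^0 + 2^1 + ... + 2^(k-1)) * S >= O.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1

    # Q: number of windows after which the server would idle if the object
    # were infinitely large, i.e. windows with positive idle time.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1

    P = min(Q, K - 1)

    # Sum of the idle times after windows 1..P.
    idle = sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))
    return 2 * RTT + O / R + idle, K, Q, P


latency, K, Q, P = slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0)
print(latency, K, Q, P)
```

For these values the summed idle times agree with the closed form 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R.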


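TCP Latency Modeling (3)

\[
\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}
\]

\[
2^{k-1}\frac{S}{R} = \text{time to transmit the $k$th window}
\]

\[
\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the $k$th window}
\]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: timing diagram with "time at client" and "time at server" axes. One RTT to initiate the TCP connection, one RTT to request the object, then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, ending with "complete transmission" and "object delivered".]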


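TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, number of idles for infinite-size object, is similar.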
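As a quick check of the closed form for K, consider two object sizes. O/S = 15 is the value used in the earlier example; O/S = 16 is chosen here only for illustration, to show the ceiling taking effect.

```latex
\[
\begin{aligned}
O/S = 15:&\quad K = \lceil \log_2(15 + 1) \rceil = \lceil 4 \rceil = 4
  &&(1 + 2 + 4 + 8 = 15 \ge 15) \\
O/S = 16:&\quad K = \lceil \log_2(16 + 1) \rceil \approx \lceil 4.09 \rceil = 5
  &&(1 + 2 + 4 + 8 = 15 < 16 \le 31)
\end{aligned}
\]
```

One extra segment beyond 15 forces a fifth window, because the first four windows carry at most 15 segments.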


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
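The three response-time formulas can be compared numerically with a short sketch. It deliberately leaves out the "sum of idle times" term, which depends on slow start and differs between the schemes, so the results are lower bounds under this model. The function name is hypothetical, and treating 5 Kbytes as 40,000 bits is an assumption.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times (sec) for the three HTTP variants, WITHOUT the
    slow-start idle-time term.

    O: object size (bits), R: link rate (bits/sec), RTT: round-trip time (sec),
    M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push the base page plus M images
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# O = 5 Kbytes (taken as 40,000 bits), RTT = 100 msec, M = 10, X = 5, R = 1 Mbps.
for name, seconds in http_response_times(40_000, 1_000_000, 0.1, 10, 5).items():
    print(f"{name}: {seconds:.2f} s")
```

All three variants share the same transmission term, so they differ only in how many round trips they pay for.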


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

• principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
• instantiation and implementation in the Internet
  - UDP
  - TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                              • Chapter 3 Transport Layer last revised 160305
                                                                                                                              • Chapter 3 outline
                                                                                                                              • Transport services and protocols
                                                                                                                              • Transport vs network layer
                                                                                                                              • Transport-layer protocols
                                                                                                                              • Chapter 3 outline
                                                                                                                              • Multiplexingdemultiplexing
                                                                                                                              • Multiplexingdemultiplexing
                                                                                                                              • How demultiplexing works
                                                                                                                              • Connectionless demultiplexing
                                                                                                                              • Connectionless demux (cont)
                                                                                                                              • Connection-oriented demux
                                                                                                                              • Connection-oriented demux (cont)
                                                                                                                              • Connection-oriented demux Threaded Web Server
                                                                                                                              • Chapter 3 outline
                                                                                                                              • UDP User Datagram Protocol [RFC 768]
                                                                                                                              • UDP more
                                                                                                                              • UDP checksum
                                                                                                                              • Chapter 3 outline
                                                                                                                              • Principles of Reliable data transfer
                                                                                                                              • Reliable data transfer getting started
                                                                                                                              • Reliable data transfer getting started
                                                                                                                              • Incremental Improvements
                                                                                                                              • rdt1.0: reliable transfer over a reliable channel
                                                                                                                              • rdt2.0: channel with bit errors
                                                                                                                              • rdt2.0: FSM specification
                                                                                                                              • rdt2.0: operation with no errors
                                                                                                                              • rdt2.0: error scenario
                                                                                                                              • rdt2.0 has a fatal flaw
                                                                                                                              • rdt2.1: sender, handles garbled ACK/NAKs
                                                                                                                              • rdt2.1: receiver, handles garbled ACK/NAKs
                                                                                                                              • rdt2.1: discussion
                                                                                                                              • rdt2.2: a NAK-free protocol
                                                                                                                              • rdt2.2: sender, receiver fragments
                                                                                                                              • rdt3.0: channels with errors and loss
                                                                                                                              • rdt3.0 sender
                                                                                                                              • rdt3.0 in action
                                                                                                                              • rdt3.0 in action
                                                                                                                              • Performance of rdt3.0
                                                                                                                              • rdt3.0: stop-and-wait operation
                                                                                                                              • Pipelined protocols
                                                                                                                              • Pipelined protocols
                                                                                                                              • Pipelining increased utilization
                                                                                                                              • Go-Back-N
                                                                                                                              • GBN Sender
                                                                                                                              • GBN sender extended FSM
                                                                                                                              • GBN receiver extended FSM
                                                                                                                              • More on receiver
                                                                                                                              • GBN in action
                                                                                                                              • Selective Repeat
                                                                                                                              • Selective repeat sender receiver windows
                                                                                                                              • Selective repeat
                                                                                                                              • Selective repeat in action
                                                                                                                              • Selective repeat dilemma
                                                                                                                              • Chapter 3 outline
                                                                                                                              • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                              • More TCP Details
                                                                                                                              • Even More TCP Details
                                                                                                                              • TCP segment structure
                                                                                                                              • TCP seq. #'s and ACKs
                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                              • Example RTT estimation
                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                              • Chapter 3 outline
                                                                                                                              • TCP reliable data transfer
                                                                                                                              • TCP sender events
                                                                                                                              • TCP sender(simplified)
                                                                                                                              • TCP retransmission scenarios
                                                                                                                              • TCP retransmission scenarios (more)
                                                                                                                              • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                              • More on Sender Policies
                                                                                                                              • Fast Retransmit
                                                                                                                              • Fast retransmit algorithm
                                                                                                                              • TCP GBN or Selective Repeat
                                                                                                                              • Chapter 3 outline
                                                                                                                              • TCP Flow Control
                                                                                                                              • TCP Flow Control
                                                                                                                              • TCP segment structure
                                                                                                                              • TCP Flow control how it works
                                                                                                                              • Technical Issue
                                                                                                                              • Chapter 3 outline
                                                                                                                              • TCP Connection Management
                                                                                                                              • TCP Connection Management (cont)
                                                                                                                              • TCP Connection Management (cont)
                                                                                                                              • TCP Connection Management (cont)
                                                                                                                              • TCP Connection Management (cont)
                                                                                                                              • A few special cases
                                                                                                                              • Chapter 3 outline
                                                                                                                              • Principles of Congestion Control
                                                                                                                              • Causes/costs of congestion: scenario 1
                                                                                                                              • Causes/costs of congestion: scenario 2
                                                                                                                              • Causes/costs of congestion: scenario 3
                                                                                                                              • Causes/costs of congestion: scenario 3
                                                                                                                              • Approaches towards congestion control
                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                              • Chapter 3 outline
                                                                                                                              • TCP Congestion Control
                                                                                                                              • TCP AIMD
                                                                                                                              • TCP Slow Start
                                                                                                                              • TCP Slow Start (more)
                                                                                                                              • Summary TCP Congestion Control
                                                                                                                              • The Big Picture
                                                                                                                              • TCP sender congestion control
                                                                                                                              • TCP throughput
                                                                                                                              • TCP Futures
                                                                                                                              • TCP Fairness
                                                                                                                              • Why is TCP fair
                                                                                                                              • Fairness (more)
                                                                                                                              • TCP Latency Modeling
                                                                                                                              • Fixed Congestion Window (W)
                                                                                                                              • Fixed congestion window (1)
                                                                                                                              • Fixed congestion window (2)
                                                                                                                              • TCP Latency Modeling Slow Start (1)
                                                                                                                              • TCP Latency Modeling Slow Start (2)
                                                                                                                              • TCP Latency Modeling (3)
                                                                                                                              • TCP Latency Modeling (4)
                                                                                                                              • HTTP Modeling
                                                                                                                              • Chapter 3 Summary

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT

• Exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125
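Unrolling the recurrence shows why old samples fade: a sample taken k updates ago contributes with weight α(1 - α)^k. The sketch below is illustrative only; the function names and the sample values are made up for the example and do not come from the slides.

```python
def ewma_update(estimated_rtt, sample_rtt, alpha=0.125):
    """One EstimatedRTT update step, following the formula above."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def sample_weight(k, alpha=0.125):
    """Weight carried by the sample taken k updates ago: alpha * (1 - alpha)^k."""
    return alpha * (1 - alpha) ** k


if __name__ == "__main__":
    est = 100.0  # hypothetical starting estimate, in ms
    for sample in [120.0, 110.0, 300.0, 105.0]:  # made-up SampleRTT values, in ms
        est = ewma_update(est, sample)
        print(f"SampleRTT={sample:6.1f} ms  EstimatedRTT={est:6.2f} ms")
    # Weights of the most recent sample and the three before it
    print([round(sample_weight(k), 4) for k in range(4)])
```

Note how the single 300 ms outlier moves the estimate only by one eighth of its distance from the current value.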

Example RTT estimation

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. SampleRTT and EstimatedRTT plotted as RTT (milliseconds, axis 100 to 350) against time (seconds, axis 1 to 106).]

TCP Round Trip Time and Timeout

Setting the timeout
• EstimatedRTT plus "safety margin"
• large variation in EstimatedRTT -> larger safety margin
• first estimate how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4 * DevRTT
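The three formulas can be combined into one small estimator. This is a sketch under stated assumptions, not a reference implementation: the class name is hypothetical, and two details the slides leave open are borrowed from RFC 6298 (the estimate is seeded from the first sample with DevRTT set to half of it, and DevRTT is updated using the EstimatedRTT value from before the new sample is folded in).

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        # Assumption (RFC 6298 style): seed from the first measurement.
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        # Assumption: DevRTT uses the old EstimatedRTT, then the estimate is updated.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt

    def timeout_interval(self):
        # EstimatedRTT plus a safety margin of four deviations
        return self.estimated_rtt + 4 * self.dev_rtt


if __name__ == "__main__":
    est = RttEstimator(first_sample=100.0)   # made-up values, in ms
    for sample in [120.0, 95.0, 250.0, 100.0]:
        est.update(sample)
        print(f"sample={sample:6.1f}  est={est.estimated_rtt:7.2f}  "
              f"dev={est.dev_rtt:6.2f}  timeout={est.timeout_interval():7.2f}")
```

A jittery path inflates DevRTT and therefore the timeout, which is the "larger safety margin" behaviour described above.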

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer

Retransmissions are triggered by:
• timeout events
• duplicate acks

Initially consider simplified TCP sender:
• ignore duplicate acks
• ignore flow control, congestion control

TCP sender events

data rcvd from app:
• create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

ACK rcvd:
• if it acknowledges previously unacked segments:
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
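The three events can be exercised in a few lines of Python. This is a minimal sketch of the simplified sender only: the class and method names are hypothetical, the timer is modelled as a boolean rather than a real clock, "passing to IP" just appends to a log, and sequence-number wraparound is ignored.

```python
class SimplifiedTcpSender:
    """Event-driven sketch: no duplicate-ACK handling, flow control or congestion control."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq number -> payload, not yet acknowledged
        self.timer_running = False  # stand-in for the single retransmission timer
        self.sent = []              # log of (seq, length) passed to "IP"

    def _pass_to_ip(self, seq, data):
        self.sent.append((seq, len(data)))

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True           # start timer
        self._pass_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)                 # smallest unacknowledged seq number
        self._pass_to_ip(seq, self.unacked[seq])
        self.timer_running = True               # restart timer

    def on_ack(self, y):
        if y > self.send_base:                  # ACK covers new data
            self.send_base = y
            # Cumulative ACK: drop every segment that ends at or before byte y
            for seq in [s for s in self.unacked if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


def demo():
    """Two segments sent, a timeout on the oldest one, then one cumulative ACK."""
    sender = SimplifiedTcpSender(initial_seq_num=92)
    sender.on_app_data(b"a" * 8)     # Seq=92, 8 bytes
    sender.on_app_data(b"b" * 20)    # Seq=100, 20 bytes
    sender.on_timeout()              # retransmits only Seq=92
    sender.on_ack(120)               # acknowledges everything up to byte 119
    return sender


if __name__ == "__main__":
    s = demo()
    print(s.sent, s.send_base, s.timer_running)
```

Only the oldest unacknowledged segment is retransmitted on timeout, which is where this sender differs from Go-Back-N.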

TCP retransmission scenarios

[Figure: two Host A / Host B timelines.
Lost ACK scenario: A sends Seq=92, 8 bytes data; B's ACK=100 is lost (X); the Seq=92 timer expires; A retransmits Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.
Premature timeout: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B replies ACK=100 and ACK=120; the Seq=92 timer expires before the ACKs arrive, so A retransmits Seq=92, 8 bytes data; B replies ACK=120; SendBase moves from 100 to 120 and stays at 120.]

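TCP retransmission scenarios (more)

[Figure: Cumulative ACK scenario, Host A / Host B timeline. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B's ACK=100 is lost (X), but ACK=120 arrives before the timeout; SendBase = 120 and nothing is retransmitted.]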

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: Arrival of segment that partially or completely fills gap.
TCP receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
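The four receiver rules above can be read as a small decision function. The following is a minimal sketch, not an actual TCP implementation: the class name AckPolicy, the Action names, and the two boolean inputs are assumptions made for illustration, sequence-number wraparound is ignored, and the 500 ms delayed-ACK timer itself is not modelled.

```java
// Hypothetical sketch: all names and the boolean inputs are illustrative assumptions.
final class AckPolicy {
    enum Action { DELAYED_ACK, CUMULATIVE_ACK, DUPLICATE_ACK, IMMEDIATE_ACK }

    /**
     * seq         first byte of the arriving segment
     * expected    next in-order byte the receiver is waiting for
     * ackPending  an earlier in-order segment still has its ACK delayed
     * gapBuffered out-of-order data is already buffered above expected
     */
    static Action onSegment(long seq, long expected, boolean ackPending, boolean gapBuffered) {
        if (seq > expected) {
            return Action.DUPLICATE_ACK;   // out-of-order arrival: gap detected
        }
        if (seq < expected) {
            return Action.DUPLICATE_ACK;   // assumption: old data, a case the table does not cover
        }
        if (gapBuffered) {
            return Action.IMMEDIATE_ACK;   // segment starts at the lower end of the gap
        }
        if (ackPending) {
            return Action.CUMULATIVE_ACK;  // one ACK covers both in-order segments
        }
        return Action.DELAYED_ACK;         // wait for the next segment before ACKing
    }

    public static void main(String[] args) {
        System.out.println(onSegment(100, 100, false, false));
    }
}
```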


                                                                                                                                More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of congestion control
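The doubling rule can be sketched in a few lines. This is an illustration only: the class and method names are hypothetical, and the reset formula EstimatedRTT + 4 * DevRTT is the usual textbook one, assumed here to be what "as described previously" refers to.

```java
// Hypothetical sketch: class and method names are illustrative assumptions.
final class TimeoutInterval {
    private double intervalMs;

    TimeoutInterval(double estimatedRttMs, double devRttMs) {
        reset(estimatedRttMs, devRttMs);
    }

    /** Reset on new data from above or on an ACK (assumed formula: EstimatedRTT + 4 * DevRTT). */
    void reset(double estimatedRttMs, double devRttMs) {
        intervalMs = estimatedRttMs + 4 * devRttMs;
    }

    /** After a timeout and the resulting retransmission, double the interval. */
    void onTimeout() {
        intervalMs *= 2;
    }

    double currentMs() {
        return intervalMs;
    }

    public static void main(String[] args) {
        TimeoutInterval t = new TimeoutInterval(100, 25);
        t.onTimeout();
        t.onTimeout();
        System.out.println(t.currentMs() + " ms after two consecutive timeouts");
    }
}
```

Consecutive timeouts multiply the interval by 2 each time, which is the exponential growth the slide mentions; any reset discards the doubling.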


                                                                                                                                Fast Retransmit

Time-out period often relatively long
  long delay before resending lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost
  fast retransmit: resend segment before timer expires


                                                                                                                                Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    // a duplicate ACK for already ACKed segment
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    // fast retransmit
        }
    }
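A runnable sketch of the ACK event handler above, under stated assumptions: the class FastRetransmitSender and its fields are hypothetical, "resend" is modelled by recording the sequence number in a list, the amount of sent data is fixed at construction, and ACKs below SendBase are simply ignored (a case the pseudocode does not address).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: class, field, and method names are illustrative assumptions.
final class FastRetransmitSender {
    private long sendBase;          // oldest unacknowledged byte
    private final long nextSeqNum;  // first byte not yet sent (fixed here for simplicity)
    private int dupAckCount;        // duplicate ACKs seen for the current sendBase
    boolean timerRunning;
    final List<Long> resent = new ArrayList<>();  // stand-in for actual retransmissions

    FastRetransmitSender(long sendBase, long nextSeqNum) {
        this.sendBase = sendBase;
        this.nextSeqNum = nextSeqNum;
        this.timerRunning = sendBase < nextSeqNum;
    }

    void onAck(long y) {
        if (y > sendBase) {
            sendBase = y;                          // new data acknowledged
            dupAckCount = 0;
            timerRunning = sendBase < nextSeqNum;  // timer runs only while data is unACKed
        } else if (y == sendBase) {
            dupAckCount++;                         // duplicate ACK for already ACKed data
            if (dupAckCount == 3) {
                resent.add(y);                     // fast retransmit of segment y
            }
        }
        // y < sendBase: stale ACK, ignored (assumption)
    }

    long sendBase() {
        return sendBase;
    }

    public static void main(String[] args) {
        FastRetransmitSender s = new FastRetransmitSender(100, 200);
        s.onAck(120);
        for (int i = 0; i < 3; i++) {
            s.onAck(120);
        }
        System.out.println("resent: " + s.resent);
    }
}
```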


TCP: GBN or Selective Repeat?

                                                                                                                                Basic TCP looks a lot like GBN

                                                                                                                                Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                                This looks a lot like Selective Repeat

                                                                                                                                TCP is a hybrid


                                                                                                                                TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
  app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate


                                                                                                                                TCP segment structure

[Figure: TCP segment layout, 32 bits wide, top to bottom]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flags U A P R S F, Receive window
  checksum, Urg data pnter
  Options (variable length)
  application data (variable length)

Field notes
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence number, acknowledgement number: counting by bytes of data (not segments)


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)
spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
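The window arithmetic above can be checked with a short sketch. The class and method names are hypothetical, and the sender-side variables lastByteSent and lastByteAcked are assumed names for "unACKed data"; they do not appear on the slide.

```java
// Hypothetical sketch: names are illustrative; lastByteSent/lastByteAcked are assumed variables.
final class ReceiveWindow {
    /** Receiver side: RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]. */
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Sender side: bytes that may still be sent while keeping unACKed data within RcvWindow. */
    static long sendable(long lastByteSent, long lastByteAcked, long rcvWindow) {
        long unacked = lastByteSent - lastByteAcked;
        return Math.max(0, rcvWindow - unacked);
    }

    public static void main(String[] args) {
        long window = rcvWindow(4096, 3000, 1000);
        System.out.println("RcvWindow = " + window);
        System.out.println("sendable  = " + sendable(5000, 4000, window));
    }
}
```

With a 4096-byte buffer holding 2000 unread bytes, the advertised window is 2096 bytes; a sender with 1000 bytes in flight may send 1096 more.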


                                                                                                                                Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.


                                                                                                                                Note on UDP

                                                                                                                                UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


                                                                                                                                TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
  initialize TCP variables:
    seq #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake
Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


                                                                                                                                TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that it is synchronized

[Figure: three-way handshake]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
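The sequence and acknowledgement numbers exchanged in the three steps can be traced with a small sketch. The Handshake and Segment types below are hypothetical and model only the fields used in the handshake; real TCP sequence numbers are 32 bits and wrap around, which is ignored here.

```java
// Hypothetical sketch: Handshake and Segment are illustrative types, not a real TCP API.
final class Handshake {
    /** Only the header fields the handshake uses. */
    static final class Segment {
        final boolean syn;
        final boolean ackValid;
        final long seq;
        final long ack;

        Segment(boolean syn, boolean ackValid, long seq, long ack) {
            this.syn = syn;
            this.ackValid = ackValid;
            this.seq = seq;
            this.ack = ack;
        }
    }

    /** Step 1: client sends SYN carrying its initial sequence number. */
    static Segment syn(long clientIsn) {
        return new Segment(true, false, clientIsn, 0);
    }

    /** Step 2: server replies with SYNACK: its own ISN, ack = client_isn + 1. */
    static Segment synAck(Segment syn, long serverIsn) {
        return new Segment(true, true, serverIsn, syn.seq + 1);
    }

    /** Step 3: client replies with SYN=0, seq = client_isn + 1, ack = server_isn + 1. */
    static Segment ack(Segment synAck) {
        return new Segment(false, true, synAck.ack, synAck.seq + 1);
    }

    public static void main(String[] args) {
        Segment s1 = syn(1000);
        Segment s2 = synAck(s1, 5000);
        Segment s3 = ack(s2);
        System.out.println("final ACK: seq=" + s3.seq + " ack=" + s3.ack);
    }
}
```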


                                                                                                                                TCP Connection Management (cont)

Closing a connection
  client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

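[Figure: closing a connection. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and stays in timed wait before the connection is closed.]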


                                                                                                                                TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed-wait
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs

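[Figure: client sends FIN, server replies with ACK and then its own FIN, client replies with ACK. Both sides pass through closing; the client stays in timed wait, then both are closed.]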


                                                                                                                                TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                                A few special cases

                                                                                                                                Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                                It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                                                                                Principles of Congestion Control

Congestion
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem


Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  Send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput


Causes/costs of congestion: scenario 2
  one router, finite buffers
  sender retransmission of lost packet


(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout
Here λ'in denotes the offered load, that is, original data plus retransmissions.

"costs" of congestion
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels, (a), (b), (c)]

Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
  single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
  NI bit: no increase in rate (mild congestion)
  CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
  (small exception: see next slide)

ABR: available bit rate:
"elastic service"
if sender's path "underloaded":
  sender should use available bandwidth
if sender's path congested:
  sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
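
The ER and EFCI/CI rules above can be sketched in a few lines of Python. This is only an illustration: the `RMCell` class, the function names, and the idea of giving each switch a single "supportable rate" are assumptions made for the example, not part of the ATM specification.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RMCell:
    er: float         # explicit rate carried in the RM cell
    ci: bool = False  # congestion indication bit


def pass_through_switches(cell, supportable_rates):
    """Each switch may lower ER to the rate it can support; none raises it."""
    for rate in supportable_rates:
        cell = replace(cell, er=min(cell.er, rate))
    return cell


def return_from_destination(cell, last_data_cell_efci):
    """Destination sets CI=1 if the data cell preceding the RM cell had EFCI=1."""
    return replace(cell, ci=cell.ci or last_data_cell_efci)


# Hypothetical path with three switches supporting rates 80, 40 and 60.
cell = pass_through_switches(RMCell(er=100.0), [80.0, 40.0, 60.0])
cell = return_from_destination(cell, last_data_cell_efci=True)
print(cell.er, cell.ci)
```

In this run the cell comes back with ER equal to the smallest rate on the path (40.0) and CI set, which is the information the sender uses to pick its rate.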

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control
end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

w segments, each with MSS bytes, sent in one RTT:

throughput = $\frac{w \cdot MSS}{RTT}$ bytes/sec
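
As a quick numeric check of the throughput expression above, here is a minimal Python sketch. The function name and the sample values (8 segments of 500 bytes, RTT of 0.25 s) are made up for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent in each RTT."""
    return w_segments * mss_bytes / rtt_seconds


# 8 segments * 500 bytes = 4000 bytes per RTT of 0.25 s
print(window_throughput(8, 500, 0.25))  # 16000.0 bytes/sec
```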

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase, Multiplicative Decrease
slow start = CongWin set to 1 and then grows exponentially
conservative after timeout events
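
The window constraint and the loss-event definition above can be written as two small predicates. This is a sketch only; the function names and the `nbytes` parameter (the size of the data the sender would like to send next) are assumptions for the example.

```python
def may_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """True if sending nbytes more keeps unacked data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= cong_win


def is_loss_event(timed_out, duplicate_acks):
    """A loss event is a timeout or (at least) 3 duplicate ACKs."""
    return timed_out or duplicate_acks >= 3


print(may_send(3000, 1000, 4000, 1000))  # True: 2000 in flight + 1000 <= 4000
print(may_send(3000, 1000, 2500, 1000))  # False: 3000 > 2500
print(is_loss_event(False, 3))           # True
```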

TCP AIMD
multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
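
A small Python sketch can make the AIMD behaviour concrete. It tracks CongWin in units of MSS, one value per RTT. The function name, the choice to specify losses as a set of RTT indices, and the use of integer halving are assumptions made for this illustration.

```python
def aimd_trace(start_mss, rounds, loss_rounds):
    """CongWin (in MSS) at the start of each RTT under AIMD."""
    cong_win = start_mss
    trace = []
    for r in range(rounds):
        trace.append(cong_win)
        if r in loss_rounds:
            cong_win = max(1, cong_win // 2)  # multiplicative decrease
        else:
            cong_win += 1                     # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(8, 8, {3}))  # [8, 9, 10, 11, 5, 6, 7, 8]
```

The window climbs by one MSS per RTT, is halved at the RTT where the loss occurs, and then starts climbing again.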

                                                                                                                                TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
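
The 20 kbps figure in the example above follows directly from MSS/RTT. A minimal check in Python (the function name is an assumption for the example):

```python
def initial_rate_kbps(mss_bytes, rtt_ms):
    """Rate with one MSS per RTT; bits per millisecond equals kbps."""
    return mss_bytes * 8 / rtt_ms


print(initial_rate_kbps(500, 200))  # 20.0
```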

                                                                                                                                TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
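
The link between "increment per ACK" and "double per RTT" can be seen in a short simulation. This sketch assumes one ACK per segment and no loss; the function name and parameters are made up for the example.

```python
def slow_start_rounds(mss, num_rtts):
    """CongWin (bytes) at the start of each RTT during slow start."""
    cong_win = mss
    sizes = []
    for _ in range(num_rtts):
        sizes.append(cong_win)
        acks = cong_win // mss  # one ACK per segment sent in this RTT
        for _ in range(acks):
            cong_win += mss     # +1 MSS for every ACK received
    return sizes


print(slow_start_rounds(500, 4))  # [500, 1000, 2000, 4000]
```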

So Far
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno)
Introduce new variable: threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:
  3 dup ACKs: cut CongWin in half and set threshold to the new CongWin (now in standard AIMD)
  Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
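
The difference between Reno and Tahoe on a loss event can be summarised in one function. This is a sketch of the rules as stated in these slides, with CongWin measured in MSS; the function name, the string labels, and the return tuple are assumptions for the example.

```python
def on_loss(cong_win, event, variant="reno"):
    """Return (new CongWin, new threshold, new state) after a loss event.

    cong_win is in units of MSS; event is "triple_dup_ack" or "timeout".
    """
    threshold = cong_win / 2
    if event == "triple_dup_ack" and variant == "reno":
        # Fast recovery: skip slow start, continue in congestion avoidance.
        return max(threshold, 1.0), threshold, "CA"
    # Timeout (both variants) or any loss in Tahoe: back to slow start.
    return 1.0, threshold, "SS"


print(on_loss(16, "triple_dup_ack", "reno"))   # (8.0, 8.0, 'CA')
print(on_loss(16, "timeout", "reno"))          # (1.0, 8.0, 'SS')
print(on_loss(16, "triple_dup_ack", "tahoe"))  # (1.0, 8.0, 'SS')
```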

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
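
The event/state table above maps naturally onto a small state machine. The following Python class is a simplified sketch of those rules only; it ignores sequence numbers, retransmission, and timers, and the class name, method names, and default values are assumptions made for the example.

```python
class RenoSender:
    """Toy model of the TCP sender congestion-control table (window in bytes)."""

    def __init__(self, mss=1.0, threshold=64.0):
        self.mss = mss
        self.cong_win = mss         # connection starts with CongWin = 1 MSS
        self.threshold = threshold  # hypothetical "very large" initial value
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: only count it, unless it is the third in a row."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.dup_acks = 0
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # not below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.dup_acks = 0
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"


sender = RenoSender(mss=1.0, threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)  # CA 5.0
```

With a threshold of 4 MSS, the fourth ACK pushes CongWin above the threshold and the sender moves from slow start to congestion avoidance.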

                                                                                                                                TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
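
The 0.75 factor can be justified in one line, assuming the window grows linearly from W/2 back to W between consecutive losses and that RTT stays constant (both are idealisations of the sawtooth, not statements from the slides):

```latex
\[
\bar{w} \;=\; \frac{1}{2}\left(\frac{W}{2} + W\right) \;=\; \frac{3}{4}\,W
\qquad\Longrightarrow\qquad
\text{average throughput} \;=\; \frac{\bar{w}}{RTT} \;=\; 0.75\,\frac{W}{RTT}
\]
```

Because the ramp is linear, its time average is simply the midpoint of its two endpoints.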

                                                                                                                                TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

$L = 2 \cdot 10^{-10}$ Wow
New versions of TCP for high-speed needed!
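
Both numbers in this example can be reproduced with a few lines of Python. The sketch below takes the throughput relation $1.22 \cdot MSS / (RTT \sqrt{L})$ and solves it for L; the function names are assumptions for the example.

```python
def required_window_segments(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


print(int(required_window_segments(10e9, 0.1, 1500)))  # 83333
print(required_loss_rate(10e9, 0.1, 1500))             # roughly 2e-10
```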

TCP Fairness
Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?
Two competing sessions:
Additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
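
The convergence argument can be checked numerically. The sketch below makes strong simplifying assumptions that are not in the slides: both sessions have the same RTT, both increase by one unit per step, and both see a loss in the same step whenever their combined rate exceeds the capacity. The function name and the sample values are made up.

```python
def two_flow_aimd(x1, x2, capacity, steps):
    """Throughputs of two AIMD flows sharing one bottleneck link."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2  # loss: both halve (multiplicative decrease)
        else:
            x1, x2 = x1 + 1, x2 + 1  # no loss: both add 1 (additive increase)
    return x1, x2


# Start far from the fair share: 1 versus 80 on a link of capacity 100.
x1, x2 = two_flow_aimd(1.0, 80.0, 100.0, 1000)
print(abs(x1 - x2) < 1e-6)  # True: the two rates have become equal
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in half, so the gap shrinks toward zero and the flows move toward the equal-share line.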

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
  do not want rate throttled by congestion control
Instead use UDP:
  pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections;
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets about R/2 (11R/20)
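
The parallel-connection example is simple arithmetic once we assume that TCP gives every connection on the link an equal share. A short sketch (the function name is an assumption for the example):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20, i.e. slightly more than half of R
```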

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)
Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   (K is the number of windows that cover the object, K = ⌈O/(WS)⌉)
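
The two fixed-window cases can be combined into one function. This is a sketch under the slide's assumptions (no loss, one link of rate R); the function name is made up, and K is taken to be the number of windows covering the object, O/(WS) rounded up, which is an assumption wherever the text does not define it.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/sec).
    """
    base = 2 * RTT + O / R
    stall = S / R + RTT - W * S / R  # idle time after each window, if positive
    if stall <= 0:
        return base                  # case 1: ACKs return before the window is sent
    K = math.ceil(O / (W * S))       # assumed: number of windows covering the object
    return base + (K - 1) * stall    # case 2: server idles after K-1 windows


# S/R = 1 s, RTT = 2 s, object of 8 segments
print(fixed_window_latency(8000, 1000, 1000, 2, W=4))  # 12.0 (case 1)
print(fixed_window_latency(8000, 1000, 1000, 2, W=2))  # 15.0 (case 2)
```

With W = 2 the sender idles for one second after each of the first three windows, which accounts for the extra 3 seconds over the W = 4 case.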

                                                                                                                                Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

                                                                                                                                Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

We will show that the delay for one object is

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server: $P = \min\{Q, K-1\}$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
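The closed-form latency can be evaluated directly once K and Q are known. The sketch below is illustrative only: all names are hypothetical, K is found as the smallest k whose windows 1, 2, 4, ... segments cover the object, and Q is counted by assuming the server idles after the kth window whenever S/R + RTT exceeds that window's transmission time 2^(k-1) S/R. It assumes O > 0.

```python
def windows_to_cover(O, S):
    """K: smallest k such that S * (2^0 + ... + 2^(k-1)) >= O."""
    k = 0
    while (2 ** k - 1) * S < O:
        k += 1
    return k

def idle_count_infinite_object(S, R, rtt):
    """Q: number of windows after which the server would idle for an unbounded object."""
    q = 0
    # Assumption: the server idles after window q+1 while S/R + RTT > 2^q * S/R.
    while S / R + rtt - (2 ** q) * S / R > 0:
        q += 1
    return q

def slow_start_latency(O, S, R, rtt):
    """Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R, with P = min(Q, K-1)."""
    K = windows_to_cover(O, S)
    Q = idle_count_infinite_object(S, R, rtt)
    P = min(Q, K - 1)
    return 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
```

With O/S = 15 segments and an assumed RTT of twice S/R (S = 1000 bits, R = 1000 bits/s, RTT = 2 s), this gives K = 4, Q = 2, P = 2 and a latency of 22 s.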


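TCP Latency Modeling: Slow Start (2)

[Figure: slow-start timing diagram with time at client and time at server. The client initiates the TCP connection and requests the object (one RTT each); the server sends a first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R) until transmission is complete and the object is delivered.]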

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.


                                                                                                                                TCP Latency Modeling (3)

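$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$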

[Figure: slow-start timing diagram (time at client, time at server) showing connection setup, the object request, and the first through fourth windows of S/R, 2S/R, 4S/R and 8S/R.]
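The derivation can be checked numerically by adding up the idle time after each window instead of using the closed form. This is a sketch under stated assumptions: names are illustrative, O > 0, and the server is assumed never to idle after the last (Kth) window because the object is finished by then.

```python
def slow_start_latency_by_sum(O, S, R, rtt):
    """delay = O/R + 2RTT + sum of idle times after windows 1 .. K-1."""
    # K: number of windows (of 1, 2, 4, ... segments) needed to cover the object.
    K = 0
    while (2 ** K - 1) * S < O:
        K += 1
    idle = 0.0
    for k in range(1, K):
        # [S/R + RTT - 2^(k-1) S/R]^+ : idle time after the kth window, never negative.
        idle += max(0.0, S / R + rtt - 2 ** (k - 1) * S / R)
    return O / R + 2 * rtt + idle
```

For S = 1000 bits, R = 1000 bits/s, RTT = 2 s and O = 15000 bits the idle times after the first three windows are 2 s, 1 s and 0 s, so the sum gives 22 s. Only the first P = min{Q, K-1} = 2 terms are nonzero, which is why the closed form sums over P windows.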


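TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

The calculation of Q, the number of idles for an infinite-size object, is similar.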
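As a quick check on the derivation of K, the closed form can be compared with a direct search over the defining inequality. The function names are illustrative. The floating-point logarithm could in principle round badly for very large O/S, so the integer search is the safer reference.

```python
import math

def k_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))

def k_by_search(O, S):
    """K = smallest k with (2^k - 1) * S >= O, using integer arithmetic only."""
    k = 0
    while (2 ** k - 1) * S < O:
        k += 1
    return k
```

For O/S = 15 both functions return 4. One more segment (O/S = 16) no longer fits in 1 + 2 + 4 + 8 segments, so both return 5.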


HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
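The three response-time formulas differ only in the RTT term, which a short function makes easy to compare. This is a simplified sketch: the names are illustrative, and the slow-start idle times are passed in as a single optional `idle` value (default 0) rather than modeled separately for each scheme.

```python
def http_response_times(O, R, rtt, M, X, idle=0.0):
    """Response times in seconds for a base page plus M images, each of O bits."""
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * rtt + idle,
        "persistent": transmit + 3 * rtt + idle,
        # Assumes M/X is an integer, as in the model above.
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * rtt + idle,
    }
```

For instance, `http_response_times(O=8000, R=1000, rtt=1, M=10, X=5)` gives 110 s, 91 s and 94 s for the three schemes, ignoring idle times. The 88 s transmission term is common to all three, so only the RTT overhead separates them.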


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (axis from 0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time (axis from 0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

Principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

Instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                                • Chapter 3 Transport Layer last revised 160305
                                                                                                                                • Chapter 3 outline
                                                                                                                                • Transport services and protocols
                                                                                                                                • Transport vs network layer
                                                                                                                                • Transport-layer protocols
                                                                                                                                • Chapter 3 outline
                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                • How demultiplexing works
                                                                                                                                • Connectionless demultiplexing
                                                                                                                                • Connectionless demux (cont)
                                                                                                                                • Connection-oriented demux
                                                                                                                                • Connection-oriented demux (cont)
                                                                                                                                • Connection-oriented demux Threaded Web Server
                                                                                                                                • Chapter 3 outline
                                                                                                                                • UDP User Datagram Protocol [RFC 768]
                                                                                                                                • UDP more
                                                                                                                                • UDP checksum
                                                                                                                                • Chapter 3 outline
                                                                                                                                • Principles of Reliable data transfer
                                                                                                                                • Reliable data transfer getting started
                                                                                                                                • Reliable data transfer getting started
                                                                                                                                • Incremental Improvements
                                                                                                                                • Rdt10 reliable transfer over a reliable channel
                                                                                                                                • Rdt20 channel with bit errors
                                                                                                                                • rdt20 FSM specification
                                                                                                                                • rdt20 operation with no errors
                                                                                                                                • rdt20 error scenario
                                                                                                                                • rdt20 has a fatal flaw
                                                                                                                                • rdt21 sender handles garbled ACKNAKs
                                                                                                                                • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                • rdt21 discussion
                                                                                                                                • rdt22 a NAK-free protocol
                                                                                                                                • rdt22 sender receiver fragments
                                                                                                                                • rdt30 channels with errors and loss
                                                                                                                                • rdt30 sender
                                                                                                                                • rdt30 in action
                                                                                                                                • rdt30 in action
                                                                                                                                • Performance of rdt30
                                                                                                                                • rdt30 stop-and-wait operation
                                                                                                                                • Pipelined protocols
                                                                                                                                • Pipelined protocols
                                                                                                                                • Pipelining increased utilization
                                                                                                                                • Go-Back-N
                                                                                                                                • GBN Sender
                                                                                                                                • GBN sender extended FSM
                                                                                                                                • GBN receiver extended FSM
                                                                                                                                • More on receiver
                                                                                                                                • GBN inaction
                                                                                                                                • Selective Repeat
                                                                                                                                • Selective repeat sender receiver windows
                                                                                                                                • Selective repeat
                                                                                                                                • Selective repeat in action
                                                                                                                                • Selective repeat dilemma
                                                                                                                                • Chapter 3 outline
                                                                                                                                • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                • More TCP Details
                                                                                                                                • Even More TCP Details
                                                                                                                                • TCP segment structure
                                                                                                                                • TCP seq rsquos and ACKs
                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                • Example RTT estimation
                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                • Chapter 3 outline
                                                                                                                                • TCP reliable data transfer
                                                                                                                                • TCP sender events
                                                                                                                                • TCP sender (simplified)
                                                                                                                                • TCP retransmission scenarios
                                                                                                                                • TCP retransmission scenarios (more)
                                                                                                                                • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                • More on Sender Policies
                                                                                                                                • Fast Retransmit
                                                                                                                                • Fast retransmit algorithm
                                                                                                                                • TCP: GBN or Selective Repeat?
                                                                                                                                • Chapter 3 outline
                                                                                                                                • TCP Flow Control
                                                                                                                                • TCP Flow Control
                                                                                                                                • TCP segment structure
                                                                                                                                • TCP Flow control: how it works
                                                                                                                                • Technical Issue
                                                                                                                                • Chapter 3 outline
                                                                                                                                • TCP Connection Management
                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                • A few special cases
                                                                                                                                • Chapter 3 outline
                                                                                                                                • Principles of Congestion Control
                                                                                                                                • Causes/costs of congestion: scenario 1
                                                                                                                                • Causes/costs of congestion: scenario 2
                                                                                                                                • Causes/costs of congestion: scenario 3
                                                                                                                                • Causes/costs of congestion: scenario 3
                                                                                                                                • Approaches towards congestion control
                                                                                                                                • Case study: ATM ABR congestion control
                                                                                                                                • Case study: ATM ABR congestion control
                                                                                                                                • Chapter 3 outline
                                                                                                                                • TCP Congestion Control
                                                                                                                                • TCP AIMD
                                                                                                                                • TCP Slow Start
                                                                                                                                • TCP Slow Start (more)
                                                                                                                                • Summary: TCP Congestion Control
                                                                                                                                • The Big Picture
                                                                                                                                • TCP sender congestion control
                                                                                                                                • TCP throughput
                                                                                                                                • TCP Futures
                                                                                                                                • TCP Fairness
                                                                                                                                • Why is TCP fair?
                                                                                                                                • Fairness (more)
                                                                                                                                • TCP Latency Modeling
                                                                                                                                • Fixed Congestion Window (W)
                                                                                                                                • Fixed congestion window (1)
                                                                                                                                • Fixed congestion window (2)
                                                                                                                                • TCP Latency Modeling: Slow Start (1)
                                                                                                                                • TCP Latency Modeling: Slow Start (2)
                                                                                                                                • TCP Latency Modeling (3)
                                                                                                                                • TCP Latency Modeling (4)
                                                                                                                                • HTTP Modeling
                                                                                                                                • Chapter 3 Summary

Example RTT estimation

RTT: gaia.cs.umass.edu to fantasia.eurecom.fr

[Figure: SampleRTT and EstimatedRTT curves; vertical axis RTT (milliseconds), 100 to 350; horizontal axis time (seconds), 1 to 106]

TCP Round Trip Time and Timeout

Setting the timeout
EstimatedRTT plus "safety margin"
  large variation in EstimatedRTT -> larger safety margin
first estimate of how much SampleRTT deviates from EstimatedRTT:

DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|

(typically, β = 0.25)

Then set timeout interval:

TimeoutInterval = EstimatedRTT + 4*DevRTT
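
The two formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the EstimatedRTT update and its weight α = 0.125 do not appear on this slide and are assumed from the usual exponentially weighted moving average; starting DevRTT at zero and updating DevRTT before EstimatedRTT are also assumptions; the class and method names are hypothetical.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval (all in ms)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha                  # assumed weight for EstimatedRTT
        self.beta = beta                    # weight for DevRTT (slide: typically 0.25)
        self.estimated_rtt = first_sample   # assumption: seed with the first sample
        self.dev_rtt = 0.0                  # assumption: no deviation known yet

    def update(self, sample_rtt):
        # DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
        # (assumption: uses EstimatedRTT from before this sample is folded in)
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        # Assumed EWMA: EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4*DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)
est.update(120.0)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval())
```

A single sample that is 20 ms above the estimate moves EstimatedRTT only slightly but widens the safety margin by four times the new DevRTT, which is the point of the "larger safety margin" bullet.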

TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
  Create segment with seq #
  seq # is byte-stream number of first data byte in segment
  start timer if not already running (think of timer as for oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

Ack rcvd:
  If acknowledges previously unacked segments
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
      switch(event)

      event: data received from application above
          create TCP segment with sequence number NextSeqNum
          if (timer currently not running)
              start timer
          pass segment to IP
          NextSeqNum = NextSeqNum + length(data)

      event: timer timeout
          retransmit not-yet-acknowledged segment with
              smallest sequence number
          start timer

      event: ACK received, with ACK field value of y
          if (y > SendBase) {
              SendBase = y
              if (there are currently not-yet-acknowledged segments)
                  start timer
          }

    } /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
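
The event loop above can be mirrored by a small Python sketch. It is a simplified model, not a real TCP stack: the timer is reduced to a boolean flag, "pass segment to IP" is modelled as appending to a list, and stopping the timer once nothing is outstanding is an assumption (the pseudocode only says when to start it). The class name, method names, and the `sent` log are hypothetical.

```python
class SimplifiedTcpSender:
    """Simplified sender: no duplicate-ACK handling, flow or congestion control."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq     # NextSeqNum
        self.send_base = initial_seq    # SendBase
        self.unacked = {}               # seq number -> payload awaiting an ACK
        self.timer_running = False
        self.sent = []                  # hypothetical log of (seq, length) passed to IP

    def on_app_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True           # start timer
        self.sent.append((seq, len(data)))      # pass segment to IP
        self.next_seq += len(data)

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)                 # smallest unacked sequence number
        self.sent.append((seq, len(self.unacked[seq])))   # retransmit
        self.timer_running = True               # restart timer

    def on_ack(self, y):
        if y > self.send_base:                  # ACK covers new data
            self.send_base = y
            # Cumulative ACK: drop every segment that ends at or before byte y.
            for seq in [s for s in self.unacked if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            # Restart the timer only if segments are still outstanding.
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq=92)
sender.on_app_data(b"A" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"B" * 20)    # Seq=100, 20 bytes
sender.on_ack(120)               # a single cumulative ACK covers both segments
print(sender.send_base, sender.sent)
```

Because ACKs are cumulative, the one ACK with value 120 acknowledges both segments, so nothing is retransmitted even if an earlier ACK=100 never arrived.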

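TCP retransmission scenarios

[Figure: two Host A / Host B timelines]

Lost ACK scenario:
  Host A sends Seq=92, 8 bytes data
  Host B replies ACK=100, which is lost (X)
  Seq=92 timeout: Host A resends Seq=92, 8 bytes data
  Host B replies ACK=100; SendBase = 100

Premature timeout:
  Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  Host B replies ACK=100, then ACK=120
  Seq=92 timeout fires before the ACKs arrive: Host A resends Seq=92, 8 bytes data
  SendBase = 100, then SendBase = 120 as the ACKs arrive
  Host B replies ACK=120 to the retransmission; SendBase = 120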

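TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline]

Cumulative ACK scenario:
  Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  Host B replies ACK=100, which is lost (X)
  Host B replies ACK=120, which arrives before the timeout
  SendBase = 120; nothing is retransmitted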

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP Receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments

Event at Receiver: Arrival of out-of-order segment, higher-than-expected seq #. Gap detected
TCP Receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte

Event at Receiver: Arrival of segment that partially or completely fills gap
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap
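
The four receiver rules can be sketched as a small Python state machine. This is a hedged illustration only: the 500 ms wait is modelled by an explicit `on_delayed_ack_timer` call rather than a real clock, re-ACKing an already received segment is an assumption not covered by the table, and all names are hypothetical.

```python
class AckingReceiver:
    """Decides, per arriving segment, which ACK action the table prescribes."""

    def __init__(self, initial_seq):
        self.rcv_next = initial_seq   # seq # of next expected byte
        self.out_of_order = {}        # buffered segments: seq -> length
        self.ack_pending = False      # True while a delayed ACK is being held

    def on_segment(self, seq, length):
        if seq > self.rcv_next:
            # Out-of-order arrival, gap detected: duplicate ACK for expected byte.
            self.out_of_order[seq] = length
            self.ack_pending = False  # this ACK also covers any held ACK
            return ("duplicate ACK", self.rcv_next)
        if seq < self.rcv_next:
            # Assumption: old data is simply re-ACKed.
            return ("duplicate ACK", self.rcv_next)

        # Segment starts exactly at the expected byte.
        filled_gap = bool(self.out_of_order)
        self.rcv_next += length
        # Absorb buffered segments that are now contiguous.
        while self.rcv_next in self.out_of_order:
            self.rcv_next += self.out_of_order.pop(self.rcv_next)

        if filled_gap or self.ack_pending:
            # Fills lower end of a gap, or a second in-order segment: ACK now.
            self.ack_pending = False
            return ("immediate ACK", self.rcv_next)
        # First in-order segment with nothing pending: hold the ACK.
        self.ack_pending = True
        return ("delayed ACK", None)

    def on_delayed_ack_timer(self):
        # Called when the (up to 500 ms) wait ends with no further segment.
        if self.ack_pending:
            self.ack_pending = False
            return ("immediate ACK", self.rcv_next)
        return None


rcv = AckingReceiver(initial_seq=92)
print(rcv.on_segment(92, 8))     # first in-order segment: ACK is delayed
print(rcv.on_segment(100, 20))   # second in-order segment: cumulative ACK
print(rcv.on_segment(140, 10))   # gap at 120: duplicate ACK
print(rcv.on_segment(120, 20))   # fills the gap from its lower end
```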

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
  Limited form of Congestion Control
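
A minimal Python sketch of the doubling policy, assuming the base interval is EstimatedRTT + 4*DevRTT as on the timeout slide. The class and method names are hypothetical, and the estimates are passed in as fixed numbers rather than tracked from samples.

```python
class RetransmissionTimer:
    """Keeps the current Timeout Interval under the doubling policy (ms)."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.base_interval()

    def base_interval(self):
        # TimeoutInterval = EstimatedRTT + 4*DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        # After the retransmission, double the interval.
        self.interval *= 2

    def on_restart(self):
        # Timer restarted by new data from above or by a received ACK:
        # fall back to the value derived from the RTT estimates.
        self.interval = self.base_interval()


timer = RetransmissionTimer(estimated_rtt=100.0, dev_rtt=5.0)
timer.on_timeout()
timer.on_timeout()
print(timer.interval)    # two consecutive timeouts: four times the base interval
timer.on_restart()
print(timer.interval)    # back to the base interval
```

Backing off this way makes a sender retransmit less often while losses persist, which is why the slide calls it a limited form of congestion control.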

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {   /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y   /* fast retransmit */
            }
        }
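
The ACK handler above can be sketched in Python as follows. It is a hedged illustration: only the duplicate-ACK counting is modelled, the timer and the actual resend are left out, resetting the counter when new data is ACKed is an assumption, and the names are hypothetical. `on_ack` returns the sequence number to resend, or `None` when no fast retransmit is triggered.

```python
class DupAckTracker:
    """Counts duplicate ACKs and signals a fast retransmit on the third one."""

    def __init__(self, send_base):
        self.send_base = send_base   # SendBase
        self.dup_acks = 0            # duplicate ACKs seen for the current SendBase

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance SendBase.
            self.send_base = y
            self.dup_acks = 0        # assumption: the count restarts here
            return None
        # A duplicate ACK for an already ACKed segment.
        self.dup_acks += 1
        if self.dup_acks == 3:
            return y                 # fast retransmit: resend segment with seq y
        return None


tracker = DupAckTracker(send_base=92)
results = [tracker.on_ack(100) for _ in range(5)]
print(results)
```

In this run the first ACK=100 is new, the next three are duplicates, and the third duplicate triggers the resend of the segment starting at byte 100, before any timer expires.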

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

This looks a lot like Selective Repeat

TCP is a hybrid


TCP Flow Control

The sender should not overwhelm the receiver's capacity to receive data.
If necessary, the sender should slow its transmission rate to accommodate the receiver's rate.
This is different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)


TCP Flow Control

flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast

The receive side of a TCP connection has a receive buffer.

The app process may be slow at reading from the buffer.

Flow control is a speed-matching service: it matches the send rate to the receiving app's drain rate.


TCP segment structure

Figure: TCP segment layout. Each row is 32 bits wide.
    source port #, dest port #
    sequence number
    acknowledgement number
    head len, not used, flags U A P R S F, Receive window
    checksum, Urg data pointer
    Options (variable length)
    application data (variable length)

Annotations:
    sequence and acknowledgement numbers: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection establishment (setup, teardown commands)
    Receive window: # bytes the receiver is willing to accept
    checksum: Internet checksum (as in UDP)
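To make the layout concrete, the sketch below decodes the fixed 20-byte part of a TCP header from a byte array. The field offsets and flag bit positions follow the standard TCP header; the class and method names are illustrative assumptions, and options and checksum verification are deliberately left out.

```java
/** Decodes the fixed 20-byte TCP header (sketch; options and checksum validation omitted). */
final class TcpHeader {
    final int srcPort, dstPort, headerLenBytes, flags, rcvWindow, checksum, urgPointer;
    final long seq, ack;

    TcpHeader(byte[] b) {
        srcPort = u16(b, 0);
        dstPort = u16(b, 2);
        seq = u32(b, 4);                              // counts bytes, not segments
        ack = u32(b, 8);
        headerLenBytes = ((b[12] & 0xFF) >>> 4) * 4;  // head len is given in 32-bit words
        flags = b[13] & 0x3F;                         // U A P R S F, from high bit to low bit
        rcvWindow = u16(b, 14);                       // bytes the receiver is willing to accept
        checksum = u16(b, 16);
        urgPointer = u16(b, 18);
    }

    boolean fin()     { return (flags & 0x01) != 0; }
    boolean syn()     { return (flags & 0x02) != 0; }
    boolean rst()     { return (flags & 0x04) != 0; }
    boolean psh()     { return (flags & 0x08) != 0; }
    boolean ackFlag() { return (flags & 0x10) != 0; }
    boolean urg()     { return (flags & 0x20) != 0; }

    /** Reads an unsigned 16-bit big-endian value. */
    static int u16(byte[] b, int i) {
        return ((b[i] & 0xFF) << 8) | (b[i + 1] & 0xFF);
    }

    /** Reads an unsigned 32-bit big-endian value into a long. */
    static long u32(byte[] b, int i) {
        return ((long) u16(b, i) << 16) | u16(b, i + 2);
    }
}
```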


TCP Flow control: how it works

(Suppose the TCP receiver discards out-of-order segments.)

Spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

The receiver advertises the spare room by including the value of RcvWindow in segments.
The sender limits unACKed data to RcvWindow, which guarantees that the receive buffer doesn't overflow.
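The window arithmetic can be sketched as two small functions. The receiver-side formula is the one on the slide; the sender-side names `lastByteSent` and `lastByteAcked` are assumptions introduced here to express "unACKed data".

```java
/** Receive-window arithmetic for TCP flow control (sketch). */
final class FlowControl {
    /** Receiver side: spare room in the buffer, advertised as RcvWindow. */
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Sender side: how many more bytes may be sent while keeping unACKed data within RcvWindow. */
    static long sendable(long rcvWindow, long lastByteSent, long lastByteAcked) {
        long unAcked = lastByteSent - lastByteAcked;
        return Math.max(0, rcvWindow - unAcked);
    }
}
```

For example, with a 4096-byte buffer, 3000 bytes received and 1000 bytes read by the app, the receiver advertises 2096 bytes; a sender with 1000 unACKed bytes may then send at most 1096 more.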


Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer.
The sender does not transmit new packets until it hears RcvWindow > 0.
The receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: the TCP specs require the sender to continue sending segments with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
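One way to picture the rule is as a send-size decision. The sketch below is a simplification under stated assumptions: the function name and parameters are hypothetical, and real implementations pace these one-byte probes with a timer rather than sending them back to back.

```java
/** Decides how many bytes to send next, including the one-byte probe when RcvWindow = 0 (sketch). */
final class ZeroWindowProbe {
    static long bytesToSend(long rcvWindow, long unAckedBytes, long pendingBytes) {
        if (pendingBytes == 0) {
            return 0;  // nothing to send
        }
        if (rcvWindow == 0) {
            return 1;  // one data byte, just to elicit an ACK carrying a fresh RcvWindow
        }
        long room = Math.max(0, rcvWindow - unAckedBytes);
        return Math.min(pendingBytes, room);
    }
}
```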


                                                                                                                                  Note on UDP

                                                                                                                                  UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.
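The contrast with TCP can be illustrated with a toy receive buffer that simply drops what does not fit. This is a byte-counting model with hypothetical names, not how an operating system actually stores datagrams.

```java
/** Toy model of a UDP receive buffer: no flow control, arrivals that do not fit are dropped. */
final class UdpReceiveBuffer {
    private final int capacityBytes;
    private int usedBytes = 0;
    int dropped = 0;  // number of datagrams lost to a full buffer

    UdpReceiveBuffer(int capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Called when a datagram arrives; returns false if it was lost. */
    boolean deliver(int datagramBytes) {
        if (usedBytes + datagramBytes > capacityBytes) {
            dropped++;  // the sender is never told to slow down
            return false;
        }
        usedBytes += datagramBytes;
        return true;
    }

    /** Called when the application reads data out of the buffer. */
    void appRead(int bytes) {
        usedBytes = Math.max(0, usedBytes - bytes);
    }
}
```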


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
Initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    replies with client_isn+1 in ACK field to signal synchronization
    specifies server_isn
    no application data


TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1
    allocates buffers
    can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

Figure: three-way handshake
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
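The sequence and acknowledgement numbers in the three segments can be traced with a small model. The `Seg` type and method names are hypothetical, only the four fields used by the handshake are modeled, and sequence-number wraparound is ignored.

```java
/** Builds the three handshake segments and their seq/ack numbers (sketch). */
final class Handshake {
    /** Minimal segment view: only the fields the handshake uses. */
    static final class Seg {
        final boolean syn, ackFlag;
        final long seq, ack;

        Seg(boolean syn, boolean ackFlag, long seq, long ack) {
            this.syn = syn;
            this.ackFlag = ackFlag;
            this.seq = seq;
            this.ack = ack;
        }
    }

    /** Step 1: client sends SYN carrying client_isn. */
    static Seg syn(long clientIsn) {
        return new Seg(true, false, clientIsn, 0);
    }

    /** Step 2: server replies with SYNACK: its own ISN, and ack = client_isn + 1. */
    static Seg synAck(Seg syn, long serverIsn) {
        return new Seg(true, true, serverIsn, syn.seq + 1);
    }

    /** Step 3: client replies with SYN=0, seq = client_isn + 1, ack = server_isn + 1. */
    static Seg ack(Seg synAck) {
        return new Seg(false, true, synAck.ack, synAck.seq + 1);
    }
}
```

With client_isn = 42 and server_isn = 79, the SYNACK carries ack = 43 and the final ACK carries seq = 43, ack = 80.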


TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server.

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

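Figure: closing a connection. The client sends FIN (close); the server replies with ACK, then sends its own FIN (close); the client replies with ACK and enters timed wait before the connection is closed.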


TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
    Enters "timed wait", during which it will respond with an ACK to any received FIN (which might arrive again if the ACK gets lost).
    Closes down after the timed wait.

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

Figure: FIN/ACK exchange between client and server. Both sides pass through closing; the client waits in timed wait; both end in closed.


TCP Connection Management (cont.)

Figure: example TCP server lifecycle

Figure: example TCP client lifecycle
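Since the lifecycle diagrams did not survive extraction, the client side can be sketched as a small state machine built from the handshake and close steps described on the preceding slides. The state names are the standard TCP ones; the event names and the class itself are illustrative assumptions, and error paths such as RST or retransmission are left out.

```java
/** Client-side TCP lifecycle as a state machine (sketch; error paths omitted). */
final class ClientLifecycle {
    enum State { CLOSED, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT }
    enum Event { APP_OPEN, RCV_SYNACK, APP_CLOSE, RCV_ACK, RCV_FIN, TIMEOUT }

    /** Returns the next state; events that do not apply leave the state unchanged. */
    static State next(State s, Event e) {
        switch (s) {
            case CLOSED:      return e == Event.APP_OPEN   ? State.SYN_SENT    : s; // send SYN
            case SYN_SENT:    return e == Event.RCV_SYNACK ? State.ESTABLISHED : s; // send ACK
            case ESTABLISHED: return e == Event.APP_CLOSE  ? State.FIN_WAIT_1  : s; // send FIN
            case FIN_WAIT_1:  return e == Event.RCV_ACK    ? State.FIN_WAIT_2  : s; // wait for server FIN
            case FIN_WAIT_2:  return e == Event.RCV_FIN    ? State.TIME_WAIT   : s; // send ACK
            case TIME_WAIT:   return e == Event.TIMEOUT    ? State.CLOSED      : s; // timed wait expires
            default:          return s;
        }
    }
}
```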


A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.


Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for the network to handle"
    different from flow control!
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


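(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in the buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packets makes $\lambda'_{in}$ larger (than the perfect case) for the same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ is the rate of original data plus retransmissions, and $\lambda_{out}$ is the delivered rate.

"costs" of congestion:
(b) and (c): more work (retransmissions) for a given "goodput"
(c): unneeded retransmissions: the link carries multiple copies of a packet

Figure: three panels, one for each of cases (a), (b) and (c)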


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted!


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
    "elastic service"
    if sender's path "underloaded": sender should use available bandwidth
    if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: congestion indicator (severe congestion)
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by a congested switch
    signals congestion
    if the data cell preceding an RM cell has EFCI=1, the destination sets CI bit=1 before returning the RM cell to the source
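The two feedback rules can be sketched as plain functions. This models only the logic stated on the slide; the names are hypothetical, rates are in arbitrary units, and the actual ATM cell encoding is not modeled.

```java
/** ABR feedback as seen by the sender once an RM cell returns (sketch). */
final class AbrFeedback {
    /** Each switch may lower ER, so the returned value is the minimum supportable rate on the path. */
    static double explicitRate(double initialEr, double[] supportableRates) {
        double er = initialEr;
        for (double r : supportableRates) {
            er = Math.min(er, r);
        }
        return er;
    }

    /** The destination sets CI=1 if the data cell preceding the RM cell carried EFCI=1. */
    static boolean ciOnReturn(boolean ciSetBySwitches, boolean precedingDataCellEfci) {
        return ciSetBySwitches || precedingDataCellEfci;
    }
}
```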

                                                                                                                                  Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: a window of Congwin segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
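The sender-side limit stated above (LastByteSent - LastByteAcked ≤ CongWin) can be sketched as a single check before each transmission. The function name `can_send` and its `nbytes` parameter are hypothetical. The sketch assumes, as the slide does, that the receive buffer never constrains the sender.

```python
def can_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """Return True if sending nbytes more keeps unacked data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked  # bytes sent but not yet ACKed
    return in_flight + nbytes <= cong_win


# 2000 bytes in flight, window of 4000 bytes: one more 1000-byte segment fits.
print(can_send(3000, 1000, cong_win=4000, nbytes=1000))
# Same state with a 2500-byte window: the segment has to wait.
print(can_send(3000, 1000, cong_win=2500, nbytes=1000))
```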

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window versus time for a long-lived TCP connection; sawtooth pattern, vertical axis marked at 8, 16, and 24 Kbytes]
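A minimal sketch of the sawtooth, working in units of MSS and one step per RTT. It assumes, purely for illustration, that a loss event occurs whenever the window reaches a fixed size `loss_at_mss`; real loss events depend on the network and are not this regular. The function name `aimd_trace` is hypothetical.

```python
def aimd_trace(start_mss, loss_at_mss, rtts):
    """Return the window size (in MSS) seen at the start of each RTT."""
    cong_win = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            # multiplicative decrease: halve the window, never below 1 MSS
            cong_win = max(1, cong_win // 2)
        else:
            # additive increase: one more MSS per RTT
            cong_win += 1
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rtts=12))
# [4, 5, 6, 7, 8, 4, 5, 6, 7, 8, 4, 5]
```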

                                                                                                                                  TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
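The 20 kbps figure in the example follows from sending one MSS per RTT. The short check below reproduces the arithmetic; the function name `initial_rate_bps` is hypothetical.

```python
def initial_rate_bps(mss_bytes, rtt_ms):
    """Rate in bits/sec when one MSS-sized segment is sent per RTT."""
    return mss_bytes * 8 * 1000 / rtt_ms


# Slide values: MSS = 500 bytes, RTT = 200 msec.
print(initial_rate_bps(500, 200))  # 20000.0 bits/sec, i.e. 20 kbps
```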

                                                                                                                                  TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments, one batch per RTT]
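To see why a per-ACK increment doubles the window each RTT, the sketch below counts in units of MSS. It assumes every segment sent in an RTT is acknowledged individually and that no loss occurs. The function name `slow_start_windows` is hypothetical.

```python
def slow_start_windows(rtts):
    """Return the window size (in MSS) used in each successive RTT."""
    cong_win = 1  # connection starts with CongWin = 1 MSS
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks_this_rtt = cong_win      # one ACK per segment sent this RTT
        for _ in range(acks_this_rtt):
            cong_win += 1             # +1 MSS for every ACK received
    return windows


print(slow_start_windows(4))  # [1, 2, 4, 8]
```

The first three values match the one, two, and four segments in the Host A / Host B timeline.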

So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol TCP Tahoe treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
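The Tahoe/Reno difference can be shown as one small function. This is a sketch in units of MSS. The function name `react_to_loss`, the string labels, and the floor of 1 MSS on the threshold are assumptions made for illustration.

```python
def react_to_loss(variant, event, cong_win_mss):
    """Return (threshold, cong_win) in MSS after a loss event.

    variant: "tahoe" or "reno"; event: "3dupack" or "timeout".
    """
    threshold = max(cong_win_mss // 2, 1)  # floor of 1 MSS is an assumption
    if variant == "reno" and event == "3dupack":
        return threshold, threshold  # fast recovery: continue from threshold
    return threshold, 1              # Tahoe always, Reno on timeout: slow start


print(react_to_loss("reno", "3dupack", 16))   # (8, 8)
print(react_to_loss("reno", "timeout", 16))   # (8, 1)
print(react_to_loss("tahoe", "3dupack", 16))  # (8, 1)
```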

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                  The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
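The table rows can be read as event handlers on a small state machine. The sketch below is one possible rendering and not a real TCP implementation. The class name `RenoSender`, the method names, integer byte arithmetic, and the default values are all assumptions made for illustration.

```python
class RenoSender:
    """Toy sender tracking CongWin, Threshold, and state (bytes, integers)."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.cong_win = mss          # start with one MSS
        self.threshold = threshold
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # CongWin + MSS * (MSS / CongWin), rounded down to whole bytes
            self.cong_win += self.mss * self.mss // self.cong_win

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win // 2
            self.cong_win = max(self.threshold, self.mss)  # not below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win // 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()              # slow start: 2000, 3000, 4000, 5000
print(sender.state, sender.cong_win)
```

After the fourth ACK the window exceeds the threshold, so the sender switches to congestion avoidance and each further ACK adds only a fraction of an MSS.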

                                                                                                                                  TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
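The 0.75 factor is the mean of a window that climbs linearly from W/2 back to W. A quick numeric check, assuming an idealized sawtooth in which the window grows by one segment per RTT between losses (the function name `average_sawtooth_window` is hypothetical):

```python
def average_sawtooth_window(w):
    """Mean window over one sawtooth cycle, W/2 up to W in steps of 1."""
    windows = list(range(w // 2, w + 1))
    return sum(windows) / len(windows)


w = 100
print(average_sawtooth_window(w) / w)  # 0.75
```

Dividing the mean window by RTT gives the average throughput of 0.75 W/RTT.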

                                                                                                                                  TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

which gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed
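Both numbers in this example can be reproduced from the slide's inputs. The sketch below uses the window relation throughput = W × MSS / RTT and the loss-rate relation throughput = 1.22 × MSS / (RTT × √L). The function names are hypothetical.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain the given throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Loss rate L solving throughput = 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


# Slide values: 1500-byte segments, 100 ms RTT, 10 Gbps target.
print(int(required_window(10e9, 0.1, 1500)))       # about 83,333 segments
print(f"{required_loss_rate(10e9, 0.1, 1500):.1e}")  # on the order of 2e-10
```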

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1 as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis running up to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
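The convergence argument can be checked with a toy simulation. It assumes both sessions see a loss in the same round whenever their combined rate exceeds the link capacity, and that both have the same RTT. These are simplifications, and the function name `share_bottleneck` is hypothetical.

```python
def share_bottleneck(x1, x2, capacity, rounds):
    """Run AIMD for two flows sharing one link; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # loss: both flows halve, so the gap between them halves too
            x1, x2 = x1 / 2, x2 / 2
        else:
            # additive increase: both grow by 1, so the gap is unchanged
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


x1, x2 = share_bottleneck(10.0, 70.0, capacity=100.0, rounds=500)
print(abs(x1 - x2))  # far smaller than the initial gap of 60
```

Each loss halves the difference between the two rates while additive increase leaves it unchanged, which is why the trajectory drifts toward the equal-share line.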

Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
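The parallel-connection example is simple arithmetic once each connection is assumed to receive an equal share of R. The sketch below computes the new application's fraction of the link exactly; the function name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```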

TCP Latency Modeling

Notation, assumptions
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object

                                                                                                                                  Fixed congestion window (1)

First case: WS/R > RTT + S/R
ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

                                                                                                                                  Fixed congestion window (2)

Second case: WS/R < RTT + S/R
wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
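The two fixed-window cases can be combined into one function. This is a sketch under the slide's assumptions (single link of rate R, no loss). It additionally assumes K = ceil(O / (W·S)), the number of windows covering the object. The function name `fixed_window_latency` and the sample values are hypothetical.

```python
import math


def fixed_window_latency(obj_bits, mss_bits, rate_bps, window, rtt_s):
    """Latency for a fixed congestion window of `window` segments."""
    seg_time = mss_bits / rate_bps              # S/R
    base = 2 * rtt_s + obj_bits / rate_bps      # 2RTT + O/R
    if window * seg_time >= rtt_s + seg_time:
        return base                             # first case: server never stalls
    k = math.ceil(obj_bits / (window * mss_bits))   # windows covering the object
    stall = seg_time + rtt_s - window * seg_time    # idle time after each window
    return base + (k - 1) * stall


# Hypothetical values: O = 8000 bits, S = 1000 bits, R = 1000 bps, RTT = 2 s.
print(fixed_window_latency(8000, 1000, 1000, window=4, rtt_s=2))  # first case
print(fixed_window_latency(8000, 1000, 1000, window=2, rtt_s=2))  # second case
```

With W = 4 the ACK returns before the window is exhausted, so the latency is just 2RTT + O/R. With W = 2 the server stalls after each of the first K-1 windows.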

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram, time at client versus time at server: initiate TCP connection (one RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
  O/S = 15 segments
  K = 4 windows
  Q = 2
  P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
  2 RTT for connection establishment and request
  O/R to transmit object
  time server idles due to slow start

Server idles: P = min{K-1, Q} times
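The example can be reproduced numerically. The sketch below finds K by accumulating slow-start windows (1, 2, 4, ... segments) until the object is covered. It finds Q by counting the windows after which the server would still have to wait for an ACK. It then applies the latency formula with P = min{Q, K-1}. The function name `slow_start_latency` and the values of S, R, and RTT are hypothetical, chosen so that O/S = 15 and Q = 2 as in the example.

```python
import math


def slow_start_latency(obj_bits, mss_bits, rate_bps, rtt_s):
    """Return (K, Q, P, latency) for slow start with no threshold and no loss."""
    seg_time = mss_bits / rate_bps                  # S/R
    segments = math.ceil(obj_bits / mss_bits)       # O/S

    # K: number of windows of size 1, 2, 4, ... needed to cover the object
    k, covered = 0, 0
    while covered < segments:
        covered += 2 ** k
        k += 1

    # Q: windows after which the server idles, were the object infinite,
    # i.e. windows whose stall time S/R + RTT - 2^(q-1) S/R is positive
    q = 0
    while seg_time + rtt_s - (2 ** q) * seg_time > 0:
        q += 1

    p = min(q, k - 1)
    latency = (2 * rtt_s + obj_bits / rate_bps
               + p * (rtt_s + seg_time) - (2 ** p - 1) * seg_time)
    return k, q, p, latency


# Assumed values: S = 1000 bits, R = 1000 bps (S/R = 1 s), RTT = 2 s, O = 15 S.
print(slow_start_latency(15000, 1000, 1000, 2))  # (4, 2, 2, 22.0)
```

Here the server idles for 2 s after the first window and 1 s after the second, which accounts for the 3 s added to 2RTT + O/R = 19 s.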

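TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}
$$

[Figure: timing diagram with "time at client" and "time at server" axes. Events: initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]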
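To make the closed-form delay concrete, the following Python sketch evaluates the slow-start latency two ways: by summing the positive idle times window by window, and by the closed form with P. The function names and the sample values (S/R = 1 s, RTT = 2 s) are illustrative assumptions, not taken from the slides; only O/S = 15 comes from the example.

```python
def count_windows(O, S):
    """K: smallest k such that windows of 1, 2, 4, ... segments cover O bits."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def delay_by_summation(O, S, R, RTT):
    """Add up the positive idle times after each of the first K-1 windows."""
    K = count_windows(O, S)
    idle_total, P = 0.0, 0
    for k in range(1, K):  # the server never idles after the last window
        idle = S / R + RTT - 2 ** (k - 1) * S / R
        if idle > 0:
            idle_total += idle
            P += 1  # P counts the windows followed by an idle period
    return O / R + 2 * RTT + idle_total, K, P


def delay_closed_form(O, S, R, RTT, P):
    """delay = O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R"""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    # Assumed sample values: O = 15 segments of 1000 bits, R = 1000 bits/s, RTT = 2 s.
    O, S, R, RTT = 15_000, 1_000, 1_000, 2.0
    delay, K, P = delay_by_summation(O, S, R, RTT)
    print(K, P, delay, delay_closed_form(O, S, R, RTT, P))
```

With these assumed values the sketch gives K = 4 and P = 2, matching the example, and both methods agree on a delay of 22 seconds.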


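TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
$$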
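As a quick check of the last step, this Python sketch compares the closed form for K with the definition (the smallest k whose cumulative windows cover the object). The function names are hypothetical, and the closed form relies on floating-point `math.log2`, so for very large O/S the integer loop is the safer reference.

```python
import math


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)); uses floating point, so treat as approximate."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_definition(O, S):
    """K = min{k : (2^k - 1) * S >= O}, computed with exact integer arithmetic."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


if __name__ == "__main__":
    S = 1_000  # assumed segment size in bits
    for segments in (1, 2, 3, 7, 8, 15, 16):
        O = segments * S
        print(segments, windows_closed_form(O, S), windows_by_definition(O, S))
```

For O/S = 15 both functions return 4, the K used in the example.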

Calculation of Q, the number of idles for an infinite-size object, is similar.
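A sketch of that calculation, following the same pattern as the derivation of K. It assumes Q is defined as the largest k for which the idle-time expression after the kth window is still non-negative; this is a reconstruction from the idle-time formula, not a transcription of a slide.

```latex
\begin{aligned}
Q &= \max\left\{k : RTT + \frac{S}{R} - 2^{k-1}\frac{S}{R} \ge 0\right\} \\
  &= \max\left\{k : 2^{k-1} \le 1 + \frac{RTT}{S/R}\right\} \\
  &= \max\left\{k : k \le \log_2\!\left(1 + \frac{RTT}{S/R}\right) + 1\right\} \\
  &= \left\lfloor \log_2\!\left(1 + \frac{RTT}{S/R}\right) \right\rfloor + 1
\end{aligned}
```

For instance, any RTT with 1 ≤ RTT/(S/R) < 3 gives Q = 2, the value used in the example.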


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
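The three response-time formulas can be compared side by side with a short Python sketch. The function name and dictionary keys are hypothetical, and the slow-start idle times are deliberately left out, so the results are lower bounds on the modeled response time rather than the full model.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times (seconds) for the three HTTP models, excluding idle times.

    O: object size in bits, R: link rate in bits/s, RTT: seconds,
    M: number of images, X: number of parallel connections (M/X must be an integer).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push the base page plus M images
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT,
    }


if __name__ == "__main__":
    # Assumed sample values: O = 5 Kbytes taken as 40,000 bits, R = 100 Kbps,
    # RTT = 100 msec, M = 10 images, X = 5 parallel connections.
    for name, seconds in http_response_times(40_000, 100_000, 0.1, 10, 5).items():
        print(f"{name}: {seconds:.2f} s")
```

With these assumed values the transmission term is 4.4 s in every case, and the RTT terms bring the totals to about 6.6 s (non-persistent), 4.7 s (persistent) and 5.0 s (parallel non-persistent), before any idle time is added.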


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                                  • Chapter 3 Transport Layer last revised 160305
                                                                                                                                  • Chapter 3 outline
                                                                                                                                  • Transport services and protocols
                                                                                                                                  • Transport vs network layer
                                                                                                                                  • Transport-layer protocols
                                                                                                                                  • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                  • How demultiplexing works
                                                                                                                                  • Connectionless demultiplexing
                                                                                                                                  • Connectionless demux (cont)
                                                                                                                                  • Connection-oriented demux
                                                                                                                                  • Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
                                                                                                                                  • UDP checksum
                                                                                                                                  • Chapter 3 outline
                                                                                                                                  • Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
                                                                                                                                  • Go-Back-N
                                                                                                                                  • GBN Sender
                                                                                                                                  • GBN sender extended FSM
                                                                                                                                  • GBN receiver extended FSM
                                                                                                                                  • More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
                                                                                                                                  • Selective repeat
                                                                                                                                  • Selective repeat in action
                                                                                                                                  • Selective repeat dilemma
                                                                                                                                  • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                  • Example RTT estimation
                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                  • Chapter 3 outline
                                                                                                                                  • TCP reliable data transfer
                                                                                                                                  • TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
                                                                                                                                  • Chapter 3 outline
                                                                                                                                  • TCP Flow Control
                                                                                                                                  • TCP Flow Control
                                                                                                                                  • TCP segment structure
• TCP Flow control: how it works
                                                                                                                                  • Technical Issue
                                                                                                                                  • Chapter 3 outline
                                                                                                                                  • TCP Connection Management
                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                  • A few special cases
                                                                                                                                  • Chapter 3 outline
                                                                                                                                  • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
                                                                                                                                  • Chapter 3 outline
                                                                                                                                  • TCP Congestion Control
                                                                                                                                  • TCP AIMD
                                                                                                                                  • TCP Slow Start
                                                                                                                                  • TCP Slow Start (more)
                                                                                                                                   • Summary: TCP Congestion Control
                                                                                                                                  • The Big Picture
                                                                                                                                  • TCP sender congestion control
                                                                                                                                  • TCP throughput
                                                                                                                                  • TCP Futures
                                                                                                                                  • TCP Fairness
                                                                                                                                   • Why is TCP fair?
                                                                                                                                  • Fairness (more)
                                                                                                                                  • TCP Latency Modeling
                                                                                                                                  • Fixed Congestion Window (W)
                                                                                                                                  • Fixed congestion window (1)
                                                                                                                                  • Fixed congestion window (2)
                                                                                                                                   • TCP Latency Modeling: Slow Start (1)
                                                                                                                                   • TCP Latency Modeling: Slow Start (2)
                                                                                                                                  • TCP Latency Modeling (3)
                                                                                                                                  • TCP Latency Modeling (4)
                                                                                                                                  • HTTP Modeling
                                                                                                                                  • Chapter 3 Summary


TCP Round Trip Time and Timeout

Setting the timeout: EstimatedRTT plus a "safety margin"
  • large variation in EstimatedRTT -> larger safety margin
  • first, estimate how much SampleRTT deviates from EstimatedRTT:

      DevRTT = (1 - β) · DevRTT + β · |SampleRTT - EstimatedRTT|

      (typically, β = 0.25)

Then set the timeout interval:

      TimeoutInterval = EstimatedRTT + 4 · DevRTT
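
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a real TCP implementation: the class name `RttEstimator` is made up, the EstimatedRTT update and its weight α = 0.125 are assumed from the usual EWMA rule (they are not stated on this slide), and DevRTT is updated before EstimatedRTT so that the deviation is measured against the previous estimate.

```python
from dataclasses import dataclass


@dataclass
class RttEstimator:
    """EWMA-based timeout estimator; all times in seconds (hypothetical helper)."""
    estimated_rtt: float      # EstimatedRTT
    dev_rtt: float = 0.0      # DevRTT
    alpha: float = 0.125      # assumed weight for EstimatedRTT (not given on this slide)
    beta: float = 0.25        # weight for DevRTT, as on the slide

    def update(self, sample_rtt: float) -> float:
        """Fold one SampleRTT into the estimates and return TimeoutInterval."""
        # Deviation is measured against the estimate held before this sample.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(estimated_rtt=0.100)
for sample in (0.100, 0.120, 0.300, 0.110):
    timeout = est.update(sample)
    print(f"sample={sample:.3f}s est={est.estimated_rtt:.4f}s "
          f"dev={est.dev_rtt:.4f}s timeout={timeout:.4f}s")
```

A single outlier sample (0.300 s here) raises DevRTT sharply, so the safety margin widens much more than the estimate itself.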


TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses a single retransmission timer
• Retransmissions are triggered by:
    • timeout events
    • duplicate acks
• Initially consider a simplified TCP sender:
    • ignore duplicate acks
    • ignore flow control, congestion control


TCP sender events

data rcvd from app:
  • Create segment with seq #
  • seq # is byte-stream number of first data byte in segment
  • start timer if not already running (think of timer as for oldest unacked segment)
  • expiration interval: TimeOutInterval

timeout:
  • retransmit segment that caused timeout
  • restart timer

Ack rcvd:
  • If it acknowledges previously unacked segments:
      • update what is known to be acked
      • start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }

} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
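
The pseudocode above can be turned into a small runnable sketch. The class and attribute names are illustrative assumptions, the timer is modelled as a boolean rather than a real clock, "passing to IP" just appends to a log, and sequence-number wraparound is ignored.

```python
class SimplifiedTcpSender:
    """Simplified sender: single timer, cumulative ACKs, no duplicate-ACK handling."""

    def __init__(self, initial_seq_num: int):
        self.next_seq_num = initial_seq_num   # NextSeqNum
        self.send_base = initial_seq_num      # SendBase
        self.unacked = {}                     # seq -> data, in sending order
        self.timer_running = False
        self.sent = []                        # log of (seq, length) passed to IP

    def _pass_to_ip(self, seq: int, data: bytes) -> None:
        self.sent.append((seq, len(data)))

    def on_app_data(self, data: bytes) -> None:
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True         # start timer
        self._pass_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self) -> None:
        """Event: timer timeout. Retransmit the oldest unacked segment."""
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)
        self._pass_to_ip(seq, self.unacked[seq])
        self.timer_running = True             # restart timer

    def on_ack(self, y: int) -> None:
        """Event: ACK received with ACK field value y."""
        if y > self.send_base:
            self.send_base = y
            # Drop every segment whose last byte is below y.
            self.unacked = {s: d for s, d in self.unacked.items() if s + len(d) > y}
            self.timer_running = bool(self.unacked)


# Hypothetical run: two segments sent, a timeout, then one cumulative ACK.
sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"A" * 8)     # Seq=92, 8 bytes
sender.on_app_data(b"B" * 20)    # Seq=100, 20 bytes
sender.on_timeout()              # retransmits Seq=92 only
sender.on_ack(120)               # cumulative ACK covers both segments
print(sender.sent, sender.send_base, sender.timer_running)
```

Keeping a single timer for the oldest unacked segment is what makes the timeout handler retransmit only one segment rather than the whole window.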


TCP retransmission scenarios

[Figure: two Host A / Host B timelines.]
• Lost ACK scenario: A sends Seq=92, 8 bytes data; B's ACK=100 is lost (X); the Seq=92 timer expires; A retransmits Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.
• Premature timeout: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B replies ACK=100 and ACK=120; the Seq=92 timer expires before the ACKs arrive; A retransmits Seq=92, 8 bytes data; SendBase = 100, then SendBase = 120; B replies ACK=120; SendBase = 120.


TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline.]
• Cumulative ACK scenario: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B's ACK=100 is lost (X), but ACK=120 arrives before the timeout; SendBase = 120.


TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP Receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at Receiver: Arrival of out-of-order segment, higher-than-expected seq #. Gap detected.
TCP Receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at Receiver: Arrival of segment that partially or completely fills gap.
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
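
The four event/action pairs above can be sketched as a small decision function. This is an illustrative model only: the class name is made up, segments are assumed to be aligned (no partial overlaps), and the handling of an old, already-ACKed segment is an assumption since the table does not cover it.

```python
from dataclasses import dataclass, field


@dataclass
class AckPolicyReceiver:
    """Receiver-side ACK decisions for the four events in the table above."""
    expected_seq: int                               # next in-order byte expected
    ack_pending: bool = False                       # True while a delayed ACK is waiting
    buffered: dict = field(default_factory=dict)    # out-of-order segments: seq -> length

    def on_segment(self, seq: int, length: int) -> str:
        if seq > self.expected_seq:
            # Out-of-order segment, gap detected: buffer it, re-ACK the expected byte.
            self.buffered[seq] = length
            self.ack_pending = False
            return f"immediate duplicate ACK={self.expected_seq}"
        if seq < self.expected_seq:
            # Assumption (case not in the table): old data is simply re-ACKed.
            return f"immediate duplicate ACK={self.expected_seq}"

        # Segment starts exactly at the expected byte (the lower end of any gap).
        had_gap = bool(self.buffered)
        self.expected_seq += length
        while self.expected_seq in self.buffered:   # absorb now-contiguous buffered data
            self.expected_seq += self.buffered.pop(self.expected_seq)
        if had_gap:
            self.ack_pending = False
            return f"immediate ACK={self.expected_seq}"
        if self.ack_pending:
            self.ack_pending = False
            return f"immediate cumulative ACK={self.expected_seq}"
        self.ack_pending = True
        return "delayed ACK (wait up to 500 ms)"

    def on_delayed_ack_timer(self) -> str:
        """No further segment arrived within 500 ms: send the pending ACK."""
        self.ack_pending = False
        return f"ACK={self.expected_seq}"


rcvr = AckPolicyReceiver(expected_seq=100)
for seq, length in [(100, 20), (120, 20), (160, 20), (140, 20)]:
    print(f"Seq={seq}: {rcvr.on_segment(seq, length)}")
```

The four segments in the usage loop exercise the four table rows in order: delayed ACK, cumulative ACK, duplicate ACK on a gap, and an immediate ACK once the gap is filled.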


More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of Congestion Control
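
A minimal sketch of the doubling rule, assuming the RTT estimates are already available. `BackoffTimer` is a hypothetical name, and no upper bound is applied to the interval because the slide does not mention one.

```python
class BackoffTimer:
    """Timeout interval that doubles on each consecutive timeout (illustrative)."""

    def __init__(self, estimated_rtt: float, dev_rtt: float):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self.base_interval()

    def base_interval(self) -> float:
        # TimeoutInterval = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self) -> float:
        """After a retransmission caused by a timeout, double the interval."""
        self.interval *= 2
        return self.interval

    def on_new_data_or_ack(self) -> float:
        """Timer restarted by new data or an ACK: reset from the RTT estimates."""
        self.interval = self.base_interval()
        return self.interval


timer = BackoffTimer(estimated_rtt=0.100, dev_rtt=0.025)
print([round(timer.on_timeout(), 3) for _ in range(3)])   # three consecutive timeouts
print(round(timer.on_new_data_or_ack(), 3))               # an ACK arrives, interval resets
```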


Fast Retransmit

• Time-out period often relatively long:
    • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
    • Sender often sends many segments back-to-back
    • If a segment is lost, there will likely be many duplicate ACKs
• If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    • fast retransmit: resend segment before timer expires


Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
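
The ACK handler above, sketched in Python. The class name and the `retransmitted` log are illustrative assumptions. One simplification is made: a single duplicate-ACK counter is kept and only ACKs with y equal to SendBase are counted, so stale ACKs below SendBase are ignored.

```python
class FastRetransmitSender:
    """ACK handling with fast retransmit on the third duplicate ACK (illustrative)."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base: int, unacked: dict):
        self.send_base = send_base            # SendBase
        self.unacked = dict(unacked)          # seq -> length of segments in flight
        self.dup_acks = 0                     # duplicate ACKs seen for send_base
        self.timer_running = bool(self.unacked)
        self.retransmitted = []               # log of fast-retransmitted seq numbers

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            # New data acknowledged: advance SendBase and reset the duplicate count.
            self.send_base = y
            self.dup_acks = 0
            self.unacked = {s: n for s, n in self.unacked.items() if s + n > y}
            self.timer_running = bool(self.unacked)
        elif y == self.send_base:
            # A duplicate ACK for already ACKed data.
            self.dup_acks += 1
            if self.dup_acks == self.DUP_ACK_THRESHOLD and y in self.unacked:
                self.retransmitted.append(y)  # fast retransmit, before the timer expires


# Hypothetical run: the segment with Seq=100 is lost, later segments trigger dup ACKs.
sender = FastRetransmitSender(send_base=100, unacked={100: 20, 120: 20, 140: 20, 160: 20})
for _ in range(3):
    sender.on_ack(100)
print(sender.retransmitted)       # the lost segment is resent after the third dup ACK
sender.on_ack(180)                # receiver had buffered the rest, so one ACK covers all
print(sender.send_base, sender.unacked)
```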


TCP: GBN or Selective Repeat?

• Basic TCP looks a lot like GBN
• Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
• This looks a lot like Selective Repeat
• TCP is a hybrid


TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down its transmission rate to accommodate receiver's rate
• Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

• flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• receive side of TCP connection has a receive buffer
• speed-matching service: matching the send rate to the receiving app's drain rate
• app process may be slow at reading from buffer


TCP segment structure

[Figure: TCP segment layout, 32 bits wide, one row per line]
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
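
As a rough illustration of the layout above, the fixed 20-byte part of the header can be unpacked with Python's `struct` module. The function name and the sample values are hypothetical, the checksum is left at 0 rather than computed, and options (if present) are skipped rather than parsed.

```python
import struct

# Fixed 20-byte TCP header in network byte order: source port, dest port,
# sequence number, acknowledgement number, (head len + flags),
# receive window, checksum, urgent data pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_segment(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields and application data."""
    src, dst, seq, ack, len_flags, window, checksum, urg_ptr = TCP_HEADER.unpack_from(segment)
    header_len = (len_flags >> 12) * 4        # head len is counted in 32-bit words
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": {name for name, bit in FLAG_BITS.items() if len_flags & bit},
        "rcv_window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "data": segment[header_len:],         # options, if any, sit before the data
    }


# Hypothetical segment: Seq=92, 8 bytes of data, ACK flag set, no options.
header = TCP_HEADER.pack(5000, 80, 92, 100, (5 << 12) | FLAG_BITS["ACK"], 4096, 0, 0)
parsed = parse_tcp_segment(header + b"8 bytes!")
print(parsed["seq"], parsed["flags"], parsed["rcv_window"], parsed["data"])
```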


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments

Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
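As a rough illustration of the window formula above, the following Python sketch computes the advertised window and the sender-side check. The function names, the sender-side variables and the sample numbers are assumptions made for this example; they are not taken from the slides.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised in RcvWindow."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    return (last_byte_sent - last_byte_acked) + nbytes <= window


# Hypothetical numbers: 4096-byte buffer, 3000 bytes received so far,
# 1000 of them already read by the application.
window = rcv_window(4096, 3000, 1000)
print(window)  # 2096
```

With 1000 bytes already unACKed, this sender could send another 1000 bytes but not another 1200.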


Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed all packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
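The deadlock and its fix can be sketched with a toy simulation. It assumes, as a simplification, that the receiver reports its window only inside ACKs and sends an ACK only when a segment arrives. The function name, the tick-based timing and the buffer size are all hypothetical.

```python
def ticks_until_sender_unblocks(probe, app_reads_at_tick, max_ticks=10):
    """Return the tick at which the sender learns RcvWindow > 0, or None."""
    rcv_buffer = 1000
    buffered = 1000      # buffer is full, so RcvWindow == 0
    sender_view = 0      # last RcvWindow value the sender has heard
    for tick in range(1, max_ticks + 1):
        if tick == app_reads_at_tick:
            buffered = 0  # application drains the receive buffer
        if sender_view == 0 and probe:
            # One-byte probe: accepted only if there is room, but it is
            # always answered by an ACK carrying the current window.
            if buffered < rcv_buffer:
                buffered += 1
            sender_view = rcv_buffer - buffered
        if sender_view > 0:
            return tick
    return None


print(ticks_until_sender_unblocks(probe=True, app_reads_at_tick=3))   # 3
print(ticks_until_sender_unblocks(probe=False, app_reads_at_tick=3))  # None
```

Without probes the sender never hears about the freed buffer space, which is the deadlock described above.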


                                                                                                                                    Note on UDP

                                                                                                                                    UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.
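A minimal sketch of this behaviour, assuming a buffer whose capacity is counted in packets (real socket buffers are sized in bytes). The class name and its fields are hypothetical.

```python
from collections import deque


class UdpSocketBuffer:
    """Toy model of a receiving socket buffer with a fixed packet capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def deliver(self, packet):
        """Append the packet if there is room; otherwise count it as lost."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1  # no flow control: the packet is simply lost
            return False
        self.queue.append(packet)
        return True


buf = UdpSocketBuffer(capacity=3)
results = [buf.deliver(f"pkt{i}") for i in range(5)]
print(results, buf.dropped)  # [True, True, True, False, False] 2
```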


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
  initialize TCP variables: seq. #s, buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data


TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that it is synchronized

client: Connection request (SYN=1, seq=client_isn)
server: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
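The three segments of the handshake can be written out as a short Python sketch. The dictionary layout and the sample initial sequence numbers (100 and 300) are assumptions for illustration, and sequence-number wraparound is ignored.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments in the order they are sent."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    synack = {
        "from": "server",
        "SYN": 1,
        "seq": server_isn,
        "ack": syn["seq"] + 1,      # acknowledges the client's SYN
    }
    ack = {
        "from": "client",
        "SYN": 0,                   # connection established
        "seq": client_isn + 1,
        "ack": synack["seq"] + 1,   # acknowledges the server's SYN
    }
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```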


TCP Connection Management (cont)

Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait before closed.]


TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  enters "timed wait", during which it will respond with an ACK to received FINs (which might arrive if the ACK gets lost)
  closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, can handle simultaneous FINs

[Figure: client/server timeline of FIN, ACK, FIN, ACK. Both sides pass through closing, the client waits in timed wait, and both end in closed.]
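The client side of the closing steps can be sketched as a small state machine. The state names below are hypothetical labels chosen for this example, and the sketch does not handle simultaneous FINs or an ACK and FIN arriving in the same segment.

```python
class ClosingClient:
    """Client side of a client-initiated close, after its FIN has been sent."""

    def __init__(self):
        self.state = "fin_sent"

    def on_segment(self, segment):
        """Process a segment from the server; return the reply, if any."""
        if segment == "ACK" and self.state == "fin_sent":
            self.state = "fin_acked"
            return None
        if segment == "FIN" and self.state in ("fin_acked", "timed_wait"):
            # Also covers a retransmitted FIN when our ACK was lost.
            self.state = "timed_wait"
            return "ACK"
        return None

    def on_timed_wait_expired(self):
        if self.state == "timed_wait":
            self.state = "closed"


client = ClosingClient()
client.on_segment("ACK")
print(client.on_segment("FIN"))  # ACK
print(client.on_segment("FIN"))  # ACK (reply to a retransmitted FIN)
client.on_timed_wait_expired()
print(client.state)              # closed
```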


TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem


Causes/costs of congestion: scenario 1

  two senders, two receivers
  one router, infinite buffers
  no retransmission
  send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput


Causes/costs of congestion: scenario 2

  one router, finite buffers
  sender retransmission of lost packet


(a), (b) & (c): always λ_in = λ_out (goodput)
(a) "Magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'_in > λ_out
(c) retransmission of delayed (not lost) packet makes λ'_in larger (than perfect case) for same λ_out

Here λ_in is the rate of original data, λ'_in is the offered load including retransmissions, and λ_out is the delivered rate.

"costs" of congestion:
  (b) and (c): more work (retransmissions) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of packet

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

  four senders
  multihop paths
  timeout/retransmit

Q: what happens as λ_in and λ'_in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted.


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
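A simplified sketch of how the ER field and the CI bit of one RM cell might evolve along a path. It assumes every switch lowers ER to the rate it can support and that only the destination sets CI. The function name and rate units are hypothetical.

```python
def process_rm_cell(sender_er, switch_rates, efci_on_last_data_cell):
    """Return (er, ci) as seen by the sender when the RM cell comes back."""
    er = sender_er
    for supportable in switch_rates:
        er = min(er, supportable)  # a switch may lower ER, never raise it
    # Destination sets CI=1 if the preceding data cell carried EFCI=1.
    ci = 1 if efci_on_last_data_cell else 0
    return er, ci


print(process_rm_cell(100, [80, 40, 60], True))  # (40, 1)
```

The returned ER is the minimum supportable rate on the path, which is the send rate the slide describes.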


TCP Congestion Control

  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
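As a worked instance of the throughput formula, the sketch below evaluates it for a hypothetical window of 10 segments, an MSS of 500 bytes and an RTT of 200 msec. The function name and the choice to pass RTT in milliseconds are assumptions for this example.

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_ms):
    """w segments of mss_bytes each, sent once per RTT (RTT given in msec)."""
    return w * mss_bytes * 1000 / rtt_ms


print(throughput_bytes_per_sec(10, 500, 200))  # 25000.0
```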


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
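The sending limit and the loss-event rule can be expressed in a few lines of Python. The function names and the byte counts in the usage line are hypothetical.

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes may be sent without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


def is_loss_event(timeout_fired, duplicate_acks):
    """A loss event is a timeout or three duplicate ACKs."""
    return timeout_fired or duplicate_acks >= 3


print(bytes_allowed(9000, 5000, 6000))  # 2000
```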


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
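A minimal sketch of AIMD, tracking CongWin in units of MSS once per RTT. The function name, the use of integer halving and the choice of which rounds see a loss are assumptions for illustration.

```python
def aimd(cong_win_mss, loss_rounds, rounds):
    """Return CongWin (in MSS) at the start of each RTT round."""
    history = []
    for r in range(rounds):
        history.append(cong_win_mss)
        if r in loss_rounds:
            cong_win_mss = max(1, cong_win_mss // 2)  # multiplicative decrease
        else:
            cong_win_mss += 1                          # additive increase
    return history


print(aimd(8, {4}, 8))  # [8, 9, 10, 11, 12, 6, 7, 8]
```

The slow climb followed by a sudden halving is what produces the shape in the congestion-window plot.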


TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
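The 20 kbps figure in the example follows from sending one MSS per RTT. The short check below reproduces the arithmetic; the function name is hypothetical.

```python
def initial_rate_kbps(mss_bytes, rtt_ms):
    """One MSS per RTT, expressed in kilobits per second."""
    bits = mss_bytes * 8
    return bits / rtt_ms  # bits per millisecond equals kilobits per second


print(initial_rate_kbps(500, 200))  # 20.0
```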


                                                                                                                                    TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event

Double CongWin every RTT, done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast
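A minimal sketch of why a per-ACK increment doubles the window every RTT. It assumes no loss, that every segment is acknowledged individually (no delayed ACKs), and that all ACKs for a window arrive within one RTT. The function name `slow_start_rounds` is illustrative, and CongWin is counted in MSS units.

```python
def slow_start_rounds(rounds):
    """Return CongWin (in MSS units) at the start of each RTT round."""
    cong_win = 1                      # connection begins with CongWin = 1 MSS
    history = [cong_win]
    for _ in range(rounds):
        acks = cong_win               # one ACK per segment sent in this round
        for _ in range(acks):
            cong_win += 1             # increment CongWin by 1 MSS per ACK
        history.append(cong_win)
    return history


print(slow_start_rounds(4))  # [1, 2, 4, 8, 16]
```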

[Figure: timeline between Host A and Host B. Host A sends one segment in the first RTT, then two segments, then four segments.]


So far
Slow start ramps up exponentially
Followed by the AIMD sawtooth pattern

Reality (TCP Reno)
Introduce a new variable, threshold
threshold is initially very large
Slow-start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (the new, halved value), then continue in standard AIMD

• Timeout: set threshold = CongWin/2, set CongWin = 1 MSS, and switch to slow start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treats both types of loss events the same and always goes to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
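The four summary rules can be traced at RTT granularity to compare Reno and Tahoe. This is a rough sketch under stated assumptions, not a faithful TCP implementation: CongWin is in MSS units, loss events are injected by hand through the hypothetical `losses` argument, slow-start growth is capped at Threshold, and the starting Threshold of 8 is an arbitrary illustrative value.

```python
def trace(rounds, losses, variant="reno", threshold=8.0):
    """Per-RTT CongWin trace in MSS units.

    losses maps a round index to "triple_dup_ack" or "timeout".
    """
    cong_win = 1.0
    history = []
    for rnd in range(rounds):
        history.append(cong_win)
        event = losses.get(rnd)
        if event is not None:
            threshold = cong_win / 2            # both loss types halve Threshold
            if event == "triple_dup_ack" and variant == "reno":
                cong_win = threshold            # Reno: skip slow start
            else:
                cong_win = 1.0                  # timeout, or any loss in Tahoe
        elif cong_win < threshold:
            cong_win = min(cong_win * 2, threshold)   # slow start: exponential
        else:
            cong_win += 1.0                     # congestion avoidance: linear
    return history


loss = {5: "triple_dup_ack"}
print(trace(10, loss, "reno"))   # [1.0, 2.0, 4.0, 8.0, 9.0, 10.0, 5.0, 6.0, 7.0, 8.0]
print(trace(10, loss, "tahoe"))  # [1.0, 2.0, 4.0, 8.0, 9.0, 10.0, 1.0, 2.0, 4.0, 5.0]
```

After the same triple-duplicate-ACK event, the Reno trace resumes linear growth from half the old window, while the Tahoe trace restarts from 1 MSS.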


                                                                                                                                    The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS · (MSS/CongWin)
Commentary: additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
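The event table can be read as a small state machine. The sketch below is one possible rendering, not a real TCP stack: the class name `RenoSender`, its method names, and the MSS value of 1460 bytes are all illustrative assumptions, and retransmission itself is left out.

```python
MSS = 1460  # bytes; illustrative value, not taken from the slides


class RenoSender:
    """Tracks CongWin, Threshold and state as described in the event table."""

    def __init__(self, threshold=64 * 1024):
        self.cong_win = MSS
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * MSS / self.cong_win   # about 1 MSS per RTT

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)     # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve Threshold and re-enter slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4 * MSS)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win // MSS)  # CA 5
```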


                                                                                                                                    TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2 and throughput to W/(2·RTT)
Average throughput: 0.75 W/RTT
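The 0.75 factor is the midpoint of a window that climbs linearly from W/2 to W. A small numerical check, assuming the window grows by one segment per RTT and that W is even (the function name is hypothetical):

```python
def average_sawtooth_throughput(w_segments, rtt):
    """Average throughput (segments/second) over one AIMD sawtooth cycle."""
    # Window sizes seen in successive RTTs: W/2, W/2 + 1, ..., W.
    windows = list(range(w_segments // 2, w_segments + 1))
    return sum(windows) / len(windows) / rtt


W, RTT = 100, 0.1
print(average_sawtooth_throughput(W, RTT))
print(0.75 * W / RTT)
```

Both printed values should agree up to floating-point rounding.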


                                                                                                                                    TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

This requires L = 2·10^-10 (wow!)
New versions of TCP are needed for high speeds
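The two numbers in this example can be reproduced by inverting the throughput formula, throughput = 1.22·MSS/(RTT·√L). This is only a sketch of the arithmetic; the function names are hypothetical.

```python
def required_window(throughput_bps, rtt, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt * throughput_bps)) ** 2


# Values from the slide: 1500-byte segments, 100 ms RTT, 10 Gbps target.
print(int(required_window(10e9, 0.1, 1500)))         # 83333
print(f"{required_loss_rate(10e9, 0.1, 1500):.1e}")  # 2.1e-10
```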


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis marked at R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections

• new app asks for 1 TCP connection, gets rate R/10
• new app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections)
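The parallel-connection example is simple share arithmetic. The sketch below assumes every TCP connection on the link ends up with an equal share of R; the function name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 connections the new app holds 11 of 20 equal shares, slightly more than half the link.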


TCP Latency Modeling

Notation, assumptions
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data is sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data is sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
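The two cases can be combined into one function. This is a sketch under an assumption the slide does not state at this point: K is taken to be the number of windows that cover the object, computed here as ceil(O/(W·S)). The function name is illustrative.

```python
import math


def fixed_window_latency(O, S, R, W, rtt):
    """Latency to fetch an O-bit object with a fixed window of W segments."""
    base = 2 * rtt + O / R                   # connection setup, request, transmission
    stall = S / R + rtt - W * S / R          # idle time after each window
    if stall <= 0:
        return base                          # case 1: ACKs return in time
    K = math.ceil(O / (W * S))               # assumed: windows covering the object
    return base + (K - 1) * stall            # case 2: server idles K-1 times


# Illustrative numbers: S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, W=4, rtt=2.0))  # 12.0
print(fixed_window_latency(O=8000, S=1000, R=1000, W=2, rtt=2.0))  # 15.0
```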


                                                                                                                                    Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data is sent

Latency = 2RTT + O/R


                                                                                                                                    Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Sender must wait for an ACK after sending a window's worth of data

Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

\[ \text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R} \]

where P is the number of times TCP idles at the server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


                                                                                                                                    TCP Latency Modeling Slow Start (2)

[Figure: client/server timeline (time at client, time at server): initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. RTT is marked on the client side.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
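The example values can be reproduced with a short sketch. It assumes the server idles after the kth window whenever S/R + RTT - 2^(k-1)·S/R is positive, and it picks S/R = 1 s and RTT = 2 s because those values give Q = 2. All function names and numeric inputs are illustrative.

```python
import math


def windows_covering(O, S):
    """K: number of slow-start windows (1, 2, 4, ... segments) covering the object."""
    segments = math.ceil(O / S)
    k, sent = 0, 0
    while sent < segments:
        sent += 2 ** k
        k += 1
    return k


def idle_count_infinite(S, R, rtt):
    """Q: number of times the server would idle if the object were infinite."""
    q = 0
    while S / R + rtt - 2 ** q * S / R > 0:   # idle time after window q+1
        q += 1
    return q


def slow_start_latency(O, S, R, rtt):
    """Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R."""
    K = windows_covering(O, S)
    Q = idle_count_infinite(S, R, rtt)
    P = min(Q, K - 1)
    return 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R


# O/S = 15 segments, S/R = 1 s, RTT = 2 s.
print(windows_covering(15000, 1000))                 # 4
print(idle_count_infinite(1000, 1000, 2.0))          # 2
print(slow_start_latency(15000, 1000, 1000, 2.0))    # 22.0
```

Here the 22 s total splits into 4 s for two RTTs, 15 s to transmit the object, and 3 s of idle time (2 s after the first window, 1 s after the second).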


                                                                                                                                    TCP Latency Modeling (3)

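\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the kth window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the kth window} \]

\begin{align*}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{align*}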

[Figure: client/server timeline (time at client, time at server): initiate TCP connection, request object, windows of S/R, 2S/R, 4S/R and 8S/R, complete transmission, object delivered]


TCP Latency Modeling (4)

Recall K = number of windows that cover the object

How do we calculate K?

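\begin{align*}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{align*}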

                                                                                                                                    SOk

                                                                                                                                    SOkOSSSkK

                                                                                                                                    k

                                                                                                                                    k

                                                                                                                                    k

                                                                                                                                    L

                                                                                                                                    L

                                                                                                                                    Calculation of Q number of idles for infinite-size objectis similar
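
As a quick check on the calculation of K, the following minimal Python sketch counts the windows by direct summation and compares the result with the closed-form ceiling expression. The function names and the example sizes (a 536-byte segment, a 100,000-byte object) are assumptions chosen for illustration; they do not come from the slides.

```python
import math


def windows_by_summation(object_bits: int, segment_bits: int) -> int:
    """Smallest k such that S*(2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    k = 0
    covered = 0              # bits covered by the windows sent so far
    window = segment_bits    # first window holds one segment (S bits)
    while covered < object_bits:
        covered += window
        window *= 2          # slow start doubles the window each round
        k += 1
    return k


def windows_closed_form(object_bits: int, segment_bits: int) -> int:
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


# Hypothetical example values (assumptions, not taken from the slides).
S = 536 * 8        # segment size in bits
O = 100_000 * 8    # object size in bits
print(windows_by_summation(O, S), windows_closed_form(O, S))
```

Both functions should agree for any positive O and S. The summation version makes the doubling explicit, while the closed form is what the derivation arrives at.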


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
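
The three response-time formulas above can be compared numerically with a short Python sketch. It deliberately leaves out the "sum of idle times" term, so the numbers are lower bounds rather than full model results. The function name, the dictionary keys, and the choice of 1 Kbyte = 1000 bytes are assumptions made for illustration.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int) -> dict:
    """Response times in seconds, NOT counting the sum of idle times.

    O: object size in bits, R: link rate in bits/sec, RTT: seconds,
    M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R   # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT,
    }


# Example values: RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, R = 1 Mbps.
# Assumption: 1 Kbyte is taken as 1000 bytes.
times = http_response_times(O=5 * 1000 * 8, R=1e6, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

With these inputs the transmission term is 0.44 s for all three variants, so the differences come entirely from the RTT terms.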


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"



Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP reliable data transfer

TCP creates rdt service on top of IP's unreliable service
Pipelined segments
Cumulative acks
TCP uses single retransmission timer

Retransmissions are triggered by:
  timeout events
  duplicate acks

Initially consider simplified TCP sender:
  ignore duplicate acks
  ignore flow control, congestion control

TCP sender events

data rcvd from app:
    Create segment with seq #
    seq # is byte-stream number of first data byte in segment
    start timer if not already running (think of timer as for oldest unacked segment)
    expiration interval: TimeOutInterval

timeout:
    retransmit segment that caused timeout
    restart timer

Ack rcvd:
    If acknowledges previously unacked segments:
        update what is known to be acked
        start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
    switch(event)

    event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
            start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
} /* end of loop forever */

Comment:
    SendBase-1: last cumulatively ACK'ed byte
Example:
    SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
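
The three sender events above can be tried out with a small Java sketch. It is only an illustration under stated assumptions: the class name SimplifiedTcpSender, the unacked map and the sentLog list are hypothetical, the timer is a plain boolean rather than a real clock, ACKs are assumed to fall on segment boundaries, and sequence-number wrap-around is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of the simplified sender; no real sockets or timers are used.
class SimplifiedTcpSender {
    long sendBase;                 // oldest unacknowledged byte
    long nextSeqNum;               // sequence number for the next new segment
    boolean timerRunning = false;  // stands in for the single retransmission timer
    // Unacknowledged segments: first byte's sequence number -> payload length.
    final TreeMap<Long, Integer> unacked = new TreeMap<>();
    // Sequence numbers of every segment "passed to IP", in order.
    final List<Long> sentLog = new ArrayList<>();

    SimplifiedTcpSender(long initialSeqNum) {
        sendBase = initialSeqNum;
        nextSeqNum = initialSeqNum;
    }

    // event: data received from application above
    void onAppData(int length) {
        unacked.put(nextSeqNum, length);
        if (!timerRunning) {
            timerRunning = true;          // start timer
        }
        sentLog.add(nextSeqNum);          // pass segment to IP
        nextSeqNum += length;
    }

    // event: timer timeout
    void onTimeout() {
        if (unacked.isEmpty()) {
            timerRunning = false;
            return;
        }
        sentLog.add(unacked.firstKey());  // retransmit segment with smallest seq #
        timerRunning = true;              // restart timer
    }

    // event: ACK received, with ACK field value of y
    void onAck(long y) {
        if (y > sendBase) {
            sendBase = y;
            unacked.headMap(y).clear();   // drop segments that start below y
            timerRunning = !unacked.isEmpty();
        }
        // y <= sendBase: duplicate ACK, ignored by the simplified sender
    }
}
```

Feeding it two segments (Seq=92 with 8 bytes, Seq=100 with 20 bytes), one timeout and then ACK=100 and ACK=120 reproduces the premature timeout behaviour: Seq=92 is sent twice and SendBase ends at 120.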

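TCP retransmission scenarios

[Figure: two Host A / Host B timelines]

Lost ACK scenario:
    A sends Seq=92, 8 bytes data
    B replies ACK=100, but the ACK is lost (X)
    A's timer for Seq=92 times out; A resends Seq=92, 8 bytes data
    B replies ACK=100 again; SendBase = 100

Premature timeout:
    A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    B replies ACK=100, then ACK=120
    A's timer for Seq=92 times out before the ACKs arrive; A resends Seq=92, 8 bytes data
    The ACKs arrive: SendBase = 100, then SendBase = 120
    B answers the retransmission with ACK=120; SendBase stays 120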

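TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline]

Cumulative ACK scenario:
    A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    B replies ACK=100, which is lost (X), then ACK=120
    ACK=120 reaches A before the timeout; SendBase = 120, so nothing is retransmitted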

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP Receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at Receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP Receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at Receiver: Arrival of segment that partially or completely fills gap.
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
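
The four receiver rules above can be written as one decision function. The Java below is a rough sketch, not the behaviour of any particular TCP stack: the class AckGenerationSketch, its Action enum and the out-of-order buffer are hypothetical names, and the handling of old data below the expected seq # (re-ACK it) is an added assumption that the table does not cover.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the receiver's ACK policy from the table above.
class AckGenerationSketch {
    enum Action { DELAY_ACK, SEND_CUMULATIVE_ACK, SEND_DUPLICATE_ACK, SEND_ACK_FOR_GAP_FILL }

    long expectedSeq;             // seq # of the next in-order byte
    boolean ackPending = false;   // one in-order segment is waiting for a delayed ACK
    // Buffered out-of-order segments: first byte's seq # -> length.
    final TreeMap<Long, Integer> outOfOrder = new TreeMap<>();

    AckGenerationSketch(long initialExpectedSeq) {
        expectedSeq = initialExpectedSeq;
    }

    Action onSegment(long seq, int length) {
        if (seq > expectedSeq) {                 // row 3: gap detected
            outOfOrder.put(seq, length);
            return Action.SEND_DUPLICATE_ACK;    // ACK field carries expectedSeq
        }
        if (seq < expectedSeq) {                 // assumption: old data is simply re-ACKed
            return Action.SEND_DUPLICATE_ACK;
        }
        boolean hadGap = !outOfOrder.isEmpty();
        expectedSeq += length;
        // Absorb buffered segments that have become contiguous.
        while (!outOfOrder.isEmpty() && outOfOrder.firstKey() <= expectedSeq) {
            Map.Entry<Long, Integer> e = outOfOrder.pollFirstEntry();
            expectedSeq = Math.max(expectedSeq, e.getKey() + e.getValue());
        }
        if (hadGap) {                            // row 4: segment starts at lower end of gap
            ackPending = false;
            return Action.SEND_ACK_FOR_GAP_FILL;
        }
        if (ackPending) {                        // row 2: ACK both in-order segments at once
            ackPending = false;
            return Action.SEND_CUMULATIVE_ACK;
        }
        ackPending = true;                       // row 1: wait up to 500 ms for next segment
        return Action.DELAY_ACK;
    }

    // Called when the 500 ms wait expires; true means "send the pending ACK now".
    boolean onDelayedAckTimer() {
        boolean send = ackPending;
        ackPending = false;
        return send;
    }
}
```

In every case the ACK field would carry expectedSeq, the seq # of the next byte the receiver wants.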

More on Sender Policies

Doubling the Timeout Interval
    Used by most TCP implementations
    If timeout occurs, then after retransmission the Timeout Interval is doubled
    Intervals grow exponentially with each consecutive timeout
    When timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
    Limited form of congestion control
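
A short Java sketch of the doubling rule follows. The class RetransmissionTimerSketch is hypothetical, and the reset formula TimeoutInterval = EstimatedRTT + 4 * DevRTT is assumed to be the one "described previously" (it is the usual TCP formula); the RTT values in the usage note are made up.

```java
// Hypothetical sketch: exponential backoff of the retransmission timeout.
class RetransmissionTimerSketch {
    final double estimatedRttMs;
    final double devRttMs;
    double timeoutIntervalMs;

    RetransmissionTimerSketch(double estimatedRttMs, double devRttMs) {
        this.estimatedRttMs = estimatedRttMs;
        this.devRttMs = devRttMs;
        reset();
    }

    // Assumed reset rule: TimeoutInterval = EstimatedRTT + 4 * DevRTT.
    void reset() {
        timeoutIntervalMs = estimatedRttMs + 4 * devRttMs;
    }

    // Timeout occurred: after the retransmission, the interval is doubled.
    void onTimeout() {
        timeoutIntervalMs *= 2;
    }

    // Timer restarted because of new data from above or a received ACK.
    void onNewDataOrAck() {
        reset();
    }
}
```

With EstimatedRTT = 100 ms and DevRTT = 25 ms the interval starts at 200 ms, grows to 400 ms and 800 ms after two consecutive timeouts, and drops back to 200 ms once an ACK arrives.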

Fast Retransmit

Time-out period often relatively long:
    long delay before resending lost packet
Detect lost segments via duplicate ACKs:
    Sender often sends many segments back-to-back
    If segment is lost, there will likely be many duplicate ACKs
If sender receives 3 duplicate ACKs for the same data, it supposes that segment after ACKed data was lost:
    fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
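
The ACK handler above might look as follows in Java. This is a sketch with hypothetical names (FastRetransmitSketch, retransmitted); it keeps one duplicate counter for the current SendBase and ignores stale ACKs below SendBase, which is a simplification of "count of dup ACKs received for y". Timer handling is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the fast-retransmit branch of the ACK handler.
class FastRetransmitSketch {
    long sendBase;
    int dupAckCount = 0;
    // Sequence numbers resent by fast retransmit, in order.
    final List<Long> retransmitted = new ArrayList<>();

    FastRetransmitSketch(long sendBase) {
        this.sendBase = sendBase;
    }

    // event: ACK received, with ACK field value of y
    void onAck(long y) {
        if (y > sendBase) {
            sendBase = y;          // new data acknowledged
            dupAckCount = 0;       // duplicates are counted per ACK value
        } else if (y == sendBase) {
            dupAckCount++;         // a duplicate ACK for already ACKed data
            if (dupAckCount == 3) {
                retransmitted.add(y);   // fast retransmit: resend segment with seq # y
            }
        }
        // y < sendBase: stale ACK, ignored in this sketch
    }
}
```

Because the test is `== 3`, a fourth or fifth duplicate for the same y does not trigger another resend.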

TCP: GBN or Selective Repeat?

                                                                                                                                      Basic TCP looks a lot like GBN

                                                                                                                                      Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                                      This looks a lot like Selective Repeat

                                                                                                                                      TCP is a hybrid

TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose was to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
speed-matching service: matching the send rate to the receiving app's drain rate
app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, rows 32 bits wide]

source port #, dest port #
sequence number, acknowledgement number: counting by bytes of data (not segments)
head len, not used, flag bits U A P R S F
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
Urg data pnter
Options (variable length)
application data (variable length)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)
spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
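
The window arithmetic above can be checked with a few lines of Java. This is a sketch only: the class FlowControlSketch and the method sendableBytes are hypothetical, and LastByteSent / LastByteAcked are assumed names for the sender-side counters that measure unACKed data.

```java
// Hypothetical sketch of the flow-control arithmetic.
class FlowControlSketch {
    // Receiver side: RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    // Sender side: unACKed data (LastByteSent - LastByteAcked) must stay within
    // RcvWindow, so this many further bytes may be sent right now.
    static long sendableBytes(long lastByteSent, long lastByteAcked, long rcvWindow) {
        long unacked = lastByteSent - lastByteAcked;
        return Math.max(0, rcvWindow - unacked);
    }
}
```

For example, with RcvBuffer = 4096, LastByteRcvd = 3000 and LastByteRead = 1000, the receiver advertises RcvWindow = 2096; a sender with 1000 unACKed bytes may then send 1096 more.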

Technical Issue

Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0 since it has no new ACKs to send to sender
DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender.
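
The one-byte rule can be expressed as a small sizing function. The Java below is a minimal sketch under assumptions: ZeroWindowProbeSketch, nextSegmentSize and the mss parameter (maximum segment size) are hypothetical, and data already in flight is ignored for brevity.

```java
// Hypothetical sketch: how many data bytes the sender puts in its next segment.
class ZeroWindowProbeSketch {
    static int nextSegmentSize(int rcvWindow, int bytesWaiting, int mss) {
        if (bytesWaiting == 0) {
            return 0;                  // nothing to send
        }
        if (rcvWindow == 0) {
            return 1;                  // one-byte probe keeps ACKs (and window updates) coming
        }
        return Math.min(bytesWaiting, Math.min(rcvWindow, mss));
    }
}
```

Each probe draws an ACK that carries the current RcvWindow, so the sender learns when the window has reopened.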

                                                                                                                                      Note on UDP

                                                                                                                                      UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost.

TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:
Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

                                                                                                                                      TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1
  allocates buffers
  can include application data

SYN=0 signals that the connection is established
server_isn+1 signals that the sequence number is synchronized

[Figure: three-way handshake]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
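
The three handshake segments can be traced with a short Java sketch. `Segment` and `HandshakeSketch` are hypothetical names used only for illustration, and the initial sequence numbers 100 and 300 are arbitrary; a real TCP stack chooses its own ISNs and tracks far more state.

```java
// Illustrative model only: Segment and HandshakeSketch are hypothetical names,
// not part of java.net or of any real TCP implementation.
final class Segment {
    final boolean syn;  // SYN flag
    final long seq;     // sequence number carried by the segment
    final Long ack;     // acknowledgment number, or null when the segment carries none

    Segment(boolean syn, long seq, Long ack) {
        this.syn = syn;
        this.seq = seq;
        this.ack = ack;
    }

    @Override
    public String toString() {
        return "SYN=" + (syn ? 1 : 0) + " seq=" + seq + (ack == null ? "" : " ack=" + ack);
    }
}

final class HandshakeSketch {
    // Returns the three handshake segments in the order they are sent.
    static Segment[] handshake(long clientIsn, long serverIsn) {
        return new Segment[] {
            new Segment(true, clientIsn, null),               // client: connection request
            new Segment(true, serverIsn, clientIsn + 1),      // server: connection granted (SYNACK)
            new Segment(false, clientIsn + 1, serverIsn + 1)  // client: ACK, SYN=0
        };
    }

    public static void main(String[] args) {
        for (Segment s : handshake(100, 300)) {
            System.out.println(s);
        }
    }
}
```

Each side acknowledges the other side's ISN plus one, which is how both ends learn that the sequence numbers are synchronized.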

                                                                                                                                      TCP Connection Management (cont)

                                                                                                                                      Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK, enters timed wait, then closed]

                                                                                                                                      TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: FIN and ACK exchange between client and server; states shown: closing, timed wait, closed]
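
The client side of the four closing steps can be sketched as a small state machine. The class, state, and method names here are hypothetical, and the sketch omits the server's ACK of the client's FIN; it only shows why timed wait exists: a retransmitted FIN is still answered with an ACK.

```java
// Illustrative client-side close sequence; names are hypothetical and the
// model is simplified (the server's ACK of our FIN is not tracked).
final class ClientCloseSketch {
    enum State { ESTABLISHED, CLOSING, TIMED_WAIT, CLOSED }

    private State state = State.ESTABLISHED;
    private int acksSent = 0;

    State state() { return state; }
    int acksSent() { return acksSent; }

    // Step 1: application closes the socket, client sends FIN.
    void close() {
        if (state == State.ESTABLISHED) {
            state = State.CLOSING;
        }
    }

    // Step 3: every FIN from the server is answered with an ACK,
    // including repeats that arrive during timed wait.
    void onFinReceived() {
        if (state == State.CLOSING || state == State.TIMED_WAIT) {
            acksSent++;
            state = State.TIMED_WAIT;
        }
    }

    // The timed wait expires and the client closes down.
    void onTimedWaitExpired() {
        if (state == State.TIMED_WAIT) {
            state = State.CLOSED;
        }
    }

    public static void main(String[] args) {
        ClientCloseSketch client = new ClientCloseSketch();
        client.close();
        client.onFinReceived();
        client.onFinReceived();       // server resent FIN because our ACK was lost
        client.onTimedWaitExpired();
        System.out.println(client.state() + " after " + client.acksSent() + " ACKs");
    }
}
```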

                                                                                                                                      TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                                                                                      A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment

                                                                                                                                      Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem

Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput
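
A minimal sketch of the throughput side of scenario 1, assuming the two connections split the router's outgoing capacity C evenly. `ScenarioOneSketch` is an illustrative name, and the delay curve is not modeled.

```java
// Illustrative only: scenario 1 has infinite buffers and no retransmission,
// so nothing is lost and throughput simply saturates at the fair share.
final class ScenarioOneSketch {
    // Per-connection throughput: equals the send rate until C/2 is reached.
    static double throughput(double sendRate, double capacity) {
        return Math.min(sendRate, capacity / 2.0);
    }

    public static void main(String[] args) {
        double capacity = 10.0;  // arbitrary units
        for (double rate = 1.0; rate <= 7.0; rate += 2.0) {
            System.out.println("send rate " + rate + " gives throughput "
                    + throughput(rate, capacity));
        }
    }
}
```

Pushing the send rate beyond C/2 buys no extra throughput; the excess only lengthens the router queue, which is where the large delays come from.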

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c) always: $\lambda_{in} = \lambda_{out}$ (goodput)
(a) magic transmission: only send when there's space in buffer
(b) "perfect" retransmission only when loss: $\lambda'_{in} > \lambda_{out}$, where $\lambda'_{in}$ is original data plus retransmissions
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
  (b) and (c): more work (retransmissions) for given "goodput"
  (c): unneeded retransmissions; link carries multiple copies of packet

[Figure: three plots, panels (a), (b), (c)]

Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when a packet is dropped, any upstream transmission capacity used for that packet was wasted

                                                                                                                                      Approaches towards congestion control

                                                                                                                                      Two broad approaches towards congestion control

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
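
The ER and EFCI rules can be sketched as follows. `RmCell` and `AbrSketch` are hypothetical names, and the rate is held in a `double` for readability rather than in the two-byte field an actual RM cell uses.

```java
// Illustrative sketch of ATM ABR feedback; not an implementation of the ATM standard.
final class RmCell {
    double explicitRate;  // ER field (simplified to a double)
    boolean noIncrease;   // NI bit
    boolean congestion;   // CI bit

    RmCell(double initialRate) {
        this.explicitRate = initialRate;
    }
}

final class AbrSketch {
    // A congested switch may lower, but never raise, the ER value.
    static void passThroughSwitch(RmCell cell, double supportableRate) {
        cell.explicitRate = Math.min(cell.explicitRate, supportableRate);
    }

    // The destination sets CI=1 if the data cell preceding the RM cell had EFCI=1.
    static void returnFromDestination(RmCell cell, boolean precedingDataCellEfci) {
        if (precedingDataCellEfci) {
            cell.congestion = true;
        }
    }

    public static void main(String[] args) {
        RmCell cell = new RmCell(100.0);
        for (double rate : new double[] {80.0, 50.0, 70.0}) {
            passThroughSwitch(cell, rate);
        }
        returnFromDestination(cell, true);
        System.out.println("ER=" + cell.explicitRate + " CI=" + cell.congestion);
    }
}
```

Because every switch applies a minimum, the ER value that returns to the sender is the lowest rate any switch on the path can support.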

TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window CongWin]

w segments, each with MSS bytes, sent in one RTT:

  $\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
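
The throughput formula is easy to check numerically. This is a small sketch; `WindowThroughput` is an illustrative name and the sample values (10 segments, MSS of 1460 bytes, RTT of 100 ms) are arbitrary.

```java
// Illustrative helper for throughput = w * MSS / RTT.
final class WindowThroughput {
    // Bytes per second when w segments of mssBytes each are sent per RTT.
    static double bytesPerSecond(int w, int mssBytes, double rttSeconds) {
        return (double) w * mssBytes / rttSeconds;
    }

    public static void main(String[] args) {
        double rate = bytesPerSecond(10, 1460, 0.1);
        System.out.println(rate + " bytes/sec");
    }
}
```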

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control. Sender limits transmission using:

  $LastByteSent - LastByteAcked \le CongWin$
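
As a sketch, the sender-side check might look like this. `SendWindowCheck` and `maySend` are hypothetical names; the method asks whether sending a further `bytes` would still satisfy the inequality.

```java
// Illustrative check of LastByteSent - LastByteAcked <= CongWin.
final class SendWindowCheck {
    // True if sending `bytes` more keeps the unacknowledged data within CongWin.
    static boolean maySend(long lastByteSent, long lastByteAcked, long congWin, long bytes) {
        return (lastByteSent + bytes) - lastByteAcked <= congWin;
    }

    public static void main(String[] args) {
        // 500 bytes are in flight and CongWin is 1000 bytes.
        System.out.println(maySend(1000, 500, 1000, 500));  // fits exactly
        System.out.println(maySend(1000, 500, 1000, 501));  // one byte too many
    }
}
```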

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
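
A minimal AIMD trace, with the window counted in MSS units and one step per RTT. `AimdSketch` is an illustrative name, the loss pattern is supplied by the caller, and the floor of 1 MSS is an assumption of this sketch.

```java
// Illustrative AIMD trace: +1 MSS per loss-free RTT, halve on a loss event.
final class AimdSketch {
    // Returns the window (in MSS) at the start of each RTT.
    // lossAtRound[i] says whether RTT i ends with a loss event.
    static int[] trace(int initialWindow, boolean[] lossAtRound) {
        int[] windows = new int[lossAtRound.length];
        int w = initialWindow;
        for (int i = 0; i < lossAtRound.length; i++) {
            windows[i] = w;
            w = lossAtRound[i]
                    ? Math.max(1, w / 2)  // multiplicative decrease
                    : w + 1;              // additive increase
        }
        return windows;
    }

    public static void main(String[] args) {
        boolean[] losses = {false, false, true, false, false, false, true, false};
        for (int w : trace(8, losses)) {
            System.out.print(w + " ");
        }
        System.out.println();
    }
}
```

Plotting the returned values against the round index gives the rise-and-halve shape of a long-lived connection.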

                                                                                                                                      TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
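
The 20 kbps figure follows from sending one MSS per RTT, converting bytes to bits:

```latex
\[
\text{initial rate} = \frac{MSS}{RTT}
  = \frac{500 \text{ bytes} \times 8 \text{ bits/byte}}{0.2 \text{ s}}
  = 20{,}000 \text{ bits/s}
  = 20 \text{ kbps}
\]
```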

                                                                                                                                      TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment, then two segments, then four segments to Host B, one burst per RTT, along a time axis]
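
Why a per-ACK increment doubles the window each RTT can be seen in a short sketch. It assumes no loss and one ACK per segment; `SlowStartSketch` is an illustrative name and the window is counted in MSS units.

```java
// Illustrative slow start: CongWin grows by 1 MSS for every ACK received.
final class SlowStartSketch {
    // CongWin (in MSS) after the given number of loss-free RTTs, starting from 1 MSS.
    static int windowAfter(int rtts) {
        int congWin = 1;
        for (int r = 0; r < rtts; r++) {
            int acks = congWin;        // one ACK comes back per segment sent this RTT
            for (int a = 0; a < acks; a++) {
                congWin += 1;          // increment per ACK
            }
        }
        return congWin;
    }

    public static void main(String[] args) {
        System.out.println(windowAfter(3));  // prints 8
    }
}
```

A window of w segments produces w ACKs, and each ACK adds one segment, so the window is 2w after one RTT.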

So far:
  Slow-Start ramps up exponentially
  followed by AIMD sawtooth pattern

Reality (TCP Reno):
  introduce new variable, threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery
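
The difference between the two variants comes down to one branch. In this sketch, `LossReactionSketch` and its enums are hypothetical names, windows are counted in MSS units, and the floor of 1 MSS on the threshold is a simplifying assumption.

```java
// Illustrative comparison of how Tahoe and Reno react to the two loss events.
final class LossReactionSketch {
    enum Variant { TAHOE, RENO }
    enum Loss { TRIPLE_DUP_ACK, TIMEOUT }

    final Variant variant;
    int congWin;    // in MSS
    int threshold;  // in MSS

    LossReactionSketch(Variant variant, int congWin, int threshold) {
        this.variant = variant;
        this.congWin = congWin;
        this.threshold = threshold;
    }

    void onLoss(Loss loss) {
        threshold = Math.max(1, congWin / 2);
        if (variant == Variant.RENO && loss == Loss.TRIPLE_DUP_ACK) {
            congWin = threshold;  // fast recovery: skip slow start, continue in AIMD
        } else {
            congWin = 1;          // restart from slow start
        }
    }

    public static void main(String[] args) {
        LossReactionSketch reno = new LossReactionSketch(Variant.RENO, 16, 64);
        LossReactionSketch tahoe = new LossReactionSketch(Variant.TAHOE, 16, 64);
        reno.onLoss(Loss.TRIPLE_DUP_ACK);
        tahoe.onLoss(Loss.TRIPLE_DUP_ACK);
        System.out.println("Reno CongWin=" + reno.congWin + ", Tahoe CongWin=" + tahoe.congWin);
    }
}
```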

                                                                                                                                      Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)

                                                                                                                                      The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS. If (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2, CongWin = Threshold. Set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2, CongWin = 1 MSS. Set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
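
The event/action table above can be sketched as a small state machine. This is only an illustrative sketch, not a real TCP implementation: the class name `TcpSender`, the MSS value, the initial threshold, and the choice to reset the duplicate-ACK counter on a new ACK are all assumptions made for the example.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed segment size, chosen only for illustration


@dataclass
class TcpSender:
    cong_win: float = MSS          # CongWin starts at 1 MSS
    threshold: float = 64 * 1024   # assumed initial Threshold
    state: str = "SS"              # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: a new ACK clears the duplicate count
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Duplicate ACK: only count it; the third one is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_duplicate_ack()

    def on_triple_duplicate_ack(self):
        """Multiplicative decrease; CongWin never drops below 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        """Timeout: halve Threshold and re-enter slow start at 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

For example, with `TcpSender(threshold=4 * MSS)`, four new ACKs move the sender from slow start into congestion avoidance, after which each new ACK adds only a fraction of an MSS.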

                                                                                                                                      TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
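
The 0.75 factor is the time average of a rate that climbs linearly from W/(2 RTT) back up to W/RTT between losses. A minimal numerical check, assuming an idealized linear sawtooth (the function name and the `steps` parameter are made up for this sketch):

```python
def average_sawtooth_throughput(w_segments, rtt_s, steps=1000):
    """Time-average of a rate ramping linearly from W/(2 RTT) to W/RTT."""
    low = w_segments / (2 * rtt_s)   # rate just after a loss
    high = w_segments / rtt_s        # rate just before the next loss
    # Midpoint rule over one sawtooth period.
    total = sum(low + (high - low) * (i + 0.5) / steps for i in range(steps))
    return total / steps


W, RTT = 100, 0.1  # segments, seconds (illustrative values)
avg = average_sawtooth_throughput(W, RTT)
print(avg, 0.75 * W / RTT)  # the two values agree up to rounding
```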

                                                                                                                                      TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

L = 2 · 10^-10. Wow!
New versions of TCP for high-speed needed
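
The two numbers in this example can be reproduced directly. The sketch below assumes the window is W = rate × RTT / MSS and solves the throughput formula 1.22·MSS/(RTT·√L) for L; the function names are illustrative only.

```python
MSS_BITS = 1500 * 8   # 1500-byte segments
RTT_S = 0.100         # 100 ms
TARGET_BPS = 10e9     # 10 Gbps


def window_for_rate(rate_bps, rtt_s, mss_bits):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / mss_bits


def loss_rate_for_rate(rate_bps, rtt_s, mss_bits):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


print(window_for_rate(TARGET_BPS, RTT_S, MSS_BITS))     # about 83,333 segments
print(loss_rate_for_rate(TARGET_BPS, RTT_S, MSS_BITS))  # about 2e-10
```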

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal-share line.]
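
The convergence argument can be illustrated with a toy simulation. This sketch assumes two synchronized flows that both add one unit per round and both halve whenever their combined rate exceeds the link capacity; real TCP flows are not synchronized like this, and the function name and parameters are made up for the example.

```python
def aimd_two_flows(x1, x2, capacity, rounds=500, step=1.0):
    """Toy AIMD: both flows add `step` per round, both halve on overload."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves the gap between the flows.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase keeps the gap unchanged (slope 1).
            x1, x2 = x1 + step, x2 + step
    return x1, x2


a, b = aimd_two_flows(80.0, 10.0, capacity=100.0)
print(abs(a - b))  # far smaller than the initial gap of 70
```

Additive increase never changes the difference between the two rates, while every loss halves it, so the pair drifts toward the equal-share line.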

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
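
The arithmetic behind the parallel-connection example is a simple proportion. The sketch below assumes every TCP connection on the link gets an equal share of R; `app_share` is a hypothetical helper name.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming all connections on the link share R equally."""
    return Fraction(new_connections, existing_connections + new_connections)


print(app_share(9, 1))    # 1/10 of R
print(app_share(9, 11))   # 11/20 of R, slightly more than R/2
```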

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case:
WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
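
Both fixed-window cases can be folded into one function. This is a sketch under the slide's assumptions (one link, no loss); it additionally assumes K = ceil(O / (W·S)), i.e. K is the number of W-segment windows needed to cover the object, and the function name is illustrative.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency with a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    """
    K = math.ceil(O / (W * S))        # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R   # idle time after each window, if positive
    latency = 2 * RTT + O / R
    if stall > 0:
        # Second case: server waits for an ACK after each of the first K-1 windows.
        latency += (K - 1) * stall
    return latency


# Illustrative numbers: S/R = 1 s, RTT = 1 s, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=1.0, W=4))  # first case
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=1.0, W=1))  # second case
```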

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: client/server timing diagram (time at client, time at server). Initiate TCP connection (one RTT), request object, then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server timing diagram for slow start, with windows of S/R, 2S/R, 4S/R and 8S/R]

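TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar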
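
The slow-start latency model can be checked numerically. The sketch below is an illustration under the slides' assumptions (no loss, no threshold): it finds K by searching for the smallest k with (2^k - 1)·S ≥ O, finds Q by counting the windows whose idle time S/R + RTT - 2^(k-1)·S/R is positive (an assumed reading of "the number of times the server idles if the object were of infinite size"), and compares the closed form with a direct sum of idle times. Function names are made up for the example.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start latency model."""
    # K: smallest k such that windows of 1, 2, 4, ... segments cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows followed by a positive idle time (infinite object).
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


def slow_start_latency_by_sum(O, S, R, RTT):
    """Same latency, computed by summing idle times after windows 1..K-1."""
    K = slow_start_latency(O, S, R, RTT)[0]
    idle = sum(max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K))
    return 2 * RTT + O / R + idle


# Illustrative numbers chosen so that O/S = 15, S/R = 1 s, RTT = 2 s.
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0))
print(slow_start_latency_by_sum(O=15000, S=1000, R=1000, RTT=2.0))
```

With these values the model gives K = 4, Q = 2 and P = 2, matching the O/S = 15 example on the earlier slide.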

HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
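
The three response-time formulas can be compared side by side. This sketch leaves the "sum of idle times" as a caller-supplied parameter defaulting to zero, which is a simplifying assumption (the slow-start idle times differ between the three schemes); the function name and dictionary keys are illustrative.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times (s) for the three HTTP schemes.

    O: object size (bits), R: link rate (bits/s), RTT: round-trip time (s),
    M: number of images, X: parallel connections (M/X assumed an integer),
    idle: sum of slow-start idle times, assumed 0 unless supplied.
    """
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }


# RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, on a 1 Mbps link.
for name, t in http_response_times(O=5000 * 8, R=1e6, RTT=0.1, M=10, X=5).items():
    print(f"{name}: {t:.2f} s")
```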

                                                                                                                                      HTTP Response time (in seconds)RTT = 100 msec O = 5 Kbytes M=10 and X=5

                                                                                                                                      02468

                                                                                                                                      101214161820

                                                                                                                                      28Kbps

                                                                                                                                      100Kbps

                                                                                                                                      1 Mbps 10Mbps

                                                                                                                                      non-persistent

                                                                                                                                      persistent

                                                                                                                                      parallel non-persistent

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds): RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 70) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay-bandwidth networks.
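To see why the benefit of persistent connections depends on RTT and bandwidth, one can compare the persistent and parallel non-persistent formulas, (M+1)O/R + 3RTT and (M+1)O/R + (M/X + 1)2RTT. The sketch below is an illustration under stated assumptions: the function name is hypothetical, 5 Kbytes is taken as 40,000 bits, and the idle-time term (which captures slow start stalls) is ignored, so only the connection-setup effect is shown.

```python
def relative_saving(M, X, O_bits, R_bps, rtt):
    """Fraction of the parallel non-persistent response time that
    persistent HTTP saves, with idle times ignored (assumption)."""
    transmission = (M + 1) * O_bits / R_bps
    parallel = transmission + (M // X + 1) * 2 * rtt
    persistent = transmission + 3 * rtt
    return (parallel - persistent) / parallel


if __name__ == "__main__":
    M, X, O_bits = 10, 5, 40_000  # assumed: 5 Kbytes = 40,000 bits
    rates = {"28 Kbps": 28_000, "100 Kbps": 100_000,
             "1 Mbps": 1_000_000, "10 Mbps": 10_000_000}
    for rtt in (0.1, 1.0):
        for label, R_bps in rates.items():
            saving = relative_saving(M, X, O_bits, R_bps, rtt)
            print(f"RTT={rtt} s, R={label}: saving {saving:.1%}")
```

With these assumptions the saving is about 1.8% at 28 Kbps with RTT = 100 msec, where transmission time dominates, and about 49.6% at 10 Mbps with RTT = 1 sec, where the RTT terms dominate.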

Chapter 3: Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet:
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer (last revised 160305)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary

TCP reliable data transfer

• TCP creates rdt service on top of IP's unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  • timeout events
  • duplicate acks
• Initially consider simplified TCP sender:
  • ignore duplicate acks
  • ignore flow control, congestion control

TCP sender events

data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

Ack rcvd:
• If acknowledges previously unacked segments
  • update what is known to be acked
  • start timer if there are outstanding segments

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever)
      switch(event)

        event: data received from application above
          create TCP segment with sequence number NextSeqNum
          if (timer currently not running)
            start timer
          pass segment to IP
          NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
          retransmit not-yet-acknowledged segment with smallest sequence number
          start timer

        event: ACK received, with ACK field value of y
          if (y > SendBase)
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
              start timer

    /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack'ed byte

Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
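The simplified sender can be sketched as a small event-driven class. This is an illustration under assumptions, not the slides' own code: the class and method names are hypothetical, the timer is modeled as a boolean flag rather than a real clock, "pass segment to IP" is replaced by appending to a list, sequence-number wraparound is ignored, and the timer is assumed to stop when no unacknowledged segments remain.

```python
class SimplifiedTcpSender:
    """Simplified TCP sender: no duplicate-ACK handling,
    no flow control, no congestion control."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.timer_running = False
        self.unacked = {}  # seq number -> data, sent but not yet acked
        self.sent = []     # (seq, length) log standing in for "pass to IP"

    def _pass_to_ip(self, seq, data):
        self.sent.append((seq, len(data)))

    def on_app_data(self, data):
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True  # start timer
        self._pass_to_ip(seq, data)
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout."""
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)  # smallest not-yet-acknowledged seq number
        self._pass_to_ip(seq, self.unacked[seq])
        self.timer_running = True  # restart timer

    def on_ack(self, y):
        """Event: ACK received, with ACK field value of y."""
        if y > self.send_base:
            self.send_base = y
            # A cumulative ACK covers every segment that ends at or before y.
            self.unacked = {s: d for s, d in self.unacked.items()
                            if s + len(d) > y}
            # Keep the timer only while segments are still outstanding.
            self.timer_running = bool(self.unacked)


if __name__ == "__main__":
    sender = SimplifiedTcpSender(initial_seq_num=92)
    sender.on_app_data(b"x" * 8)    # segment with Seq=92, 8 bytes
    sender.on_app_data(b"y" * 20)   # segment with Seq=100, 20 bytes
    sender.on_ack(100)
    print(sender.send_base, sender.timer_running)
```

An ACK with y not greater than SendBase changes nothing, which is how the simplified sender ignores duplicate acks.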

TCP retransmission scenarios

[Figure: two Host A / Host B timelines]
• lost ACK scenario: Seq=92, 8 bytes data; ACK=100 lost (X); timeout; Seq=92, 8 bytes data; ACK=100; SendBase = 100
• premature timeout: Seq=92, 8 bytes data; Seq=100, 20 bytes data; ACK=100; ACK=120; Seq=92 timeout; Seq=92, 8 bytes data; ACK=120; SendBase = 100, then SendBase = 120

TCP retransmission scenarios (more)

[Figure: cumulative ACK scenario. Host A sends Seq=92 (8 bytes of data) and then Seq=100 (20 bytes of data). Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted. SendBase = 120.]
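The cumulative ACK scenario can be checked with a minimal sketch. The class and method names (`CumulativeAck`, `segmentsCovered`) are illustrative assumptions, not part of the slides; only the sequence numbers and segment lengths come from the figure.

```java
// Hypothetical sketch: which outstanding segments does one cumulative ACK cover?
final class CumulativeAck {
    // Each segment is {seq, length}. An ACK with value ackValue acknowledges
    // every byte below ackValue, so a segment is covered when its last byte
    // (seq + length - 1) lies below ackValue.
    static int segmentsCovered(int[][] segments, int ackValue) {
        int covered = 0;
        for (int[] seg : segments) {
            if (seg[0] + seg[1] <= ackValue) {
                covered++;
            }
        }
        return covered;
    }
}
```

With the two segments from the figure, `{92, 8}` and `{100, 20}`, a single ACK=120 covers both, which is why the lost ACK=100 does not force a retransmission.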

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: Arrival of segment that partially or completely fills gap.
TCP receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
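The four rows of the ACK generation table can be sketched as one decision function. This is a simplified illustration, not the RFC text: the names `AckPolicy`, `Action`, and `onSegment` are assumptions, and the sketch assumes the arriving segment's sequence number is never below the expected one (old duplicate segments are out of scope).

```java
// Hypothetical sketch of the receiver-side ACK policy from the table.
final class AckPolicy {
    enum Action { DELAYED_ACK, CUMULATIVE_ACK_NOW, DUPLICATE_ACK_NOW, IMMEDIATE_ACK }

    /**
     * expectedSeq: next in-order byte the receiver is waiting for.
     * segSeq:      sequence number of the arriving segment (assumed >= expectedSeq).
     * ackPending:  an earlier in-order segment is still waiting for its delayed ACK.
     * gapExists:   out-of-order data is already buffered above expectedSeq.
     */
    static Action onSegment(int expectedSeq, int segSeq, boolean ackPending, boolean gapExists) {
        if (segSeq > expectedSeq) {
            return Action.DUPLICATE_ACK_NOW;   // gap detected: re-ACK the next expected byte
        }
        if (gapExists) {
            return Action.IMMEDIATE_ACK;       // segment starts at the lower end of the gap
        }
        if (ackPending) {
            return Action.CUMULATIVE_ACK_NOW;  // one ACK covers both in-order segments
        }
        return Action.DELAYED_ACK;             // wait up to 500 ms for the next segment
    }
}
```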

More on Sender Policies

Doubling the Timeout Interval
Used by most TCP implementations
If a timeout occurs, then after retransmission the Timeout Interval is doubled
Intervals grow exponentially with each consecutive timeout
When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
Limited form of congestion control
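A minimal sketch of the doubling policy follows. The class `RetransmitTimer` and its method names are illustrative assumptions; the reset formula EstimatedRTT + 4 * DevRTT is assumed to be the one the slides refer to as "described previously".

```java
// Hypothetical sketch of timeout doubling; all names are illustrative.
final class RetransmitTimer {
    private final double estimatedRtt;   // milliseconds
    private final double devRtt;         // milliseconds
    private double timeoutInterval;      // milliseconds

    RetransmitTimer(double estimatedRtt, double devRtt) {
        this.estimatedRtt = estimatedRtt;
        this.devRtt = devRtt;
        resetFromRtt();
    }

    // Assumed formula from the RTT estimation slides.
    private void resetFromRtt() {
        timeoutInterval = estimatedRtt + 4 * devRtt;
    }

    // Timer expired: after retransmitting, double the interval.
    void onTimeout() {
        timeoutInterval *= 2;
    }

    // Timer restarted because of new data from above or a received ACK.
    void onNewDataOrAck() {
        resetFromRtt();
    }

    double timeoutInterval() {
        return timeoutInterval;
    }
}
```

Two consecutive timeouts multiply the interval by four; the next ACK or new data brings it back to the RTT-based value.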

Fast Retransmit

Time-out period is often relatively long:
  long delay before resending a lost packet
Detect lost segments via duplicate ACKs
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs
If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
  fast retransmit: resend the segment before the timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {
        /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            /* fast retransmit */
            resend segment with sequence number y
        }
    }
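The algorithm can be exercised with a small runnable sketch. The class `FastRetransmitSender` and its methods are illustrative assumptions; the sketch records which sequence numbers would be resent instead of touching a network, leaves the timer out, and counts duplicates only for the current SendBase.

```java
// Hypothetical runnable version of the fast retransmit pseudocode.
final class FastRetransmitSender {
    private int sendBase;       // oldest unacknowledged byte
    private int dupAckCount;    // duplicate ACKs seen for the current SendBase
    private final java.util.List<Integer> retransmitted = new java.util.ArrayList<>();

    FastRetransmitSender(int sendBase) {
        this.sendBase = sendBase;
    }

    void onAck(int y) {
        if (y > sendBase) {
            sendBase = y;       // new data acknowledged
            dupAckCount = 0;    // duplicates are counted per ACK value
            // A real sender would restart the timer here if data is still unACKed.
        } else if (y == sendBase) {
            dupAckCount++;      // duplicate ACK for already ACKed data
            if (dupAckCount == 3) {
                retransmitted.add(y);   // fast retransmit: resend segment with seq y
            }
        }
        // ACKs below SendBase are stale and ignored in this sketch.
    }

    int sendBase() {
        return sendBase;
    }

    java.util.List<Integer> retransmitted() {
        return retransmitted;
    }
}
```

Starting from SendBase = 92, the first ACK=100 advances SendBase; the next three ACK=100 arrivals are duplicates, and the third one triggers the resend of the segment starting at byte 100.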

TCP: GBN or Selective Repeat?

                                                                                                                                        Basic TCP looks a lot like GBN

                                                                                                                                        Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                                        This looks a lot like Selective Repeat

                                                                                                                                        TCP is a hybrid


TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer
speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Rows: source port #, dest port #; sequence number; acknowledgement number; head len, not used, flags U A P R S F, Receive window; checksum, Urg data pointer; Options (variable length); application data (variable length).]

sequence and acknowledgement numbers: counting by bytes of data (not segments)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection establishment (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
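The bookkeeping on both sides can be sketched in a few lines. `rcvWindow` follows the slide's formula directly; `sendAllowance` and its counters `lastByteSent` and `lastByteAcked` are assumed sender-side names introduced here to express "sender limits unACKed data to RcvWindow".

```java
// Hypothetical sketch of the flow-control arithmetic; names follow the slide where possible.
final class FlowControl {
    // Receiver side: spare room advertised to the sender.
    static int rcvWindow(int rcvBuffer, int lastByteRcvd, int lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    // Sender side: how many more bytes may be sent without exceeding RcvWindow.
    // lastByteSent and lastByteAcked are assumed sender-side counters.
    static int sendAllowance(int rcvWindow, int lastByteSent, int lastByteAcked) {
        int unacked = lastByteSent - lastByteAcked;
        return Math.max(0, rcvWindow - unacked);
    }
}
```

For example, a 4096-byte buffer holding 2000 unread bytes advertises RcvWindow = 2096; a sender with 1000 unACKed bytes in flight may then send at most 1096 more.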

Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver (B). At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
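The deadlock and its fix can be illustrated with a toy tick-based simulation. Everything here is an assumption made for illustration: the class `ZeroWindowProbe`, the one-probe-per-tick schedule, and the single application read. It is not how a real TCP stack schedules probes.

```java
// Hypothetical simulation of zero-window probing; all names are illustrative.
final class ZeroWindowProbe {
    /**
     * The receiver's buffer starts full (RcvWindow = 0). At tick appReadsAtTick
     * the application reads bytesRead bytes. Returns the tick at which the
     * sender learns that RcvWindow > 0, or -1 if it never does within maxTicks.
     */
    static int tickSenderResumes(boolean probing, int appReadsAtTick, int bytesRead, int maxTicks) {
        int rcvWindow = 0;
        for (int tick = 0; tick < maxTicks; tick++) {
            if (tick == appReadsAtTick) {
                rcvWindow += bytesRead;     // app drains bytes from the buffer
            }
            if (probing) {
                // Sender transmits a one-byte segment every tick.
                if (rcvWindow > 0) {
                    rcvWindow -= 1;         // the probe byte fits and is buffered
                }
                // The ACK for the probe carries the current RcvWindow.
                if (rcvWindow > 0) {
                    return tick;
                }
            }
            // Without probing, the receiver has nothing new to ACK, so no
            // segment ever carries the updated window back to the sender.
        }
        return -1;
    }
}
```

Calling `tickSenderResumes(false, 5, 1000, 100)` models the deadlock: the window opens at tick 5 but the sender never hears about it. With `probing` set to `true`, the probe sent at tick 5 is ACKed with a nonzero window and the sender resumes.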

Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake timeline between client and server.
client to server: Connection request (SYN=1, seq=client_isn)
server to client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client to server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)]
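The sequence and ACK values exchanged in the three steps can be sketched as follows. The class `Handshake`, the nested `Segment` type, and the use of -1 for "no ACK field" are illustrative assumptions; the sketch also ignores 32-bit sequence number wraparound.

```java
// Hypothetical sketch of the three handshake segments; names are illustrative.
final class Handshake {
    // A control segment: SYN flag, sequence number, ACK-field value (-1 = no ACK).
    static final class Segment {
        final boolean syn;
        final long seq;
        final long ack;

        Segment(boolean syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }
    }

    // Step 1: client sends SYN carrying client_isn.
    static Segment syn(long clientIsn) {
        return new Segment(true, clientIsn, -1);
    }

    // Step 2: server replies with SYNACK: its own server_isn, ack = client_isn + 1.
    static Segment synAck(Segment request, long serverIsn) {
        return new Segment(true, serverIsn, request.seq + 1);
    }

    // Step 3: client replies with SYN=0, seq = client_isn + 1, ack = server_isn + 1.
    static Segment ack(Segment request, Segment reply) {
        return new Segment(false, request.seq + 1, reply.seq + 1);
    }
}
```

With client_isn = 1000 and server_isn = 5000, the SYNACK carries ack = 1001 and the final ACK carries seq = 1001, ack = 5001.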

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection teardown timeline. Client: close, sends FIN. Server: sends ACK, close, sends FIN. Client: sends ACK, timed wait, closed.]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: connection teardown timeline with states. Client: closing, sends FIN, receives ACK and FIN, sends ACK, timed wait, closed. Server: sends ACK, closing, sends FIN, receives ACK, closed.]

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                                                                                        A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.

                                                                                                                                        Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

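(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load, original data plus retransmissions.

"costs" of congestion:
  (b) and (c): more work (retransmissions) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of packet

[Figure: three panels, (a), (b), and (c), one per case]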

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted.

                                                                                                                                        Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
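The RM-cell mechanics described for ABR can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not ATM switch code: the `RMCell` and `Switch` classes, their field names, and the rate values are all hypothetical and chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Toy resource-management cell (field names are illustrative)."""
    er: float          # explicit rate carried in the cell
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit


@dataclass
class Switch:
    """Hypothetical switch: the rate it can support and how congested it is."""
    supportable_rate: float
    mild_congestion: bool = False
    severe_congestion: bool = False

    def process(self, cell: RMCell) -> None:
        # A switch may lower ER but never raises it.
        cell.er = min(cell.er, self.supportable_rate)
        cell.ni = cell.ni or self.mild_congestion
        cell.ci = cell.ci or self.severe_congestion


def forward_rm_cell(cell, path, efci_on_preceding_data_cell=False):
    """Pass the RM cell through every switch, then apply the destination rule."""
    for switch in path:
        switch.process(cell)
    # Destination sets CI=1 if the data cell before the RM cell had EFCI=1.
    if efci_on_preceding_data_cell:
        cell.ci = True
    return cell


path = [Switch(100.0), Switch(40.0, mild_congestion=True), Switch(75.0)]
cell = forward_rm_cell(RMCell(er=150.0), path)
print(cell)
```

The returned ER is the minimum supportable rate along the path, which is the rate the sender would then adopt.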

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin spanning a window of segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = $\dfrac{w \cdot MSS}{RTT}$ bytes/sec
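A quick numeric check of the window/throughput relation, as a minimal sketch. The function name is hypothetical; the sample values (one 500-byte segment per 200 ms round trip) are the ones used in the slow-start example of these notes.

```python
def window_throughput(w_segments: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


rate = window_throughput(w_segments=1, mss_bytes=500, rtt_seconds=0.2)
print(f"{rate:.0f} bytes/sec = {rate * 8 / 1000:.0f} kbps")
```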

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
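The sender-side limit (LastByteSent - LastByteAcked ≤ CongWin) can be read as "how many more bytes may I put in flight right now". A minimal sketch, with a hypothetical helper name and made-up byte counts, ignoring the receive window as the notes do:

```python
def bytes_allowed(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """Bytes that may still be sent while keeping
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


# 2000 bytes are unacknowledged and the window is 4000 bytes.
print(bytes_allowed(last_byte_sent=3000, last_byte_acked=1000, cong_win=4000))
```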

TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window versus time for a long-lived TCP connection; y-axis ticks at 8, 16, and 24 Kbytes]
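The AIMD rule can be traced round by round. The sketch below is a simplification under stated assumptions: the window is counted in whole MSS units, halving uses integer division, and the rounds in which a loss occurs are supplied by hand rather than produced by a network model. The function name and parameters are hypothetical.

```python
def aimd_trace(start_mss, loss_rounds, rounds):
    """Congestion window (in MSS) at the start of each RTT under AIMD."""
    window = start_mss
    trace = []
    for rtt in range(rounds):
        trace.append(window)
        if rtt in loss_rounds:
            window = max(1, window // 2)   # multiplicative decrease
        else:
            window += 1                    # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(start_mss=8, loss_rounds={4, 9}, rounds=12))
```

Plotting the returned list against the round number gives the sawtooth shape associated with a long-lived connection.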

                                                                                                                                        TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                                                                                        TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments in successive RTTs]
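Why does a per-ACK increment double the window every RTT? A small sketch makes the bookkeeping explicit. It assumes no loss and one ACK per segment (no delayed ACKs), and counts the window in MSS units; the function name is hypothetical.

```python
def slow_start_rounds(rounds, mss=1):
    """CongWin (in MSS) at the start of each RTT, growing by 1 MSS per ACK."""
    cong_win = mss
    history = []
    for _ in range(rounds):
        history.append(cong_win)
        acks = cong_win // mss       # one ACK for each segment sent this RTT
        for _ in range(acks):
            cong_win += mss          # per-ACK increment
    return history


print(slow_start_rounds(4))  # prints [1, 2, 4, 8]
```

Each RTT, a window of n segments produces n ACKs, and each ACK adds one MSS, so the window goes from n to 2n.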

So far:
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce new variable: threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS. (TCP Tahoe does this for 3 dup ACKs as well.)
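The four summary rules can be combined into a per-RTT trace that contrasts Reno and Tahoe after a triple-duplicate-ACK loss. This is a rough sketch, not either protocol's real implementation: the window is in whole MSS units, slow-start growth is capped at Threshold (a simplification), halving uses integer division, and the loss round is chosen by hand. All names are hypothetical.

```python
def window_trace(rounds, threshold, loss_round, variant="reno"):
    """CongWin (in MSS) per RTT with one triple-duplicate-ACK loss.

    variant: "reno" resumes at Threshold; "tahoe" restarts from 1 MSS.
    """
    cong_win = 1
    trace = []
    for rtt in range(rounds):
        trace.append(cong_win)
        if rtt == loss_round:
            threshold = max(cong_win // 2, 1)
            cong_win = threshold if variant == "reno" else 1
        elif cong_win < threshold:
            cong_win = min(cong_win * 2, threshold)   # slow start
        else:
            cong_win += 1                             # congestion avoidance
    return trace


reno = window_trace(rounds=12, threshold=8, loss_round=7, variant="reno")
tahoe = window_trace(rounds=12, threshold=8, loss_round=7, variant="tahoe")
print("Reno: ", reno)
print("Tahoe:", tahoe)
```

Both traces are identical up to the loss; afterwards Reno continues linearly from the new Threshold while Tahoe climbs back from 1 MSS.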

                                                                                                                                        The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
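The event/state table maps naturally onto a small event-driven class. The sketch below follows the table row by row but is only an illustration: the class and method names are hypothetical, the MSS value and initial Threshold are made up, resetting the duplicate-ACK counter on a new ACK or timeout is an assumption, and the 1 MSS floor is applied to Threshold so that CongWin never drops below 1 MSS.

```python
MSS = 1460  # bytes; illustrative value, not taken from the notes


class RenoSender:
    """Toy sender that applies the table's actions to CongWin and Threshold."""

    def __init__(self, threshold=64 * 1024):
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance
        self.cong_win = MSS
        self.threshold = threshold
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0            # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * MSS / self.cong_win

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, MSS)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        self.dup_acks = 0
        self.threshold = max(self.cong_win / 2, MSS)
        self.cong_win = MSS
        self.state = "SS"


s = RenoSender(threshold=4 * MSS)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win / MSS)     # window in MSS after four ACKs
for _ in range(3):
    s.on_duplicate_ack()
print(s.state, s.cong_win / MSS)     # after the triple duplicate ACK
s.on_timeout()
print(s.state, s.cong_win / MSS)     # after a timeout
```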

                                                                                                                                        TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2·RTT).
Average throughput: 0.75 W/RTT
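The 0.75 factor follows from averaging over one sawtooth cycle. As a sketch, assuming the window climbs linearly from W/2 to W between losses, so that the time average is the midpoint of the two extremes:

```latex
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
```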

                                                                                                                                        TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate:

$\dfrac{1.22 \cdot MSS}{RTT \sqrt{L}}$

$L = 2 \cdot 10^{-10}$ (Wow!)
New versions of TCP for high-speed needed!
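The two numbers in this example can be reproduced with a few lines of arithmetic. The sketch assumes the relations throughput = W · MSS / RTT and throughput = 1.22 · MSS / (RTT · √L), with MSS expressed in bits so that it matches a rate in bits per second; the variable names are illustrative.

```python
MSS_BITS = 1500 * 8        # segment size from the example, in bits
RTT = 0.100                # seconds
TARGET_BPS = 10e9          # 10 Gbps

# Window needed: throughput = W * MSS / RTT  =>  W = throughput * RTT / MSS
window_segments = TARGET_BPS * RTT / MSS_BITS

# Loss rate needed: throughput = 1.22 * MSS / (RTT * sqrt(L)), solved for L
loss_rate = (1.22 * MSS_BITS / (RTT * TARGET_BPS)) ** 2

print(f"W = {window_segments:.0f} segments, L = {loss_rate:.1e}")
```

The loss rate comes out at roughly 2 in 10 billion segments, which is why the slide calls for new TCP versions for high-speed paths.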

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

                                                                                                                                        Why is TCP fairTwo competing sessions

                                                                                                                                        Additive increase gives slope of 1 as throughout increasesmultiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between two phases: "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
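The convergence argument above can be checked with a small simulation. This is only a sketch under simplifying assumptions that are not in the slides: both sessions see loss at the same moment (whenever their combined rate exceeds the link capacity), the additive step is 1 unit per round, and the starting rates are arbitrary illustrative values. The function name `aimd` is hypothetical.

```python
def aimd(x1, x2, capacity, steps, increase=1.0, factor=0.5):
    """Simulate two synchronized AIMD flows sharing one bottleneck link."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Loss: both flows cut their rate multiplicatively (by a factor of 2).
            x1 *= factor
            x2 *= factor
        else:
            # Congestion avoidance: both flows add the same increment,
            # which moves the operating point along a line of slope 1.
            x1 += increase
            x2 += increase
    return x1, x2


# Deliberately unequal starting rates (illustrative values only).
x1, x2 = aimd(x1=5.0, x2=80.0, capacity=100.0, steps=1000)
gap = abs(x1 - x2)
print(f"throughputs: {x1:.2f}, {x2:.2f}; gap: {gap:.6f}")
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the operating point drifts toward the equal bandwidth share line.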


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP:
  do not want rate throttled by congestion control
Instead use UDP:
  pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections:
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
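The arithmetic behind this example can be sketched as follows, assuming (as the fairness goal does) that every TCP connection on the link gets an equal share of R. The helper name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing, new):
    """Fraction of the link rate R obtained by an app that opens `new`
    connections next to `existing` ones, assuming equal per-connection shares."""
    return Fraction(new, existing + new)


one = app_share(existing=9, new=1)      # share with a single connection
eleven = app_share(existing=9, new=11)  # share with 11 parallel connections
print(one, eleven)
```

With 9 existing connections, one new connection yields 1/10 of R, while 11 new connections yield 11/20 of R, slightly more than half.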


TCP Latency Modeling

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)
Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
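The two cases can be folded into one small function. This is a sketch, not part of the slides: it assumes K is the number of windows needed to cover the object, K = ceil(O / (W S)), and the numeric values in the usage lines are made up for illustration.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments.

    O, S in bits; R in bits/second; RTT in seconds.
    """
    # Assumed definition: number of windows that cover the object.
    K = math.ceil(O / (W * S))
    # Time the server waits after each window before the first ACK arrives.
    stall = S / R + RTT - W * S / R
    latency = 2 * RTT + O / R
    if stall > 0:
        # Case 2: the server idles after each of the first K-1 windows.
        latency += (K - 1) * stall
    # Case 1 (stall <= 0): ACKs return in time, so the server never idles.
    return latency


# Illustrative values only: 100 kbit object, 1 kbit segments, 1 Mbps, 100 ms RTT.
small_window = fixed_window_latency(O=100_000, S=1_000, R=1_000_000, RTT=0.1, W=4)
large_window = fixed_window_latency(O=100_000, S=1_000, R=1_000_000, RTT=0.1, W=200)
print(small_window, large_window)
```

With W = 4 the server stalls after each of 24 windows; with W = 200 the window is large enough that the latency is just 2RTT + O/R.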


Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case:
WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q,\; K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes and one RTT marked. Events in order: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
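As a check on the example, the values can be worked through by hand. This sketch takes Q = 2 as given on the slide and uses the slow-start latency expression 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R:

\[
\begin{aligned}
2^0 + 2^1 + 2^2 = 7 &< 15 \le 2^0 + 2^1 + 2^2 + 2^3 = 15 \quad\Rightarrow\quad K = 4 \\
P &= \min\{K-1,\; Q\} = \min\{3,\; 2\} = 2 \\
\text{Latency} &= 2\,RTT + \frac{O}{R} + 2\left[RTT + \frac{S}{R}\right] - (2^{2} - 1)\frac{S}{R}
= 4\,RTT + \frac{O}{R} - \frac{S}{R}
\end{aligned}
\]

So in this example slow start adds 2RTT - S/R of idle time on top of the 2RTT + O/R that would be needed without any stalls.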


                                                                                                                                        TCP Latency Modeling (3)

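\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: slow-start timing diagram with "time at client" and "time at server" axes and one RTT marked. Events in order: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]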


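TCP Latency Modeling (4)
Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2(O/S + 1) \right\rceil
\end{aligned}
\]

Calculation of Q, number of idles for infinite-size object, is similar.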
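Putting K, Q and P together, the slow-start latency can be computed as in the sketch below. Two things here are assumptions rather than slide content: Q is counted directly from the idle-time expression S/R + RTT - 2^(k-1) S/R (counting windows for which it is strictly positive), and the numeric values are chosen only so that they reproduce the earlier example (O/S = 15, Q = 2). The function name `slow_start_latency` is hypothetical.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for slow start with no threshold and no loss.

    O, S in bits; R in bits/second; RTT in seconds.
    """
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K = math.ceil(math.log2(O / S + 1))
    # Q: number of idles for an infinitely large object. The server idles
    # after window k = Q + 1 as long as S/R + RTT - 2^(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    # P: number of times the server actually idles for this object.
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# Illustrative values: 15 segments of 1 kbit, 10 kbps link, 200 ms RTT.
K, Q, P, latency = slow_start_latency(O=15_000, S=1_000, R=10_000, RTT=0.2)
print(K, Q, P, latency)
```

For these values S/R = 0.1 s, so the server idles after the first and second windows only, and the latency equals 4RTT + O/R - S/R.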


HTTP Modeling
Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
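The three response-time expressions can be compared numerically. The sketch below deliberately leaves out the "sum of idle times" terms, so the results are lower bounds rather than the full model; it also assumes 1 Kbyte = 1000 bytes when converting the object size to bits. The function name `http_response_times` is hypothetical; the parameter values (RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5) are taken from the chart that follows.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, ignoring slow-start idle times.

    O in bits; R in bits/second; RTT in seconds; M images; X parallel connections.
    """
    transmit = (M + 1) * O / R  # time to push the base page plus M images
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }


O_BITS = 5 * 1000 * 8  # 5 Kbytes, assuming 1 Kbyte = 1000 bytes
for rate in (28_000, 100_000, 1_000_000, 10_000_000):
    times = http_response_times(O=O_BITS, R=rate, RTT=0.1, M=10, X=5)
    print(rate, {name: round(t, 3) for name, t in times.items()})

at_1mbps = http_response_times(O=O_BITS, R=1_000_000, RTT=0.1, M=10, X=5)
```

At low rates the shared (M+1)O/R term dominates all three results, while at high rates the differences come almost entirely from the RTT terms.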


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                        • Chapter 3 Transport Layer last revised 160305
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • Transport services and protocols
                                                                                                                                        • Transport vs network layer
                                                                                                                                        • Transport-layer protocols
                                                                                                                                        • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                        • How demultiplexing works
                                                                                                                                        • Connectionless demultiplexing
                                                                                                                                        • Connectionless demux (cont)
                                                                                                                                        • Connection-oriented demux
                                                                                                                                        • Connection-oriented demux (cont)
                                                                                                                                        • Connection-oriented demux Threaded Web Server
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • UDP User Datagram Protocol [RFC 768]
                                                                                                                                        • UDP more
                                                                                                                                        • UDP checksum
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • Principles of Reliable data transfer
                                                                                                                                        • Reliable data transfer getting started
                                                                                                                                        • Reliable data transfer getting started
                                                                                                                                        • Incremental Improvements
                                                                                                                                        • Rdt10 reliable transfer over a reliable channel
                                                                                                                                        • Rdt20 channel with bit errors
                                                                                                                                        • rdt20 FSM specification
                                                                                                                                        • rdt20 operation with no errors
                                                                                                                                        • rdt20 error scenario
                                                                                                                                        • rdt20 has a fatal flaw
                                                                                                                                        • rdt21 sender handles garbled ACKNAKs
                                                                                                                                        • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                        • rdt21 discussion
                                                                                                                                        • rdt22 a NAK-free protocol
                                                                                                                                        • rdt22 sender receiver fragments
                                                                                                                                        • rdt30 channels with errors and loss
                                                                                                                                        • rdt30 sender
                                                                                                                                        • rdt30 in action
                                                                                                                                        • rdt30 in action
                                                                                                                                        • Performance of rdt30
                                                                                                                                        • rdt30 stop-and-wait operation
                                                                                                                                        • Pipelined protocols
                                                                                                                                        • Pipelined protocols
                                                                                                                                        • Pipelining increased utilization
                                                                                                                                        • Go-Back-N
                                                                                                                                        • GBN Sender
                                                                                                                                        • GBN sender extended FSM
                                                                                                                                        • GBN receiver extended FSM
                                                                                                                                        • More on receiver
                                                                                                                                        • GBN inaction
                                                                                                                                        • Selective Repeat
                                                                                                                                        • Selective repeat sender receiver windows
                                                                                                                                        • Selective repeat
                                                                                                                                        • Selective repeat in action
                                                                                                                                        • Selective repeat dilemma
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                        • More TCP Details
                                                                                                                                        • Even More TCP Details
                                                                                                                                        • TCP segment structure
                                                                                                                                        • TCP seq rsquos and ACKs
                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                        • Example RTT estimation
                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • TCP reliable data transfer
                                                                                                                                        • TCP sender events
                                                                                                                                        • TCP sender(simplified)
                                                                                                                                        • TCP retransmission scenarios
                                                                                                                                        • TCP retransmission scenarios (more)
                                                                                                                                        • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                        • More on Sender Policies
                                                                                                                                        • Fast Retransmit
                                                                                                                                        • Fast retransmit algorithm
                                                                                                                                        • TCP GBN or Selective Repeat
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • TCP Flow Control
                                                                                                                                        • TCP Flow Control
                                                                                                                                        • TCP segment structure
                                                                                                                                        • TCP Flow control how it works
                                                                                                                                        • Technical Issue
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • TCP Connection Management
                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                        • A few special cases
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • Principles of Congestion Control
                                                                                                                                        • Causescosts of congestion scenario 1
                                                                                                                                        • Causescosts of congestion scenario 2
                                                                                                                                        • Causescosts of congestion scenario 3
                                                                                                                                        • Causescosts of congestion scenario 3
                                                                                                                                        • Approaches towards congestion control
                                                                                                                                        • Case study ATM ABR congestion control
                                                                                                                                        • Case study ATM ABR congestion control
                                                                                                                                        • Chapter 3 outline
                                                                                                                                        • TCP Congestion Control
                                                                                                                                        • TCP AIMD
                                                                                                                                        • TCP Slow Start
                                                                                                                                        • TCP Slow Start (more)
                                                                                                                                        • Summary TCP Congestion Control
                                                                                                                                        • The Big Picture
                                                                                                                                        • TCP sender congestion control
                                                                                                                                        • TCP throughput
                                                                                                                                        • TCP Futures
                                                                                                                                        • TCP Fairness
                                                                                                                                        • Why is TCP fair
                                                                                                                                        • Fairness (more)
                                                                                                                                        • TCP Latency Modeling
                                                                                                                                        • Fixed Congestion Window (W)
                                                                                                                                        • Fixed congestion window (1)
                                                                                                                                        • Fixed congestion window (2)
                                                                                                                                        • TCP Latency Modeling Slow Start (1)
                                                                                                                                        • TCP Latency Modeling Slow Start (2)
                                                                                                                                        • TCP Latency Modeling (3)
                                                                                                                                        • TCP Latency Modeling (4)
                                                                                                                                        • HTTP Modeling
                                                                                                                                        • Chapter 3 Summary


TCP sender events

data rcvd from app:
  create segment with seq #
  seq # is the byte-stream number of the first data byte in the segment
  start timer if not already running (think of the timer as for the oldest unacked segment)
  expiration interval: TimeOutInterval

timeout:
  retransmit segment that caused timeout
  restart timer

ACK rcvd:
  if it acknowledges previously unacked segments:
    update what is known to be acked
    start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch (event)

    event: data received from application above
      create TCP segment with sequence number NextSeqNum
      if (timer currently not running)
        start timer
      pass segment to IP
      NextSeqNum = NextSeqNum + length(data)

    event: timer timeout
      retransmit not-yet-acknowledged segment with smallest sequence number
      start timer

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
} /* end of loop forever */

Comment:
  SendBase-1: last cumulatively ACKed byte
Example:
  SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so new data is ACKed
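
The event loop above can be sketched in Python as follows. This is only an illustrative sketch, not a real TCP implementation: the class name, the `unacked` dictionary, and the `sent` list (a stand-in for "pass segment to IP") are assumptions made for the example, and the timer is modelled as a simple flag rather than a real clock.

```python
class SimplifiedTcpSender:
    """Sketch of the simplified sender: one timer, cumulative ACKs."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.timer_running = False   # hypothetical flag standing in for a real timer
        self.unacked = {}            # seq number -> payload of not-yet-acked segments
        self.sent = []               # log of (seq, payload) "passed to IP"

    def on_app_data(self, data):
        # event: data received from application above
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, data))
        self.next_seq_num += len(data)

    def on_timeout(self):
        # event: timer timeout -> retransmit the oldest unacked segment
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True    # restart timer

    def on_ack(self, y):
        # event: ACK received with ACK field value y (next byte expected)
        if y > self.send_base:
            self.send_base = y
            # drop every segment whose last byte is below y
            for seq in [s for s in self.unacked
                        if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            # timer keeps running only if something is still outstanding
            self.timer_running = bool(self.unacked)
```

With an initial sequence number of 92, sending 8 bytes and then 20 bytes leaves `next_seq_num` at 120; a single `on_ack(120)` then acknowledges both segments, which mirrors the cumulative-ACK behaviour the slides describe.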


TCP retransmission scenarios

[Figure: two timing diagrams between Host A and Host B.
 Lost ACK scenario: A sends Seq=92, 8 bytes data; B's ACK=100 is lost (X); the Seq=92 timer times out and A retransmits Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.
 Premature timeout scenario: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B replies ACK=100 and ACK=120; the Seq=92 timer times out before the ACKs arrive, so A retransmits Seq=92, 8 bytes data; B replies ACK=120; SendBase = 100, then 120.]


TCP retransmission scenarios (more)

[Figure: timing diagram, cumulative ACK scenario. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B's ACK=100 is lost (X); ACK=120 arrives before the timeout, so SendBase = 120 and nothing is retransmitted.]


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed.
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending.
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected.
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap.
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.
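
The four event/action pairs above can be sketched as a small receiver model. This is a rough sketch under stated assumptions: the class and method names are invented for the example, segments are assumed not to overlap, the 500 ms wait is represented by an explicit `on_delayed_ack_timer` call instead of a real clock, and the handling of an old duplicate segment (below the expected sequence number) is an assumption, since the table does not cover that case.

```python
class AckGeneratingReceiver:
    """Sketch of the receiver-side ACK policy; not a full TCP receiver."""

    def __init__(self, expected_seq):
        self.expected_seq = expected_seq  # next in-order byte expected
        self.ack_pending = False          # an in-order segment awaits a delayed ACK
        self.buffered = {}                # out-of-order segments: seq -> length

    def on_segment(self, seq, length):
        """Return (action, ack_number) for an arriving segment."""
        if seq > self.expected_seq:
            # out-of-order segment: gap detected, buffer it and re-ACK
            self.buffered[seq] = length
            self.ack_pending = False
            return ("immediate duplicate ACK", self.expected_seq)
        if seq < self.expected_seq:
            # assumption (not in the table): re-ACK an old duplicate
            self.ack_pending = False
            return ("immediate duplicate ACK", self.expected_seq)

        # segment starts exactly at the expected byte
        gap_existed = bool(self.buffered)
        self.expected_seq += length
        # absorb buffered segments that are now contiguous
        while self.expected_seq in self.buffered:
            self.expected_seq += self.buffered.pop(self.expected_seq)

        if gap_existed:
            self.ack_pending = False
            return ("immediate ACK", self.expected_seq)
        if self.ack_pending:
            self.ack_pending = False
            return ("immediate cumulative ACK", self.expected_seq)
        self.ack_pending = True
        return ("delayed ACK", self.expected_seq)

    def on_delayed_ack_timer(self):
        """Called when the (up to 500 ms) delayed-ACK wait expires."""
        if self.ack_pending:
            self.ack_pending = False
            return ("send ACK", self.expected_seq)
        return None
```

Starting from an expected sequence number of 92, an 8-byte segment at 92 yields a delayed ACK, and a following 20-byte segment at 100 triggers one cumulative ACK for 120 that covers both.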


More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations.
  If a timeout occurs, then after retransmission the TimeoutInterval is doubled.
  Intervals grow exponentially with each consecutive timeout.
  When the timer is restarted because of (i) new data from above or (ii) an ACK received, the TimeoutInterval is reset as described previously, using EstimatedRTT and DevRTT.
  A limited form of congestion control.
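
A minimal sketch of this doubling policy follows. It assumes the usual formula TimeoutInterval = EstimatedRTT + 4 * DevRTT for the reset value; the class and method names are hypothetical, and the RTT estimates are passed in rather than measured.

```python
class RetransmissionTimer:
    """Sketch of timeout doubling; RTT estimation itself is not modelled."""

    def __init__(self, estimated_rtt, dev_rtt):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.timeout_interval = self._base_interval()

    def _base_interval(self):
        # assumed reset formula: EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self):
        # after retransmitting, double the interval (exponential growth)
        self.timeout_interval *= 2
        return self.timeout_interval

    def on_restart(self):
        # timer restarted by new data from above or by a received ACK:
        # fall back to the value derived from the RTT estimates
        self.timeout_interval = self._base_interval()
        return self.timeout_interval
```

With EstimatedRTT = 1.0 s and DevRTT = 0.25 s the interval starts at 2.0 s, becomes 4.0 s and then 8.0 s after two consecutive timeouts, and returns to 2.0 s once an ACK or new data restarts the timer.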

                                                                                                                                          Fast Retransmit

                                                                                                                                          Time-out period often relatively long

                                                                                                                                          long delay before resending lost packet

                                                                                                                                          Detect lost segments via duplicate ACKs

Sender often sends many segments back-to-back
If segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost

fast retransmit: resend segment before timer expires

                                                                                                                                          Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3)
            resend segment with sequence number y    /* fast retransmit */
    }
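
A runnable Python sketch of the same ACK-handling event follows. The class, the `next_seq_num` field, and the `retransmitted` log are hypothetical names introduced for illustration; resetting the duplicate counter when new data is ACKed is an assumption that mirrors "count of dup ACKs received for y".

```python
class FastRetransmitSender:
    """Sender-side ACK handling with fast retransmit (illustrative only)."""

    def __init__(self, send_base, next_seq_num):
        self.send_base = send_base        # oldest unACKed byte
        self.next_seq_num = next_seq_num  # next byte to be sent
        self.dup_ack_count = 0            # duplicate ACKs seen for send_base
        self.timer_running = True
        self.retransmitted = []           # seq numbers resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data ACKed: advance the window and restart the timer
            # only if something is still outstanding.
            self.send_base = y
            self.dup_ack_count = 0
            self.timer_running = self.send_base < self.next_seq_num
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_ack_count += 1
            if self.dup_ack_count == 3:
                self.retransmitted.append(y)  # fast retransmit


# Five 100-byte segments (seq 100..500) are in flight; segment 200 is lost.
sender = FastRetransmitSender(send_base=100, next_seq_num=600)
for ack in (200, 200, 200, 200):  # one new ACK, then three duplicates
    sender.on_ack(ack)
resent_after_dups = list(sender.retransmitted)
sender.on_ack(600)                # retransmission fills the gap
```

After the third duplicate ACK the sender resends the segment starting at byte 200 without waiting for the timer; the final cumulative ACK of 600 leaves nothing outstanding, so the timer stops.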

TCP: GBN or Selective Repeat?

                                                                                                                                          Basic TCP looks a lot like GBN

                                                                                                                                          Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                                          This looks a lot like Selective Repeat

                                                                                                                                          TCP is a hybrid
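
The hybrid behaviour can be illustrated with a small Python receiver that buffers out-of-order segments (as in Selective Repeat) but only ever returns a cumulative ACK (as in GBN). The class name and fields are hypothetical, and the sketch assumes segments never partially overlap.

```python
class BufferingReceiver:
    """Buffers out-of-order segments and answers with cumulative ACKs."""

    def __init__(self, expected_seq=0):
        self.expected_seq = expected_seq  # next in-order byte expected
        self.out_of_order = {}            # seq -> payload waiting for a gap to fill
        self.delivered = bytearray()      # in-order data handed to the app

    def on_segment(self, seq, data):
        if seq == self.expected_seq:
            self.delivered += data
            self.expected_seq += len(data)
            # Drain any buffered segments that are now in order.
            while self.expected_seq in self.out_of_order:
                buffered = self.out_of_order.pop(self.expected_seq)
                self.delivered += buffered
                self.expected_seq += len(buffered)
        elif seq > self.expected_seq:
            self.out_of_order[seq] = data  # gap detected: keep for later
        # seq < expected_seq is a duplicate of delivered data: payload ignored.
        return self.expected_seq           # cumulative ACK value


rx = BufferingReceiver()
acks = [
    rx.on_segment(0, b"aaaa"),
    rx.on_segment(8, b"cccc"),   # out of order: duplicate ACK for byte 4
    rx.on_segment(12, b"dddd"),  # still a gap at byte 4
    rx.on_segment(4, b"bbbb"),   # fills the gap: ACK jumps to 16
]
```

The ACK sequence is 4, 4, 4, 16: the two out-of-order arrivals produce duplicate ACKs, and filling the gap acknowledges the whole buffered range at once.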

                                                                                                                                          TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

receive side of TCP connection has a receive buffer

app process may be slow at reading from buffer

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

speed-matching service: matching the send rate to the receiving app's drain rate

TCP segment structure

[Figure: TCP segment layout, each header row 32 bits wide]

source port #, dest port #
sequence number, acknowledgement number: counting by bytes of data (not segments)
head len, not used, flag bits U A P R S F
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
Urg data pointer
Options (variable length)
application data (variable length)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments

Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
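
As a worked example of the window arithmetic, here is a short Python sketch. The receiver-side formula is the one on the slide; the sender-side check uses `last_byte_sent` and `last_byte_acked`, which are assumed names for the sender's bookkeeping and do not appear above.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window


# Receiver: 4096-byte buffer, 3000 bytes received so far, app has read 1000.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender: 1000 bytes already in flight against that advertised window.
fits = sender_may_send(5000, 4000, window, nbytes=1000)
too_much = sender_may_send(5000, 4000, window, nbytes=1100)
```

With 2000 bytes still sitting in the buffer, the advertised window is 2096 bytes, so another 1000 bytes fit but 1100 would exceed the limit.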

                                                                                                                                          Technical Issue

Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to sender
DEADLOCK!

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender
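
The deadlock and its fix can be seen in a toy Python simulation. Everything here is a simplifying assumption: time advances in discrete ticks, the app reads a given number of bytes per tick, and one probe is sent per tick when probing is enabled.

```python
def ticks_until_window_opens(rcv_buffer, buffered, app_reads, probing):
    """Return the tick at which the sender learns RcvWindow > 0, or None.

    app_reads lists how many bytes the receiving app reads in each tick.
    Without probing the receiver has nothing new to ACK, so the sender
    never hears about the reopened window.
    """
    for tick, nread in enumerate(app_reads):
        buffered -= min(nread, buffered)   # app drains the buffer
        window = rcv_buffer - buffered     # current spare room
        if probing and window > 0:
            # The ACK triggered by the one-byte probe carries the new window.
            return tick
    return None


with_probe = ticks_until_window_opens(1000, 1000, [0, 0, 300], probing=True)
without_probe = ticks_until_window_opens(1000, 1000, [0, 0, 300], probing=False)
```

With probing, the sender learns of the open window in the tick where the app first reads data; without it, the function returns `None`, which models the deadlock.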

                                                                                                                                          Note on UDP

                                                                                                                                          UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full, then packets are lost
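
The drop-when-full behaviour can be sketched with a bounded queue in Python. The class is hypothetical, and measuring capacity in datagrams rather than bytes is a simplification made for the example.

```python
from collections import deque


class DatagramBuffer:
    """Receive buffer with no flow control: arrivals are dropped when full."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed unit: number of datagrams
        self.queue = deque()
        self.dropped = 0

    def deliver(self, datagram):
        # Called when a datagram arrives from the network.
        if len(self.queue) >= self.capacity:
            self.dropped += 1     # buffer full: the datagram is lost
            return False
        self.queue.append(datagram)
        return True

    def read(self):
        # Called by the application; returns None if nothing is queued.
        return self.queue.popleft() if self.queue else None


buf = DatagramBuffer(capacity=2)
results = [buf.deliver(d) for d in (b"a", b"b", b"c")]
first = buf.read()                # frees one slot
accepted_later = buf.deliver(b"d")
```

The third datagram is lost because nothing tells the sender to slow down; only after the app reads from the buffer is there room for another arrival.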

                                                                                                                                          TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq. #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

                                                                                                                                          TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
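
The sequence and acknowledgement numbers exchanged in the three steps can be traced with a short Python sketch. The function and the dictionary layout are hypothetical; the modulo reflects the 32-bit width of the sequence number field.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three control segments as simple dictionaries."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {
        "SYN": 1,
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % SEQ_MODULUS,     # ACKs the client's SYN
    }
    ack = {
        "SYN": 0,
        "seq": (client_isn + 1) % SEQ_MODULUS,
        "ack": (synack["seq"] + 1) % SEQ_MODULUS,  # ACKs the server's SYN
    }
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=42, server_isn=79)
```

For these example initial sequence numbers, the SYNACK carries ack=43 and the final ACK carries seq=43, ack=80.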

                                                                                                                                          TCP Connection Management (cont)

                                                                                                                                          Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN

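[Figure: client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait before the connection is closed]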

                                                                                                                                          TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK

Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)

Closes down after timed wait

Step 4: server receives ACK. Connection closed

Note: with small modification, can handle simultaneous FINs

[Figure: same FIN/ACK exchange; client and server pass through "closing", the client enters timed wait, and both ends finish in "closed"]
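
The four closing steps can be written as two small state tables in Python. The state names (FIN_WAIT_1, CLOSE_WAIT, and so on) are borrowed from the TCP specification rather than from the slide text, and the event strings and `run` helper are hypothetical. Simultaneous FINs are not handled.

```python
# (state, event) -> (next state, segment sent or None)
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),   # Step 1
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "ACK"),  # Step 3
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "ACK"),   # re-ACK a repeated FIN
    ("TIME_WAIT", "timeout"): ("CLOSED", None),
}

SERVER_TRANSITIONS = {
    ("ESTABLISHED", "recv FIN"): ("CLOSE_WAIT", "ACK"),  # Step 2
    ("CLOSE_WAIT", "close"): ("LAST_ACK", "FIN"),
    ("LAST_ACK", "recv ACK"): ("CLOSED", None),          # Step 4
}


def run(transitions, events, state="ESTABLISHED"):
    """Apply events in order; return the final state and segments sent."""
    sent = []
    for event in events:
        state, segment = transitions[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent


client_result = run(
    CLIENT_TRANSITIONS,
    ["close", "recv ACK", "recv FIN", "recv FIN", "timeout"],
)
server_result = run(SERVER_TRANSITIONS, ["recv FIN", "close", "recv ACK"])
```

The client trace includes a second FIN arriving during timed wait, which is answered with another ACK; this is the case the timed wait exists to cover.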

                                                                                                                                          TCP Connection Management (cont)

[Figure: Example TCP server lifecycle]

[Figure: Example TCP client lifecycle]

                                                                                                                                          A few special cases

                                                                                                                                          Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                                          It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment

                                                                                                                                          Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queueing in router buffers)

                                                                                                                                          a top-10 problem

Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput
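
A rough numerical illustration of scenario 1 in Python follows. The throughput cap of C/2 per connection comes from the slide; the delay curve is an assumption, using the textbook M/M/1 mean-delay formula 1 / (service rate - arrival rate) purely to show how delay blows up as the send rate approaches C/2. Function names are hypothetical.

```python
def per_connection_throughput(lambda_in, capacity):
    """Each of the two senders gets at most half of the link capacity C."""
    return min(lambda_in, capacity / 2)


def mm1_mean_delay(lambda_in, capacity):
    """Illustrative M/M/1 delay for one sender's share of the link.

    This queueing model is an assumption made for the example, not
    something the slide specifies.
    """
    share = capacity / 2
    if lambda_in >= share:
        return float("inf")   # queue grows without bound
    return 1.0 / (share - lambda_in)


C = 10
light = mm1_mean_delay(4, C)        # well below C/2
heavy = mm1_mean_delay(4.9, C)      # close to C/2
saturated = mm1_mean_delay(5, C)    # at C/2
```

Throughput can never exceed C/2 per connection however fast the sender transmits, while under this model the delay rises sharply as the send rate nears that limit.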

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(λin: original data; λ'in: original data plus retransmissions)

(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions, link carries multiple copies of pkt

[Figure: panels (a), (b), (c) plotting λout against λ'in]

Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when a packet is dropped, any upstream transmission capacity used for that packet was wasted


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
• single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
• explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
• NI bit: no increase in rate (mild congestion)
• CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
• small exception: see next page

ABR: available bit rate:
"elastic service"
if sender's path "underloaded": sender should use available bandwidth
if sender's path congested: sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
congested switch may lower ER value in cell
sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
signals congestion
if data cell preceding RM cell has EFCI = 1, destination sets CI bit = 1 before returning RM cell to source




TCP Congestion Control
end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
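The window/throughput relation can be checked with a few lines of Python. This is a minimal sketch: the function name and the sample numbers (10 segments, 1460-byte MSS, 100 ms RTT) are illustrative assumptions, not values from the slides.

```python
def window_throughput(w_segments: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Approximate sending rate in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical numbers chosen only for illustration.
rate = window_throughput(w_segments=10, mss_bytes=1460, rtt_seconds=0.1)
print(f"{rate:.0f} bytes/sec")
```

Doubling the window doubles the rate, while doubling the RTT halves it, which is why CongWin is the knob TCP turns to control its sending rate.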


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent − LastByteAcked ≤ CongWin

How does sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase, Multiplicative Decrease
slow start = CongWin set to 1 and then grows exponentially
conservative after timeout events


TCP AIMD
multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection, showing the AIMD sawtooth]
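The sawtooth can be reproduced with a small simulation. This is only a sketch under simplifying assumptions: the window is tracked in units of MSS, it changes once per RTT, and the RTTs in which a loss event occurs are supplied by hand (`loss_rtts` is a hypothetical input, not something TCP knows in advance).

```python
def aimd_trace(start_mss, loss_rtts, n_rtts):
    """Return CongWin (in MSS) at the start of each RTT under pure AIMD."""
    cong_win = float(start_mss)
    trace = []
    for rtt in range(n_rtts):
        trace.append(cong_win)
        if rtt in loss_rtts:
            cong_win = max(1.0, cong_win / 2)  # multiplicative decrease: halve the window
        else:
            cong_win += 1.0                    # additive increase: +1 MSS per RTT
    return trace


trace = aimd_trace(start_mss=8, loss_rtts={8, 20}, n_rtts=24)
print(trace)
```

Plotting `trace` against its index gives the linear climbs and sudden halvings of the sawtooth.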


TCP Slow Start

When connection begins, CongWin = 1 MSS
Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
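The 20 kbps figure in the example follows directly from one MSS per RTT. A minimal sketch of the arithmetic (the function name is an illustrative assumption):

```python
def initial_rate_bps(mss_bytes: int, rtt_seconds: float) -> float:
    """Rate in bits/sec when exactly one MSS is sent per RTT (CongWin = 1 MSS)."""
    return mss_bytes * 8 / rtt_seconds


# Values from the slide: MSS = 500 bytes, RTT = 200 msec.
rate = initial_rate_bps(mss_bytes=500, rtt_seconds=0.2)
print(f"{rate / 1000:.0f} kbps")
```

Here 500 bytes is 4000 bits, and 4000 bits every 0.2 s is 20,000 bits/sec.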


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment in the first RTT, two segments in the second, and four segments in the third]
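Why a per-ACK increment doubles the window each RTT can be seen in a short sketch. It assumes an idealized round structure: every segment sent in a round is acknowledged individually within that round, and there is no loss and no threshold.

```python
def slow_start_rounds(n_rounds, mss=1):
    """Return CongWin (in units of MSS) at the start of each RTT round."""
    cong_win = mss
    windows = []
    for _ in range(n_rounds):
        windows.append(cong_win)
        acks = cong_win // mss      # assumption: one ACK per segment sent this round
        for _ in range(acks):
            cong_win += mss         # CongWin grows by 1 MSS for every ACK received
    return windows


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```

A window of w segments produces w ACKs, each adding one MSS, so the next round starts with 2w segments.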


So far:
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno):
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and sender then switches to AIMD
Two different types of loss events:
• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
• Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treats both types of loss event the same way and always goes to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                                                                                                                                          The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
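The event table maps naturally onto a small state machine. The sketch below is an illustration of the table only, not a real TCP implementation: the class name `RenoSender`, its method names, the default values (including the stand-in for a "very large" initial threshold), and the choice to floor Threshold at 1 MSS so that CongWin never drops below 1 MSS are all assumptions.

```python
from dataclasses import dataclass

SLOW_START = "SS"
CONGESTION_AVOIDANCE = "CA"


@dataclass
class RenoSender:
    mss: int = 1000
    cong_win: float = 1000.0
    threshold: float = 64000.0   # stand-in for "initially very large" (assumed value)
    state: str = SLOW_START
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == SLOW_START:
            self.cong_win += self.mss                 # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = CONGESTION_AVOIDANCE
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self._halve_threshold()
            self.cong_win = self.threshold            # multiplicative decrease
            self.state = CONGESTION_AVOIDANCE

    def on_timeout(self):
        self._halve_threshold()
        self.cong_win = float(self.mss)               # back to 1 MSS
        self.state = SLOW_START
        self.dup_acks = 0

    def _halve_threshold(self):
        # Floor at 1 MSS so CongWin never falls below 1 MSS.
        self.threshold = max(self.cong_win / 2, float(self.mss))


sender = RenoSender()
for _ in range(3):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```

Driving the object with a sequence of `on_new_ack`, `on_duplicate_ack` and `on_timeout` calls traces the slow-start ramp, the linear growth and the two kinds of loss reaction listed in the table.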


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2·RTT)
Average throughput: 0.75 W/RTT
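The 0.75 factor is the mean of a window that climbs linearly from W/2 to W. A quick numerical check, assuming an idealized sawtooth sampled at evenly spaced points (the function name and sample values are illustrative):

```python
def average_sawtooth_throughput(w_segments, rtt_seconds, steps=1000):
    """Average of window/RTT while the window ramps linearly from W/2 up to W."""
    half = w_segments / 2
    samples = [half + half * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples) / rtt_seconds


avg = average_sawtooth_throughput(w_segments=100, rtt_seconds=0.1)
print(avg, 0.75 * 100 / 0.1)
```

Both printed values agree up to floating-point rounding, since the mean of a linear ramp is its midpoint, (W/2 + W)/2 = 0.75 W.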


TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

➜ $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed
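Both numbers on this slide can be reproduced from the formulas. The sketch below takes MSS in bytes and converts to bits; the function names are illustrative assumptions.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = required_window(throughput_bps=10e9, rtt_s=0.1, mss_bytes=1500)
loss = required_loss_rate(throughput_bps=10e9, rtt_s=0.1, mss_bytes=1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

The loss rate comes out at roughly 2 in every 10 billion segments, which is the "Wow" on the slide.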


TCP Fairness
Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?
Two competing sessions:
Additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
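The convergence toward the equal-share line can be simulated. This is a sketch under strong assumptions: both flows have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units. The function name and the starting rates are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, n_rtts):
    """Evolve two AIMD flows sharing one bottleneck; return their final rates."""
    for _ in range(n_rtts):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # loss: both halve, so the gap between them halves
        else:
            x1, x2 = x1 + 1, x2 + 1    # additive increase: the gap stays the same
    return x1, x2


x1, x2 = aimd_two_flows(x1=10.0, x2=70.0, capacity=100.0, n_rtts=500)
print(round(x1, 3), round(x2, 3))
```

Additive increase never changes the difference between the two rates, while every multiplicative decrease halves it, so the flows end up with nearly equal shares no matter where they start.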


Fairness (more)

Fairness and UDP:
Multimedia apps often do not use TCP
• do not want rate throttled by congestion control
Instead use UDP:
• pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections:
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
• new app asks for 1 TCP, gets rate R/10
• new app asks for 11 TCPs, gets R/2
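The R/10 and R/2 figures follow from giving every connection an equal share of the link. A minimal sketch of that arithmetic, assuming perfectly fair per-connection sharing (the function name is illustrative):

```python
from fractions import Fraction


def app_share(app_connections, other_connections):
    """Fraction of link rate R an app receives if all connections share equally."""
    return Fraction(app_connections, app_connections + other_connections)


print(app_share(1, 9))    # 1/10
print(app_share(11, 9))   # 11/20
```

With 11 of the 20 connections the new app gets 11/20 of R, slightly more than the R/2 quoted on the slide.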


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)
Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K−1)[S/R + RTT − WS/R]


Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: $WS/R < \mathrm{RTT} + S/R$: server waits for ACK after sending window's worth of data

latency = $2\,\mathrm{RTT} + O/R + (K-1)\,[S/R + \mathrm{RTT} - WS/R]$
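The two fixed-window cases can be folded into one small function. This is a minimal sketch, not part of the original slides: the function name and argument units (bits, bits/second, seconds) are hypothetical, and since K is not defined on these slides the code assumes K = ceil(O / (W·S)), the number of windows needed to cover the object.

```python
import math


def static_window_latency(O, S, R, RTT, W):
    """Latency in seconds to fetch an O-bit object with a fixed window of W segments.

    O: object size (bits), S: segment size (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size in segments.
    """
    # Assumption: K is the number of windows that cover the object.
    K = math.ceil(O / (W * S))
    base = 2 * RTT + O / R
    # Idle time after each window; it is positive only in the second case,
    # where WS/R < RTT + S/R and the server must wait for the ACK.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        return base  # first case: ACKs return before the window is exhausted
    return base + (K - 1) * stall  # second case: K-1 stalls between windows


# First case: large window, no stalls.
large_window = static_window_latency(100_000, 1000, 1_000_000, 0.1, 200)
# Second case: small window, the server stalls after each of the first K-1 windows.
small_window = static_window_latency(100_000, 1000, 1_000_000, 0.1, 4)
```

With the small window the stall term dominates: the object itself takes only O/R = 0.1 s to transmit, but each of the 24 stalls costs almost a full RTT.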

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$
\text{Latency} = 2\,\mathrm{RTT} + \frac{O}{R} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
$$

where P is the number of times TCP idles at server: $P = \min\{Q, K-1\}$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: client/server timing diagram: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered; axes: time at client, time at server]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
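The example can be checked by adding up the three delay components by hand. This is a sketch under one assumption that the slide does not state: S/R ≤ RTT < 3S/R, chosen here so that the server idles after exactly the first two windows (Q = 2).

```latex
\begin{aligned}
\text{idle after window 1} &= \frac{S}{R} + \mathrm{RTT} - 2^{0}\frac{S}{R} = \mathrm{RTT} \\
\text{idle after window 2} &= \frac{S}{R} + \mathrm{RTT} - 2^{1}\frac{S}{R} = \mathrm{RTT} - \frac{S}{R} \\
\text{Latency} &= 2\,\mathrm{RTT} + \frac{O}{R} + \mathrm{RTT} + \left(\mathrm{RTT} - \frac{S}{R}\right)
  = 4\,\mathrm{RTT} + 15\frac{S}{R} - \frac{S}{R}
  = 4\,\mathrm{RTT} + 14\frac{S}{R}
\end{aligned}
```

The closed form with P = 2 gives the same value: 2RTT + 15S/R + 2[RTT + S/R] - (2² - 1)S/R = 4RTT + 14S/R.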

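TCP Latency Modeling (3)

$\frac{S}{R} + \mathrm{RTT}$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{k=1}^{P}\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timing diagram: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered; axes: time at client, time at server]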
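The derivation can be checked numerically by summing the idle times directly and comparing the result with the closed form. The sketch below is illustrative only: the function names are hypothetical, sizes are assumed to be integer bit counts, and P is counted as the number of strictly positive idle periods among the first K-1 windows, which gives the same delay as P = min{Q, K-1} because an idle period of length zero adds nothing.

```python
def windows_to_cover(O, S):
    """Smallest K such that S*(2^0 + 2^1 + ... + 2^(K-1)) = (2^K - 1)*S >= O."""
    K = 1
    while (2**K - 1) * S < O:
        K += 1
    return K


def slow_start_latency(O, S, R, RTT):
    """Return (direct, closed): the delay computed two ways, in seconds.

    O: object size (bits), S: segment size (bits), R: rate (bits/s), RTT: seconds.
    """
    K = windows_to_cover(O, S)
    # The server can idle only after windows 1..K-1; the [x]^+ on the slide
    # is the max(0, x) here.
    idle = [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)]
    direct = 2 * RTT + O / R + sum(idle)
    # Idle times shrink as k grows, so the positive ones are the first P.
    P = sum(1 for t in idle if t > 0)
    closed = 2 * RTT + O / R + P * (RTT + S / R) - (2**P - 1) * S / R
    return direct, closed


# O/S = 15 segments and S/R <= RTT < 3S/R, so K = 4 and the server idles twice.
direct, closed = slow_start_latency(15_000, 1000, 1_000_000, 0.002)
```

Computing K with an integer loop instead of a floating-point logarithm avoids rounding trouble when O/S + 1 is an exact power of two.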

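TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.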
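The slide does not spell out the calculation of Q. One way to carry it out, following the same pattern as for K, is sketched below. It starts from the idle-time expression S/R + RTT - 2^{k-1}S/R and assumes that an idle period of length exactly zero still counts as an idle (the "≥" convention); that convention is an assumption made here and does not change the resulting latency.

```latex
\begin{aligned}
Q &= \max\left\{k : \mathrm{RTT} + \frac{S}{R} - 2^{k-1}\frac{S}{R} \ge 0\right\} \\
  &= \max\left\{k : 2^{k-1} \le 1 + \frac{\mathrm{RTT}}{S/R}\right\} \\
  &= \max\left\{k : k \le \log_2\left(1 + \frac{\mathrm{RTT}}{S/R}\right) + 1\right\} \\
  &= \left\lfloor \log_2\left(1 + \frac{\mathrm{RTT}}{S/R}\right) \right\rfloor + 1
\end{aligned}
```

For example, Q = 2 corresponds to 2 ≤ 1 + RTT/(S/R) < 4, that is, S/R ≤ RTT < 3S/R.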

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
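The three response-time formulas translate directly into code. The sketch below is illustrative: the function names are hypothetical, and the slow-start idle time is left as a caller-supplied parameter `idle` (default 0) rather than modelled, so with the default the results are lower bounds and not the full response times. The sample values assume 1 Kbyte = 1000 bytes, so O = 5 Kbytes = 40,000 bits.

```python
def non_persistent(O, R, RTT, M, idle=0.0):
    """M+1 TCP connections in series, each costing 2 RTT."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(O, R, RTT, M, idle=0.0):
    """2 RTT for the base HTML file, then 1 RTT for all M images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(O, R, RTT, M, X, idle=0.0):
    """1 connection for the base file, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("the formula assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


# Sample values: RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, R = 1 Mbps.
O_BITS, RATE, RTT_S, M_IMAGES, X_PARALLEL = 40_000, 1_000_000, 0.1, 10, 5
t_non_persistent = non_persistent(O_BITS, RATE, RTT_S, M_IMAGES)
t_persistent = persistent(O_BITS, RATE, RTT_S, M_IMAGES)
t_parallel = parallel_non_persistent(O_BITS, RATE, RTT_S, M_IMAGES, X_PARALLEL)
```

All three share the transmission term (M+1)O/R and differ only in how many round trips they pay for, which is why the gap between them widens as RTT grows.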

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.

Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                                          • Chapter 3 Transport Layer last revised 160305
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • Transport services and protocols
                                                                                                                                          • Transport vs network layer
                                                                                                                                          • Transport-layer protocols
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • Multiplexingdemultiplexing
                                                                                                                                          • Multiplexingdemultiplexing
                                                                                                                                          • How demultiplexing works
                                                                                                                                          • Connectionless demultiplexing
                                                                                                                                          • Connectionless demux (cont)
                                                                                                                                          • Connection-oriented demux
                                                                                                                                          • Connection-oriented demux (cont)
                                                                                                                                          • Connection-oriented demux Threaded Web Server
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • UDP User Datagram Protocol [RFC 768]
                                                                                                                                          • UDP more
                                                                                                                                          • UDP checksum
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • Principles of Reliable data transfer
                                                                                                                                          • Reliable data transfer getting started
                                                                                                                                          • Reliable data transfer getting started
                                                                                                                                          • Incremental Improvements
                                                                                                                                          • Rdt10 reliable transfer over a reliable channel
                                                                                                                                          • Rdt20 channel with bit errors
                                                                                                                                          • rdt20 FSM specification
                                                                                                                                          • rdt20 operation with no errors
                                                                                                                                          • rdt20 error scenario
                                                                                                                                          • rdt20 has a fatal flaw
                                                                                                                                          • rdt21 sender handles garbled ACKNAKs
                                                                                                                                          • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                          • rdt21 discussion
                                                                                                                                          • rdt22 a NAK-free protocol
                                                                                                                                          • rdt22 sender receiver fragments
                                                                                                                                          • rdt30 channels with errors and loss
                                                                                                                                          • rdt30 sender
                                                                                                                                          • rdt30 in action
                                                                                                                                          • rdt30 in action
                                                                                                                                          • Performance of rdt30
                                                                                                                                          • rdt30 stop-and-wait operation
                                                                                                                                          • Pipelined protocols
                                                                                                                                          • Pipelined protocols
                                                                                                                                          • Pipelining increased utilization
                                                                                                                                          • Go-Back-N
                                                                                                                                          • GBN Sender
                                                                                                                                          • GBN sender extended FSM
                                                                                                                                          • GBN receiver extended FSM
                                                                                                                                          • More on receiver
• GBN in action
                                                                                                                                          • Selective Repeat
                                                                                                                                          • Selective repeat sender receiver windows
                                                                                                                                          • Selective repeat
                                                                                                                                          • Selective repeat in action
                                                                                                                                          • Selective repeat dilemma
                                                                                                                                          • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                          • More TCP Details
                                                                                                                                          • Even More TCP Details
                                                                                                                                          • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                          • Example RTT estimation
                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • TCP reliable data transfer
                                                                                                                                          • TCP sender events
• TCP sender (simplified)
                                                                                                                                          • TCP retransmission scenarios
                                                                                                                                          • TCP retransmission scenarios (more)
                                                                                                                                          • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                          • More on Sender Policies
                                                                                                                                          • Fast Retransmit
                                                                                                                                          • Fast retransmit algorithm
                                                                                                                                          • TCP GBN or Selective Repeat
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • TCP Flow Control
                                                                                                                                          • TCP Flow Control
                                                                                                                                          • TCP segment structure
                                                                                                                                          • TCP Flow control how it works
                                                                                                                                          • Technical Issue
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • TCP Connection Management
                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                          • A few special cases
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                                                          • Approaches towards congestion control
                                                                                                                                          • Case study ATM ABR congestion control
                                                                                                                                          • Case study ATM ABR congestion control
                                                                                                                                          • Chapter 3 outline
                                                                                                                                          • TCP Congestion Control
                                                                                                                                          • TCP AIMD
                                                                                                                                          • TCP Slow Start
                                                                                                                                          • TCP Slow Start (more)
                                                                                                                                          • Summary TCP Congestion Control
                                                                                                                                          • The Big Picture
                                                                                                                                          • TCP sender congestion control
                                                                                                                                          • TCP throughput
                                                                                                                                          • TCP Futures
                                                                                                                                          • TCP Fairness
                                                                                                                                          • Why is TCP fair
                                                                                                                                          • Fairness (more)
                                                                                                                                          • TCP Latency Modeling
                                                                                                                                          • Fixed Congestion Window (W)
                                                                                                                                          • Fixed congestion window (1)
                                                                                                                                          • Fixed congestion window (2)
                                                                                                                                          • TCP Latency Modeling Slow Start (1)
                                                                                                                                          • TCP Latency Modeling Slow Start (2)
                                                                                                                                          • TCP Latency Modeling (3)
                                                                                                                                          • TCP Latency Modeling (4)
                                                                                                                                          • HTTP Modeling
                                                                                                                                          • Chapter 3 Summary

TCP sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
      switch (event)

      event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
          start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

      event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

      event: ACK received, with ACK field value of y
        if (y > SendBase) {
          SendBase = y
          if (there are currently not-yet-acknowledged segments)
            start timer
        }
    } /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ACKed byte

Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
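The three events of the simplified sender can be sketched in Python as follows. This is a minimal illustration, not a real TCP stack: the class name `SimplifiedTcpSender`, the `unacked` dictionary, and the `passed_to_ip` log are hypothetical names introduced here, and the timer is modelled as a boolean flag. Stopping the timer once nothing is outstanding is an assumption; the slide only says when to start it.

```python
class SimplifiedTcpSender:
    """Toy model of the simplified TCP sender (single timer, cumulative ACKs)."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.timer_running = False
        self.unacked = {}        # hypothetical: seq number -> payload bytes
        self.passed_to_ip = []   # hypothetical log of (seq, payload) handed to IP

    def on_app_data(self, data):
        # event: data received from application above
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True            # start timer
        self.passed_to_ip.append((seq, data))    # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self):
        # event: timer timeout
        if self.unacked:
            seq = min(self.unacked)              # smallest not-yet-ACKed seq number
            self.passed_to_ip.append((seq, self.unacked[seq]))
            self.timer_running = True            # restart timer
        else:
            self.timer_running = False           # assumption: nothing left to time

    def on_ack(self, y):
        # event: ACK received, with ACK field value of y
        if y > self.send_base:
            self.send_base = y
            # Drop every segment fully covered by the cumulative ACK.
            for seq in list(self.unacked):
                if seq + len(self.unacked[seq]) <= y:
                    del self.unacked[seq]
            # Keep the timer running only while segments remain unacknowledged.
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"a" * 8)    # segment with Seq=92, 8 bytes
sender.on_app_data(b"b" * 20)   # segment with Seq=100, 20 bytes
sender.on_ack(120)              # cumulative ACK covers both segments
print(sender.send_base, sender.timer_running)
```

An ACK with y not greater than SendBase is ignored here, exactly as in the pseudocode.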


TCP retransmission scenarios

[Figure: two Host A / Host B timelines.
Lost ACK scenario: A sends Seq=92, 8 bytes data; B's ACK=100 is lost (X); the Seq=92 timer times out; A resends Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.
Premature timeout: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B replies ACK=100 and ACK=120; the Seq=92 timer expires early, so A resends Seq=92, 8 bytes data; B replies ACK=120; SendBase = 100, then SendBase = 120, SendBase = 120.]


TCP retransmission scenarios (more)

[Figure: Host A / Host B timeline, cumulative ACK scenario. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B's ACK=100 is lost (X); ACK=120 reaches A before the timeout; SendBase = 120.]


TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: Arrival of segment that partially or completely fills gap.
TCP receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
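The four event/action pairs above can be sketched as a small receiver-side policy in Python. This is an illustration under stated assumptions: `AckPolicyReceiver`, its fields, and the action labels `"delay"`, `"ack"`, and `"dup_ack"` are hypothetical names introduced here, the 500 ms delayed-ACK timer is reduced to an explicit method call, and the handling of an old (already received) segment is an assumption because the table does not cover that case.

```python
class AckPolicyReceiver:
    """Toy model of the receiver's ACK generation policy."""

    def __init__(self, expected_seq):
        self.expected_seq = expected_seq   # seq # of next expected byte
        self.ack_pending = False           # True while a delayed ACK is waiting
        self.buffered = {}                 # out-of-order segments: seq -> length

    def on_segment(self, seq, length):
        """Return (action, ack_number) for an arriving segment."""
        if seq > self.expected_seq:
            # Out-of-order segment, gap detected: duplicate ACK right away.
            self.buffered[seq] = length
            self.ack_pending = False
            return ("dup_ack", self.expected_seq)
        if seq < self.expected_seq:
            # Assumption (not in the table): re-ACK an old segment.
            return ("dup_ack", self.expected_seq)

        # In-order segment that starts exactly at the expected byte.
        gap_was_open = bool(self.buffered)
        self.expected_seq += length
        # Absorb buffered segments that are now contiguous.
        while self.expected_seq in self.buffered:
            self.expected_seq += self.buffered.pop(self.expected_seq)

        if gap_was_open or self.ack_pending:
            # Fills the lower end of a gap, or a second in-order segment
            # arrived while an ACK was pending: ACK immediately.
            self.ack_pending = False
            return ("ack", self.expected_seq)
        # Everything up to here was already ACKed: delay the ACK.
        self.ack_pending = True
        return ("delay", self.expected_seq)

    def on_delayed_ack_timer(self):
        """Called when the delayed-ACK wait (up to 500 ms) expires."""
        if self.ack_pending:
            self.ack_pending = False
            return ("ack", self.expected_seq)
        return None


rcvr = AckPolicyReceiver(expected_seq=92)
for seq, length in [(92, 8), (100, 20), (140, 10), (120, 20)]:
    print(seq, rcvr.on_segment(seq, length))
```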


More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after the retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of Congestion Control
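The doubling policy can be illustrated with a short Python sketch. The function name `next_timeout_interval` and the event labels are hypothetical. The reset formula EstimatedRTT + 4 * DevRTT is assumed to be the one meant by "as described previously"; it is the standard TCP timeout computation.

```python
def next_timeout_interval(current, event, estimated_rtt, dev_rtt):
    """Return the Timeout Interval to use after `event`.

    event is "timeout" (timer expired, segment retransmitted) or
    "restart" (timer restarted by new data from above or by an ACK).
    """
    if event == "timeout":
        # Consecutive timeouts double the interval: exponential growth.
        return 2 * current
    if event == "restart":
        # Assumed reset formula: EstimatedRTT + 4 * DevRTT.
        return estimated_rtt + 4 * dev_rtt
    raise ValueError(f"unknown event: {event}")


interval = 1.0
for event in ["timeout", "timeout", "timeout", "restart"]:
    interval = next_timeout_interval(interval, event, estimated_rtt=0.1, dev_rtt=0.025)
    print(event, interval)
```

Backing off like this slows retransmissions when the timeouts are caused by congestion, which is why the slide calls it a limited form of congestion control.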


                                                                                                                                            Fast Retransmit

                                                                                                                                            Time-out period often relatively long

                                                                                                                                            long delay before resending lost packet

                                                                                                                                            Detect lost segments via duplicate ACKs

• Sender often sends many segments back-to-back
• If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost

• fast retransmit: resend segment before timer expires


Fast retransmit algorithm

    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
      else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
          resend segment with sequence number y    /* fast retransmit */
        }
      }
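The ACK handler above translates almost line for line into Python. In this minimal sketch the class name `FastRetransmitSender`, the returned action labels, and the `retransmitted` log are hypothetical, and timer handling is left out to keep the focus on duplicate-ACK counting. Clearing the counters when new data is ACKed is an assumption; the pseudocode does not say when the counts are reset.

```python
from collections import Counter


class FastRetransmitSender:
    """Toy model of duplicate-ACK counting and fast retransmit."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()    # ACK value y -> number of duplicates seen
        self.retransmitted = []      # seq numbers resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data is acknowledged.
            self.send_base = y
            self.dup_acks.clear()    # assumption: start counting afresh
            return "new_ack"
        # A duplicate ACK for an already ACKed segment.
        self.dup_acks[y] += 1
        if self.dup_acks[y] == 3:
            # Resend the segment with sequence number y before the timer expires.
            self.retransmitted.append(y)
            return "fast_retransmit"
        return "dup_ack"


sender = FastRetransmitSender(send_base=92)
for y in [100, 100, 100, 100]:
    print(y, sender.on_ack(y))
```

Following the pseudocode, the resend fires only when the count equals exactly 3, so further duplicates of the same ACK do not trigger another retransmission.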


TCP: GBN or Selective Repeat?

                                                                                                                                            Basic TCP looks a lot like GBN

                                                                                                                                            Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                                            This looks a lot like Selective Repeat

                                                                                                                                            TCP is a hybrid


                                                                                                                                            Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control


                                                                                                                                            TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down transmission rate to accommodate receiver's rate
• Different from Congestion Control, whose purpose was to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

• receive side of TCP connection has a receive buffer
• speed-matching service: matching the send rate to the receiving app's drain rate
• app process may be slow at reading from buffer


TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Rows from top to bottom:]
source port #, dest port #
sequence number (counting by bytes of data, not segments)
acknowledgement number (counting by bytes of data, not segments)
head len, not used, flags U A P R S F, Receive window (# bytes rcvr willing to accept)
checksum (Internet checksum, as in UDP), Urg data pointer
Options (variable length)
application data (variable length)

Flags:
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection establishment (setup, teardown commands)
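
The field layout above can be sketched in Python with the standard `struct` module. This is only an illustration: the helper names `build_header` and `parse_header` are hypothetical, options are omitted, and the checksum is left at 0 rather than computed.

```python
import struct

# Flag bits in the order shown on the slide: URG ACK PSH RST SYN FIN.
FLAGS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

# Fixed 20-byte header in network byte order: two 16-bit ports, 32-bit seq and
# ack numbers, a 16-bit word holding head len + flags, then window, checksum
# and urgent data pointer (16 bits each).
HEADER = struct.Struct("!HHIIHHHH")


def build_header(src_port, dst_port, seq, ack, flags, window):
    """Pack a header without options; the checksum is NOT computed here."""
    bits = 0
    for name in flags:
        bits |= FLAGS[name]
    head_len_and_flags = (5 << 12) | bits  # head len = 5 words of 32 bits
    return HEADER.pack(src_port, dst_port, seq, ack, head_len_and_flags, window, 0, 0)


def parse_header(raw):
    """Unpack the fixed part of a header into a dictionary."""
    src, dst, seq, ack, word, window, checksum, urg = HEADER.unpack(raw[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len_bytes": (word >> 12) * 4,
        "flags": sorted(name for name, bit in FLAGS.items() if word & bit),
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg,
    }
```

For example, `parse_header(build_header(12345, 80, 1000, 0, ["SYN"], 4096))` gives back the same ports, sequence number and window, with `flags` equal to `["SYN"]`.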


TCP Flow Control: how it works

(Suppose the TCP receiver discards out-of-order segments.)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including the value of RcvWindow in segments.
Sender limits unACKed data to RcvWindow; this guarantees that the receive buffer doesn't overflow.
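
The receiver-side bookkeeping can be written out as a small calculation. A minimal sketch, assuming byte counters that do not wrap around; the function name `rcv_window` is hypothetical, and the arguments mirror the variables on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises, in bytes."""
    # Bytes that have arrived but that the application has not read yet.
    buffered = last_byte_rcvd - last_byte_read
    return rcv_buffer - buffered
```

With a 4096-byte buffer, 3000 bytes received and 1000 bytes read, 2000 bytes are still buffered, so the advertised window is 2096 bytes.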


                                                                                                                                            Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed all packets in its buffer.
The sender does not transmit new packets until it hears RcvWindow > 0.
The receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
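
The probing idea can be illustrated with a toy simulation. This is a simplified model, not real TCP behaviour: it assumes one one-byte probe per time step, and the names `probe_until_open` and `drain_schedule` are hypothetical.

```python
def probe_until_open(buffer_size, buffered, drain_schedule):
    """Return (step, window) for the first probe whose ACK reports RcvWindow > 0.

    drain_schedule[i] is the number of bytes the receiving application reads
    just before probe i+1 arrives. Returns None if the window never opens.
    """
    for step, drained in enumerate(drain_schedule, start=1):
        # The application frees space by reading from the receive buffer.
        buffered = max(0, buffered - drained)
        # The ACK sent in response to the probe carries the current window.
        window = buffer_size - buffered
        if window > 0:
            return step, window
    return None
```

With a full 10-byte buffer and an application that reads nothing for two steps and then reads 4 bytes, the third probe is the first to be ACKed with a non-zero window. Without the probes the sender would never learn that space had opened up.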


                                                                                                                                            Note on UDP

                                                                                                                                            UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
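
A toy model makes the contrast with TCP concrete. The class `UdpSocketBuffer` below is hypothetical, and it measures capacity in datagrams purely for simplicity.

```python
from collections import deque


class UdpSocketBuffer:
    """Receive buffer with no flow control: arrivals are dropped when full."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed unit: number of datagrams
        self.queue = deque()
        self.dropped = 0

    def deliver(self, datagram):
        """Called when a datagram arrives from the network."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1  # the sender is never told to slow down
            return False
        self.queue.append(datagram)
        return True

    def read(self):
        """Called by the application; returns None if nothing is buffered."""
        return self.queue.popleft() if self.queue else None
```

Delivering three datagrams to a buffer of capacity two, with no reads in between, loses the third.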



                                                                                                                                            TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq. #
    no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    replies with client_isn+1 in ACK field to signal synchronization
    specifies server_isn
    no application data


                                                                                                                                            TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
    allocates buffers
    can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

[Figure: three-way handshake between client and server]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
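
The sequence and acknowledgement numbers of the three segments can be traced with a few lines of Python. A minimal sketch: the function name and the dictionary layout are hypothetical, and the arithmetic wraps modulo 2^32 because the sequence number field is 32 bits wide.

```python
SEQ_SPACE = 2 ** 32  # sequence and acknowledgement numbers are 32-bit fields


def three_way_handshake(client_isn, server_isn):
    """Return the three control segments exchanged when a connection opens."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn, "ack": None}
    synack = {
        "from": "server",
        "SYN": 1,
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,  # acknowledges the client's SYN
    }
    ack = {
        "from": "client",
        "SYN": 0,  # SYN=0: the connection is established
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (server_isn + 1) % SEQ_SPACE,  # acknowledges the server's SYN
    }
    return [syn, synack, ack]
```

With client_isn = 100 and server_isn = 300, the SYNACK carries ack=101 and the final ACK carries seq=101, ack=301.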


                                                                                                                                            TCP Connection Management (cont)

                                                                                                                                            Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client sends FIN (close); server replies with ACK, then sends its own FIN (close); client replies with ACK and enters timed wait; connection closed]


                                                                                                                                            TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
    enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost)
    closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, can handle simultaneous FINs.

[Figure: the same FIN/ACK exchange; client and server are "closing", the client enters timed wait after ACKing the server's FIN, and both sides end "closed"]
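
The client side of the four closing steps can be sketched as a small state machine. The state names follow the usual TCP state diagram and are an assumption here, since the slide text does not spell them out; the event strings and the helper `run_client` are hypothetical.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "ACK"),
    # A FIN retransmitted because our ACK was lost is ACKed again.
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "ACK"),
    ("TIME_WAIT", "timeout"): ("CLOSED", None),
}


def run_client(events, state="ESTABLISHED"):
    """Feed events to the client; return its final state and segments sent."""
    sent = []
    for event in events:
        state, segment = CLIENT_TRANSITIONS[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent
```

Running the events close, recv ACK, recv FIN, recv FIN, timeout ends in CLOSED after the client has sent one FIN and two ACKs, the second ACK answering the duplicate FIN received during timed wait.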


                                                                                                                                            TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                                            A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.



                                                                                                                                            Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for the network to handle"
different from flow control!
manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


Here λ_in is the rate of original data, λ'_in is the offered load (original data plus retransmissions), and λ_out is the delivered rate.

(a), (b) & (c): always λ_in = λ_out (goodput)
(a) "Magic" transmission: only send when there's space in the buffer
(b) "Perfect" retransmission, only when loss: λ'_in > λ_out
(c) Retransmission of delayed (not lost) packets makes λ'_in larger (than in the perfect case) for the same λ_out

"Costs" of congestion:
(b) and (c): more work (retransmissions) for a given "goodput"
(c): unneeded retransmissions; link carries multiple copies of a packet

[Figure: three panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λ_in and λ'_in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                                            Approaches towards congestion control

                                                                                                                                            Two broad approaches towards congestion control

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
    "elastic service"
    if sender's path "underloaded": sender should use available bandwidth
    if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact (small exception: see next slide)


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by congested switch
    signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
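
How the ER and CI values seen by the source come about can be sketched as follows. This is an illustration only: `process_rm_cell` and its parameters are hypothetical names, the rates are in arbitrary units, and NI handling is left out.

```python
def process_rm_cell(initial_er, switch_rates, efci_on_preceding_data_cell=False):
    """Return the ER and CI values in the RM cell that reaches the source.

    switch_rates lists the rate each switch on the path can support.
    """
    er = initial_er
    for supportable in switch_rates:
        # A congested switch may lower ER; it never raises it.
        er = min(er, supportable)
    # The destination sets CI=1 if the preceding data cell carried EFCI=1.
    ci = 1 if efci_on_preceding_data_cell else 0
    return {"ER": er, "CI": ci}
```

A cell that starts with ER=100 and crosses switches supporting 80, 120 and 60 comes back with ER=60, the minimum supportable rate on the path.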


                                                                                                                                            3 Transport Layer 103Comp 361 Spring 2005

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: window of size CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
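The throughput relation above can be checked numerically with a small helper. This is only a sketch; the function name and the sample values (10 segments of 1460 bytes, 100 ms RTT) are hypothetical.

```python
def window_throughput(w, mss_bytes, rtt_s):
    """Bytes per second when w segments of mss_bytes are sent every RTT."""
    return w * mss_bytes / rtt_s


# Hypothetical values: 10 segments of 1460 bytes with a 100 ms RTT.
rate = window_throughput(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec")
```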


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
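The window constraint LastByteSent - LastByteAcked ≤ CongWin can be expressed as a simple check before each send. The sketch below is illustrative only; the function names are assumptions, and RcvBuffer is ignored, as on the slide.

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes may be sent without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


def may_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """True if sending nbytes keeps LastByteSent - LastByteAcked <= CongWin."""
    return nbytes <= bytes_allowed(last_byte_sent, last_byte_acked, cong_win)
```

For instance, with 2000 bytes in flight and CongWin = 4000 bytes, the sender may transmit up to 2000 more bytes before it has to wait for an ACK.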


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
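A minimal sketch of the AIMD rule, assuming one update per RTT round and a caller-supplied set of rounds in which a loss event occurs. The function name, the round-based model and the sample numbers are assumptions for illustration.

```python
def aimd_trace(cong_win, mss, rounds, loss_rounds):
    """Return CongWin (bytes) at the start of each RTT round.

    loss_rounds is the set of round indices in which a loss event is seen.
    """
    trace = []
    for rnd in range(rounds):
        trace.append(cong_win)
        if rnd in loss_rounds:
            cong_win = max(mss, cong_win // 2)  # multiplicative decrease
        else:
            cong_win += mss                     # additive increase: +1 MSS per RTT
    return trace


# Hypothetical run: start at 8000 bytes, MSS = 1000 bytes, loss in round 3.
print(aimd_trace(8000, 1000, 6, {3}))
```

Plotting such a trace over many rounds with periodic losses gives the sawtooth shape of the congestion window.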


TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
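To see why a per-ACK increment doubles the window each RTT, the sketch below assumes every segment sent in a round is acknowledged in that round and that no loss occurs. The function name and the round-based model are assumptions for illustration.

```python
def slow_start_rounds(mss, rounds):
    """Return the window size, in segments, at the start of each RTT round."""
    cong_win = mss  # CongWin = 1 MSS when the connection begins
    sizes = []
    for _ in range(rounds):
        segments = cong_win // mss
        sizes.append(segments)
        for _ in range(segments):  # one ACK comes back per segment sent
            cong_win += mss        # CongWin grows by 1 MSS per ACK
    return sizes


print(slow_start_rounds(500, 4))
```

Each round sends as many segments as the window holds and receives that many ACKs, so the window at the start of the next round is twice as large: one, two, four, then eight segments.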


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol TCP Tahoe treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                                                                                                                                            The Big Picture


TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
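The event/action rules above translate fairly directly into a small state machine. The following is a simplified sketch, not a real TCP implementation: the class name, field names, and the choice to reset the duplicate-ACK counter on every new ACK are assumptions, and retransmission itself is not modeled.

```python
from dataclasses import dataclass


@dataclass
class RenoSender:
    mss: int            # maximum segment size in bytes
    cong_win: float     # congestion window in bytes
    threshold: float    # slow-start threshold in bytes
    state: str = "SS"   # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # additive increase: about 1 MSS per RTT in total
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and restart from 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

A Tahoe-style sender could be sketched by having the third duplicate ACK call the same handler as the timeout.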


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2 RTT)
  Average throughput: 0.75 W/RTT (the window grows linearly from W/2 to W between losses, so it averages 3W/4)


TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$

$L = 2 \cdot 10^{-10}$ Wow!
New versions of TCP for high-speed needed
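The two numbers in this example can be reproduced by solving the window and loss-rate relations for W and L. The helper names below are assumptions; MSS is converted to bits so that the rate is in bits per second.

```python
def window_segments(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Invert rate = 1.22 * MSS / (RTT * sqrt(L)) for L, with MSS in bits."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


w = window_segments(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

The computed loss rate is on the order of 2·10^-10, that is, roughly one loss event for every five billion segments.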


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1 as throughput increases
  Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
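The convergence argument can be illustrated with a toy simulation of two AIMD flows. It assumes both flows have the same RTT, see every loss at the same time, and lose exactly when their combined rate exceeds the link capacity; the function name and the numbers are hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve the throughputs of two AIMD flows sharing one bottleneck."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # loss: both flows decrease their window by a factor of 2
            x1, x2 = x1 / 2, x2 / 2
        else:
            # congestion avoidance: both increase by the same amount
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = two_flow_aimd(1.0, 60.0, 100.0, 200)
print(f"gap after 200 rounds: {abs(a - b):.3f}")
```

Additive increase leaves the gap between the two flows unchanged, while each multiplicative decrease halves it, so the pair drifts toward the equal bandwidth share line.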


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
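The parallel-connection example is simple arithmetic once we assume the bottleneck is split equally per connection. The helper below is a sketch under that assumption, with a hypothetical function name.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming every TCP connection gets an equal share of the link."""
    return Fraction(new_connections, existing_connections + new_connections)


print(app_share(9, 1))   # one new connection among ten
print(app_share(9, 11))  # eleven new connections among twenty
```

The second case gives 11/20 of R, slightly more than half the link, even though it is a single application.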


TCP Latency Modeling

Notation, assumptions
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)

Two cases:

1. $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent
   $\text{Latency} = 2\,RTT + O/R$

2. $WS/R < RTT + S/R$: ACK for first segment in window returns after window's worth of data sent
   $\text{Latency} = 2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$
   where K is the number of windows that cover the object
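Both cases can be folded into one function: the server stalls after each window only when the bracketed term is positive. This is a sketch under the slide's assumptions (no loss, one link of rate R); the function name is hypothetical, and taking K as O/(WS) rounded up is an assumption about how partial windows are counted.

```python
import math


def fixed_window_latency(o_bits, s_bits, r_bps, w, rtt_s):
    """Latency for an object of o_bits with a fixed window of w segments."""
    base = 2 * rtt_s + o_bits / r_bps
    # time the server waits for an ACK after sending one full window
    stall = s_bits / r_bps + rtt_s - w * s_bits / r_bps
    if stall <= 0:
        return base  # case 1: ACKs return before the window is exhausted
    k = math.ceil(o_bits / (w * s_bits))  # windows that cover the object
    return base + (k - 1) * stall         # case 2: stall after each window but the last


# Hypothetical numbers: O = 16000 bits, S = 1000 bits, R = 1000 bps, RTT = 3 s.
print(fixed_window_latency(16000, 1000, 1000, 2, 3))
print(fixed_window_latency(16000, 1000, 1000, 8, 3))
```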


Fixed congestion window (1)

First case: $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent

$\text{latency} = 2\,RTT + O/R$

Fixed congestion window (2)

Second case: $WS/R < RTT + S/R$: wait for ACK after sending window's worth of data

$\text{latency} = 2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timeline with time at client and time at server; client initiates TCP connection and requests object; server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and object is delivered]
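The slow-start latency can be computed two ways: by summing the server's idle time after each window, or with the closed form Latency = 2 RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R. The sketch below does both so they can be compared. The function name and the sample link parameters are assumptions, chosen so that O/S = 15, K = 4 and P = 2 as in the example.

```python
import math


def slow_start_latency(o_bits, s_bits, r_bps, rtt_s):
    """Return (K, P, latency by direct sum, latency by closed form)."""
    segments = math.ceil(o_bits / s_bits)
    # K: smallest number of windows 1, 2, 4, ... that cover the object
    k_windows = 0
    while 2 ** k_windows - 1 < segments:
        k_windows += 1
    s_over_r = s_bits / r_bps
    # idle time after window k, for every window except the last
    idle = [s_over_r + rtt_s - 2 ** (k - 1) * s_over_r
            for k in range(1, k_windows)]
    p_idles = sum(1 for t in idle if t > 0)  # equals min(Q, K-1)
    base = 2 * rtt_s + o_bits / r_bps
    direct = base + sum(max(0.0, t) for t in idle)
    closed = base + p_idles * (rtt_s + s_over_r) - (2 ** p_idles - 1) * s_over_r
    return k_windows, p_idles, direct, closed


# Hypothetical link: S = 1000 bits, R = 1000 bps, RTT = 2 s, O = 15 segments.
print(slow_start_latency(15000, 1000, 1000, 2))
```

With these numbers the server idles after the first and second windows only, and both computations give the same latency.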


                                                                                                                                            TCP Latency Modeling (3)

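$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}$$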

[Figure: slow-start timing diagram between client and server. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered; time at client; time at server]
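As a rough check of the delay expression on this slide, the sketch below computes the latency twice: once by summing the per-window idle times (clamped at zero, with no idle after the last window) and once with the closed form in P. The function name `tcp_latency` and the sample values (S/R = 1 time unit, RTT = 2, O/S = 15) are assumptions chosen for illustration.

```python
import math


def tcp_latency(O, S, R, rtt):
    """Latency of fetching an O-bit object under the slow-start model.

    O: object size (bits), S: segment size (bits),
    R: link rate (bits/s), rtt: round-trip time (s).
    Returns (delay_by_sum, delay_closed_form, P).
    """
    tx = S / R                                # time to transmit one segment
    K = math.ceil(math.log2(O / S + 1))       # windows that cover the object
    # Idle time after window k for k = 1 .. K-1, clamped at zero.
    idle = [max(0.0, tx + rtt - 2 ** (k - 1) * tx) for k in range(1, K)]
    P = sum(1 for t in idle if t > 0)         # number of times the server idles
    delay_by_sum = O / R + 2 * rtt + sum(idle)
    delay_closed = O / R + 2 * rtt + P * (rtt + tx) - (2 ** P - 1) * tx
    return delay_by_sum, delay_closed, P


by_sum, closed, P = tcp_latency(O=15, S=1, R=1, rtt=2)
print(by_sum, closed, P)
```

The two values agree because the idle times shrink as the window doubles, so only the first P windows contribute to the sum.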


TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

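$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$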

Calculation of Q, the number of idles for an infinite-size object, is similar.
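The closed form for K can be compared against the definition it is derived from, namely the smallest k whose first k windows cover the object. The sketch below does that for a few object sizes; both function names are hypothetical, and S = 1 is assumed so that O counts segments.

```python
import math


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_search(O, S):
    """Smallest k with 2^0 S + 2^1 S + ... + 2^(k-1) S >= O (requires O > 0)."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S    # add the next window, which holds 2^k segments
        k += 1
    return k


# The two definitions should agree for every object size tried here.
for segments in (1, 2, 3, 15, 16, 100):
    assert windows_closed_form(segments, 1) == windows_by_search(segments, 1)

print(windows_closed_form(15, 1))
```

For very large O/S that sit exactly on a power-of-two boundary, floating-point rounding in the logarithm could in principle differ from the search, so the loop is the safer reference.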


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
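The three response-time formulas can be evaluated side by side. The sketch below deliberately leaves out the "sum of idle times" term, so it gives only the transmission and RTT parts of each model. The function name `http_response_times`, the 28 kbps link rate, and the reading of 5 Kbytes as 40,000 bits are assumptions made for illustration.

```python
def http_response_times(O, R, rtt, M, X):
    """Response times (s) for the three HTTP models, idle times omitted.

    O: object size (bits), R: link rate (bits/s), rtt: round-trip time (s),
    M: number of images, X: number of parallel connections (M/X an integer).
    """
    transmit = (M + 1) * O / R                    # base page plus M images
    non_persistent = transmit + (M + 1) * 2 * rtt
    persistent = transmit + 3 * rtt
    parallel = transmit + (M / X + 1) * 2 * rtt
    return non_persistent, persistent, parallel


# Illustrative values: O = 5 Kbytes taken as 40,000 bits, R = 28 kbps,
# RTT = 100 msec, M = 10 images, X = 5 parallel connections.
non_p, pers, par = http_response_times(O=40_000, R=28_000, rtt=0.1, M=10, X=5)
print(round(non_p, 3), round(pers, 3), round(par, 3))
```

Since all three models share the same transmission term, they differ only in how many round trips they pay for, which is why the gap between them grows with RTT and not with object size.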


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (y-axis, 0 to 20) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (y-axis, 0 to 70) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

                                                                                                                                            • Chapter 3 Transport Layer last revised 160305
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • Transport services and protocols
                                                                                                                                            • Transport vs network layer
                                                                                                                                            • Transport-layer protocols
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • Multiplexingdemultiplexing
                                                                                                                                            • Multiplexingdemultiplexing
                                                                                                                                            • How demultiplexing works
                                                                                                                                            • Connectionless demultiplexing
                                                                                                                                            • Connectionless demux (cont)
                                                                                                                                            • Connection-oriented demux
                                                                                                                                            • Connection-oriented demux (cont)
                                                                                                                                            • Connection-oriented demux Threaded Web Server
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • UDP User Datagram Protocol [RFC 768]
                                                                                                                                            • UDP more
                                                                                                                                            • UDP checksum
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • Principles of Reliable data transfer
                                                                                                                                            • Reliable data transfer getting started
                                                                                                                                            • Reliable data transfer getting started
                                                                                                                                            • Incremental Improvements
                                                                                                                                            • Rdt10 reliable transfer over a reliable channel
                                                                                                                                            • Rdt20 channel with bit errors
                                                                                                                                            • rdt20 FSM specification
                                                                                                                                            • rdt20 operation with no errors
                                                                                                                                            • rdt20 error scenario
                                                                                                                                            • rdt20 has a fatal flaw
                                                                                                                                            • rdt21 sender handles garbled ACKNAKs
                                                                                                                                            • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                            • rdt21 discussion
                                                                                                                                            • rdt22 a NAK-free protocol
                                                                                                                                            • rdt22 sender receiver fragments
                                                                                                                                            • rdt30 channels with errors and loss
                                                                                                                                            • rdt30 sender
                                                                                                                                            • rdt30 in action
                                                                                                                                            • rdt30 in action
                                                                                                                                            • Performance of rdt30
                                                                                                                                            • rdt30 stop-and-wait operation
                                                                                                                                            • Pipelined protocols
                                                                                                                                            • Pipelined protocols
                                                                                                                                            • Pipelining increased utilization
                                                                                                                                            • Go-Back-N
                                                                                                                                            • GBN Sender
                                                                                                                                            • GBN sender extended FSM
                                                                                                                                            • GBN receiver extended FSM
                                                                                                                                            • More on receiver
                                                                                                                                            • GBN inaction
                                                                                                                                            • Selective Repeat
                                                                                                                                            • Selective repeat sender receiver windows
                                                                                                                                            • Selective repeat
                                                                                                                                            • Selective repeat in action
                                                                                                                                            • Selective repeat dilemma
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                            • More TCP Details
                                                                                                                                            • Even More TCP Details
                                                                                                                                            • TCP segment structure
                                                                                                                                            • TCP seq rsquos and ACKs
                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                            • Example RTT estimation
                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • TCP reliable data transfer
                                                                                                                                            • TCP sender events
                                                                                                                                            • TCP sender(simplified)
                                                                                                                                            • TCP retransmission scenarios
                                                                                                                                            • TCP retransmission scenarios (more)
                                                                                                                                            • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                            • More on Sender Policies
                                                                                                                                            • Fast Retransmit
                                                                                                                                            • Fast retransmit algorithm
                                                                                                                                            • TCP GBN or Selective Repeat
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • TCP Flow Control
                                                                                                                                            • TCP Flow Control
                                                                                                                                            • TCP segment structure
                                                                                                                                            • TCP Flow control how it works
                                                                                                                                            • Technical Issue
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • TCP Connection Management
                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                            • A few special cases
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • Principles of Congestion Control
                                                                                                                                            • Causes/costs of congestion: scenario 1
                                                                                                                                            • Causes/costs of congestion: scenario 2
                                                                                                                                            • Causes/costs of congestion: scenario 3
                                                                                                                                            • Causes/costs of congestion: scenario 3
                                                                                                                                            • Approaches towards congestion control
                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                            • Chapter 3 outline
                                                                                                                                            • TCP Congestion Control
                                                                                                                                            • TCP AIMD
                                                                                                                                            • TCP Slow Start
                                                                                                                                            • TCP Slow Start (more)
                                                                                                                                            • Summary TCP Congestion Control
                                                                                                                                            • The Big Picture
                                                                                                                                            • TCP sender congestion control
                                                                                                                                            • TCP throughput
                                                                                                                                            • TCP Futures
                                                                                                                                            • TCP Fairness
                                                                                                                                            • Why is TCP fair
                                                                                                                                            • Fairness (more)
                                                                                                                                            • TCP Latency Modeling
                                                                                                                                            • Fixed Congestion Window (W)
                                                                                                                                            • Fixed congestion window (1)
                                                                                                                                            • Fixed congestion window (2)
                                                                                                                                            • TCP Latency Modeling Slow Start (1)
                                                                                                                                            • TCP Latency Modeling Slow Start (2)
                                                                                                                                            • TCP Latency Modeling (3)
                                                                                                                                            • TCP Latency Modeling (4)
                                                                                                                                            • HTTP Modeling
                                                                                                                                            • Chapter 3 Summary

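TCP retransmission scenarios

[Figure: two timelines between Host A and Host B]

Lost ACK scenario: Host A sends Seq=92, 8 bytes data. Host B replies ACK=100, but the ACK is lost (X). The Seq=92 timer expires, Host A resends Seq=92, 8 bytes data, and Host B replies ACK=100 again. SendBase = 100.

Premature timeout: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B replies ACK=100 and ACK=120. The Seq=92 timer expires before the ACKs arrive, so Host A resends Seq=92, 8 bytes data, and Host B replies ACK=120. SendBase moves from 100 to 120 and stays at 120.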

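TCP retransmission scenarios (more)

[Figure: timeline between Host A and Host B]

Cumulative ACK scenario: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. Host B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted. SendBase = 120.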
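
The lost ACK, premature timeout, and cumulative ACK scenarios can be replayed with a small sender model. This is only a sketch: `replay_sender`, its event tuples, and the rule "on timeout, resend the oldest not-yet-ACKed segment" are assumptions made for illustration, not part of the slides.

```python
def replay_sender(send_base, segments, events):
    """Replay ACK/timeout events against a toy TCP sender.

    segments: list of (seq, length) already sent, in sequence order.
    events:   list of ("ack", n) or ("timeout",) tuples, in arrival order.
    Returns the final SendBase and the sequence numbers retransmitted.
    """
    retransmitted = []
    for event in events:
        if event[0] == "ack":
            # Cumulative ACK: every byte below n is acknowledged.
            send_base = max(send_base, event[1])
        elif event[0] == "timeout":
            # Assumption: resend only the oldest not-yet-ACKed segment.
            for seq, length in segments:
                if seq + length > send_base:
                    retransmitted.append(seq)
                    break
    return send_base, retransmitted


two_segments = [(92, 8), (100, 20)]

# Lost ACK: ACK=100 is lost, the timer fires, the retransmission is ACKed.
lost_ack = replay_sender(92, [(92, 8)], [("timeout",), ("ack", 100)])

# Premature timeout: the timer fires before ACK=100 and ACK=120 arrive.
premature = replay_sender(
    92, two_segments,
    [("timeout",), ("ack", 100), ("ack", 120), ("ack", 120)],
)

# Cumulative ACK: ACK=100 is lost, but ACK=120 arrives before the timeout.
cumulative = replay_sender(92, two_segments, [("ack", 120)])
```

In the cumulative case no segment is resent, because ACK=120 also covers the bytes that the lost ACK=100 would have acknowledged.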

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte.

Event at receiver: arrival of segment that partially or completely fills gap.
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap.
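
The four receiver rules can be sketched as a small state machine. This is a toy model under stated assumptions: the class `AckPolicy` and its method names are hypothetical, out-of-order segments are assumed to be buffered, old duplicate data is simply re-ACKed (a case the table does not cover), and the 500 ms timer is represented by an explicit method call rather than a real clock.

```python
from dataclasses import dataclass, field


@dataclass
class AckPolicy:
    """Toy receiver that decides which ACK to send for each arriving segment."""
    expected_seq: int = 0                          # next in-order byte expected
    ack_pending: bool = False                      # a delayed ACK is outstanding
    buffered: dict = field(default_factory=dict)   # out-of-order: seq -> length

    def on_segment(self, seq: int, length: int) -> str:
        if seq != self.expected_seq:
            # Gap detected (or old data, an assumed case): duplicate ACK now.
            if seq > self.expected_seq:
                self.buffered[seq] = length
            self.ack_pending = False
            return f"duplicate ACK {self.expected_seq}"
        gap_existed = bool(self.buffered)
        self.expected_seq += length
        # Absorb buffered segments that are now contiguous.
        while self.expected_seq in self.buffered:
            self.expected_seq += self.buffered.pop(self.expected_seq)
        if gap_existed:
            # Segment started at the lower end of the gap: ACK immediately.
            self.ack_pending = False
            return f"immediate ACK {self.expected_seq}"
        if self.ack_pending:
            # Second in-order segment: one cumulative ACK covers both.
            self.ack_pending = False
            return f"cumulative ACK {self.expected_seq}"
        # First in-order segment: wait (up to 500 ms) for another one.
        self.ack_pending = True
        return "delayed ACK"

    def on_delayed_ack_timer(self) -> str:
        """Called when the delayed-ACK wait expires with no new segment."""
        if not self.ack_pending:
            return "nothing to send"
        self.ack_pending = False
        return f"ACK {self.expected_seq}"
```

Feeding it segments 92 (8 bytes), 100 (20 bytes), 140 (10 bytes), and then 120 (20 bytes) exercises each row of the table in order.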

                                                                                                                                              More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations.
• If a timeout occurs, then after retransmission the Timeout Interval is doubled.
• Intervals grow exponentially with each consecutive timeout.
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT.
• Limited form of congestion control.
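
A minimal sketch of the doubling rule follows. The class `RetransmitTimer` and its method names are hypothetical, and the base value `EstimatedRTT + 4 * DevRTT` is assumed to be the formula that "as described previously" refers to; times are in milliseconds.

```python
class RetransmitTimer:
    """Toy timeout-interval bookkeeping with exponential backoff."""

    def __init__(self, estimated_rtt_ms: int, dev_rtt_ms: int):
        self.estimated_rtt_ms = estimated_rtt_ms
        self.dev_rtt_ms = dev_rtt_ms
        self.interval_ms = self._base_interval()

    def _base_interval(self) -> int:
        # Assumed base formula: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt_ms + 4 * self.dev_rtt_ms

    def on_timeout(self) -> int:
        """Timer expired: the segment is retransmitted and the interval doubles."""
        self.interval_ms *= 2
        return self.interval_ms

    def on_restart(self) -> int:
        """New data from above or an ACK received: reset to the RTT-based value."""
        self.interval_ms = self._base_interval()
        return self.interval_ms


timer = RetransmitTimer(estimated_rtt_ms=100, dev_rtt_ms=25)
intervals = [timer.interval_ms, timer.on_timeout(), timer.on_timeout()]
after_ack = timer.on_restart()
```

With these inputs the interval goes 200, 400, 800 ms across two consecutive timeouts and returns to 200 ms once an ACK arrives.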

                                                                                                                                              Fast Retransmit

• Time-out period is often relatively long:
  • long delay before resending lost packet
• Detect lost segments via duplicate ACKs:
  • Sender often sends many segments back-to-back.
  • If a segment is lost, there will likely be many duplicate ACKs.
• If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  • fast retransmit: resend segment before timer expires

                                                                                                                                              Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
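
The ACK-handling event above can be turned into a runnable sketch. The class `FastRetransmitSender`, its `unacked` bookkeeping, and the `retransmitted` log are hypothetical names added for illustration; actually sending segments and running a real timer are left out.

```python
class FastRetransmitSender:
    """Toy sender-side ACK handler with fast retransmit."""

    def __init__(self, send_base: int, unacked: dict):
        self.send_base = send_base
        self.unacked = dict(unacked)      # seq -> length of segments in flight
        self.dup_acks = 0                 # duplicate ACKs seen for send_base
        self.timer_running = bool(self.unacked)
        self.retransmitted = []           # sequence numbers resent so far

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            # New data acknowledged: advance SendBase and forget ACKed segments.
            self.send_base = y
            self.dup_acks = 0
            self.unacked = {s: n for s, n in self.unacked.items() if s + n > y}
            # Restart the timer only if something is still unacknowledged.
            self.timer_running = bool(self.unacked)
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                # Fast retransmit: resend before the timer expires.
                self.retransmitted.append(y)


sender = FastRetransmitSender(92, {92: 8, 100: 20, 120: 10, 130: 10, 140: 10})
for ack in (100, 100, 100, 100):   # one new ACK, then three duplicates
    sender.on_ack(ack)
```

After the loop, `sender.retransmitted` holds the single sequence number 100: the first ACK=100 is new, and the third duplicate triggers the resend.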

TCP: GBN or Selective Repeat?

                                                                                                                                              Basic TCP looks a lot like GBN

                                                                                                                                              Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                                              This looks a lot like Selective Repeat

                                                                                                                                              TCP is a hybrid

                                                                                                                                              TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data.
• If necessary, sender should slow down its transmission rate to accommodate receiver's rate.
• Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

• receive side of TCP connection has a receive buffer
• speed-matching service: matching the send rate to the receiving app's drain rate
• app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Rows from top to bottom:]

• source port # | dest port #
• sequence number
• acknowledgement number
• head len | not used | U A P R S F | Receive window
• checksum | Urg data pnter
• Options (variable length)
• application data (variable length)

Annotations:

• sequence and acknowledgement numbers: counting by bytes of data (not segments)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
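
To make the layout concrete, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. This is a sketch: the function name `parse_tcp_header` and the dictionary keys are hypothetical, and the checksum is read but not verified.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / not used / flags" word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options, and data."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    # "!" = network byte order; H = 16-bit field, I = 32-bit field.
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4   # head len is in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": {name for name, bit in FLAG_BITS.items() if offset_flags & bit},
        "receive_window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# Build a SYN-like segment by hand (values are made up for the example).
raw = struct.pack("!HHIIHHHH", 12345, 80, 92, 0, (5 << 12) | 0x02, 4096, 0, 0) + b"hi"
header = parse_tcp_header(raw)
```

Here the head len field is 5 words, so `header["header_len"]` is 20 bytes, there are no options, and everything after byte 20 is application data.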

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments.)

• spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn't overflow
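
A worked example of the window arithmetic, as a sketch. `rcv_window` follows the formula on the slide; `sender_can_send` and its parameters `last_byte_sent` and `last_byte_acked` are assumed names for the sender-side bookkeeping, which the slide describes only as "unACKed data".

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_can_send(window: int, last_byte_sent: int, last_byte_acked: int) -> int:
    """Bytes the sender may still transmit while keeping unACKed data <= window."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, window - unacked)


# Receiver: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender: 1000 bytes in flight, so only the remainder of the window is usable.
budget = sender_can_send(window, last_byte_sent=5000, last_byte_acked=4000)
```

The receiver holds 2000 unread bytes, so it advertises 4096 - 2000 = 2096; with 1000 bytes already in flight the sender may send 1096 more.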

                                                                                                                                              Technical Issue

• Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer.
• Sender does not transmit new packets until it hears RcvWindow > 0.
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
• DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
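
The one-byte probing can be illustrated with a tiny simulation. Everything here is an assumption made for the sketch: the function name `probe_until_open`, the idea of discrete ticks, and the rule that a probe byte is stored only when there is room for it.

```python
def probe_until_open(rcv_buffer: int, buffered: int, app_reads_per_tick: list):
    """Send one 1-byte probe per tick until the ACK advertises RcvWindow > 0.

    app_reads_per_tick: bytes the receiving app drains before each probe.
    Returns (probes sent, window advertised in the last ACK).
    """
    probes = 0
    for read in app_reads_per_tick:
        buffered -= min(read, buffered)   # app drains part of the buffer
        probes += 1                       # sender sends a one-byte probe
        if buffered < rcv_buffer:
            buffered += 1                 # probe byte fits and is stored
        window = rcv_buffer - buffered    # value carried back in the ACK
        if window > 0:
            return probes, window
    return probes, 0


# Buffer of 10 bytes is full; the app reads nothing for two ticks, then 4 bytes.
result = probe_until_open(rcv_buffer=10, buffered=10, app_reads_per_tick=[0, 0, 4])
```

The first two probes are answered with a zero window; the third arrives after the app has read 4 bytes, takes one of them, and its ACK advertises a window of 3, which ends the stall.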

                                                                                                                                              Note on UDP

                                                                                                                                              UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
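
The drop-on-full behaviour can be modelled with a bounded queue. This is a toy model, not a real socket: the class `UdpReceiveBuffer` and its methods are hypothetical, and capacity is counted in payload bytes only.

```python
from collections import deque


class UdpReceiveBuffer:
    """Toy receive buffer: datagrams that do not fit are silently dropped."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.queue = deque()
        self.dropped = 0

    def on_datagram(self, payload: bytes) -> bool:
        """Append an arriving datagram; return False if it was lost."""
        if self.used_bytes + len(payload) > self.capacity_bytes:
            self.dropped += 1             # no feedback reaches the sender
            return False
        self.queue.append(payload)
        self.used_bytes += len(payload)
        return True

    def read(self) -> bytes:
        """App reads the oldest datagram (raises IndexError if none is queued)."""
        payload = self.queue.popleft()
        self.used_bytes -= len(payload)
        return payload
```

Unlike the TCP receiver, nothing here advertises spare room, so a fast sender only finds out about losses if the application protocol reports them.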

                                                                                                                                              TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:
Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq. #
  No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


                                                                                                                                              TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and server_isn+1 in the ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake message sequence]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
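
The field values in the three handshake segments can be written out as a small sketch. The class and method names are hypothetical, only the header fields mentioned in the slides are modeled, and sequence-number wraparound (real TCP sequence numbers are 32-bit) is ignored for simplicity.

```java
// Sketch of the SYN / seq / ack values in the three-way handshake.
// Names are hypothetical; wraparound of 32-bit sequence numbers is ignored.
final class HandshakeSketch {

    /** Minimal segment header holding only the fields the slides mention. */
    static final class Segment {
        final boolean syn;    // SYN flag
        final boolean hasAck; // true if the ack field is meaningful
        final long seq;
        final long ack;

        Segment(boolean syn, boolean hasAck, long seq, long ack) {
            this.syn = syn;
            this.hasAck = hasAck;
            this.seq = seq;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " seq=" + seq + (hasAck ? " ack=" + ack : "");
        }
    }

    /** Step 1: client sends SYN carrying client_isn. */
    static Segment step1Syn(long clientIsn) {
        return new Segment(true, false, clientIsn, 0);
    }

    /** Step 2: server replies with SYNACK: its own server_isn, and client_isn+1 in the ACK field. */
    static Segment step2SynAck(Segment syn, long serverIsn) {
        return new Segment(true, true, serverIsn, syn.seq + 1);
    }

    /** Step 3: client replies with SYN=0, seq=client_isn+1, ack=server_isn+1. */
    static Segment step3Ack(Segment synAck) {
        return new Segment(false, true, synAck.ack, synAck.seq + 1);
    }

    public static void main(String[] args) {
        Segment s1 = step1Syn(1000);        // client_isn = 1000 (illustrative)
        Segment s2 = step2SynAck(s1, 5000); // server_isn = 5000 (illustrative)
        Segment s3 = step3Ack(s2);
        System.out.println(s1);
        System.out.println(s2);
        System.out.println(s3);
    }
}
```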


                                                                                                                                              TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

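[Figure: connection-close timeline. Client sends FIN; server replies with ACK, then sends its own FIN; client replies with ACK and enters timed wait before it is closed.]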


                                                                                                                                              TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with an ACK to any received FIN (which might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

[Figure: the same connection-close timeline, with client and server states labeled closing, timed wait, and closed.]
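
The client side of the four closing steps can be sketched as a small state machine. The state names below follow the usual TCP state diagram (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) and are not spelled out in these slides; the class, the event names, and the choice to ignore unexpected events are assumptions made for illustration. Simultaneous close is not modeled.

```java
// Sketch of the client-side close sequence as a state machine.
// State names follow the standard TCP state diagram; event names are hypothetical.
final class ClientCloseSketch {

    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }

    enum Event { APP_CLOSE, RECV_ACK, RECV_FIN, TIMED_WAIT_EXPIRES }

    static State next(State state, Event event) {
        switch (state) {
            case ESTABLISHED:
                // Step 1: clientSocket.close() makes the client send FIN.
                if (event == Event.APP_CLOSE) return State.FIN_WAIT_1;
                break;
            case FIN_WAIT_1:
                // Step 2: the server ACKs the client's FIN.
                if (event == Event.RECV_ACK) return State.FIN_WAIT_2;
                break;
            case FIN_WAIT_2:
                // Step 3: the server's FIN arrives; client ACKs it and enters timed wait.
                if (event == Event.RECV_FIN) return State.TIME_WAIT;
                break;
            case TIME_WAIT:
                // A retransmitted FIN is ACKed again; the client stays in timed wait.
                if (event == Event.RECV_FIN) return State.TIME_WAIT;
                if (event == Event.TIMED_WAIT_EXPIRES) return State.CLOSED;
                break;
            default:
                break;
        }
        return state; // events not covered by the slides are ignored in this sketch
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] events = {
            Event.APP_CLOSE, Event.RECV_ACK, Event.RECV_FIN, Event.TIMED_WAIT_EXPIRES
        };
        for (Event e : events) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```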


                                                                                                                                              TCP Connection Management (cont)

Example: TCP server lifecycle

Example: TCP client lifecycle


                                                                                                                                              A few special cases

We have not discussed what happens if both client and server decide to close the connection at the same time.

It is possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.


                                                                                                                                              Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem


Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  Send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput


Causes/costs of congestion: scenario 2

  one router, finite buffers
  sender retransmission of lost packet


Notation: λin is the rate of original data sent, λ'in is the offered load (original data plus retransmissions), and λout is the goodput.

(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in the buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as λin and λ'in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted


                                                                                                                                              Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded":
    sender should use available bandwidth
  if sender's path congested:
    sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
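
The two feedback rules above can be sketched in a few lines. This is a simplified illustration, not an ATM implementation: the class and method names are hypothetical, rates are in arbitrary units, and every switch is assumed to lower ER to its own supportable rate (the slides only say a congested switch "may" lower it).

```java
// Simplified sketch of ABR feedback carried in an RM cell.
// Names and units are hypothetical; this is not an ATM implementation.
final class AbrSketch {

    /**
     * ER field as it arrives back at the sender: each switch on the path may
     * lower it, so the result is the minimum supportable rate on the path.
     */
    static double explicitRate(double initialEr, double[] supportableRates) {
        double er = initialEr;
        for (double rate : supportableRates) {
            er = Math.min(er, rate); // a switch never raises ER in this sketch
        }
        return er;
    }

    /**
     * CI bit in the returned RM cell: kept if a switch already set it, and set
     * by the destination if the preceding data cell had EFCI = 1.
     */
    static boolean ciBitOnReturn(boolean ciSetBySwitch, boolean precedingDataCellEfci) {
        return ciSetBySwitch || precedingDataCellEfci;
    }

    public static void main(String[] args) {
        double er = explicitRate(10.0, new double[] {8.0, 3.0, 6.0});
        System.out.println("ER returned to sender: " + er);
        System.out.println("CI bit returned: " + ciBitOnReturn(false, true));
    }
}
```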


TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec
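
As a quick check of the formula, the sketch below evaluates it for a few illustrative values. The class and method names are hypothetical, and RTT is passed in milliseconds purely to keep the arithmetic exact in the example.

```java
// Sketch: throughput = w * MSS / RTT, with illustrative numbers.
// Names are hypothetical; RTT is given in milliseconds.
final class ThroughputSketch {

    /** Bytes per second when w segments of mssBytes each are sent per RTT. */
    static double bytesPerSecond(int w, int mssBytes, long rttMillis) {
        return (double) w * mssBytes * 1000.0 / rttMillis;
    }

    /** Same rate expressed in bits per second. */
    static double bitsPerSecond(int w, int mssBytes, long rttMillis) {
        return 8.0 * bytesPerSecond(w, mssBytes, rttMillis);
    }

    public static void main(String[] args) {
        // One 500-byte segment per 200 ms RTT.
        System.out.println(bytesPerSecond(1, 500, 200) + " bytes/sec");
        System.out.println(bitsPerSecond(1, 500, 200) + " bits/sec");
        // A window of 10 such segments per RTT is ten times faster.
        System.out.println(bytesPerSecond(10, 500, 200) + " bytes/sec");
    }
}
```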

                                                                                                                                              3 Transport Layer 104Comp 361 Spring 2005

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
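
The sending limit LastByteSent - LastByteAcked ≤ CongWin can be sketched as a simple check. The class and method names are hypothetical, all quantities are in bytes, and, as on the slide, RcvWindow is assumed never to be the limiting factor.

```java
// Sketch of the sender-side limit LastByteSent - LastByteAcked <= CongWin.
// Names are hypothetical; all quantities are in bytes.
final class SendWindowSketch {

    /** Bytes the sender may still put in flight without exceeding CongWin. */
    static long usableWindow(long lastByteSent, long lastByteAcked, long congWin) {
        long inFlight = lastByteSent - lastByteAcked; // sent but not yet ACKed
        return Math.max(0L, congWin - inFlight);
    }

    /** True if a segment of the given size fits in the remaining window. */
    static boolean maySend(long lastByteSent, long lastByteAcked, long congWin, long segmentBytes) {
        return segmentBytes <= usableWindow(lastByteSent, lastByteAcked, congWin);
    }

    public static void main(String[] args) {
        // 2000 bytes in flight and CongWin = 4000 bytes leaves room for 2000 more.
        System.out.println(usableWindow(5000, 3000, 4000));
        System.out.println(maySend(5000, 3000, 4000, 1000));
        // After CongWin is cut to 2000 bytes, the sender must wait for ACKs.
        System.out.println(maySend(5000, 3000, 2000, 1));
    }
}
```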

                                                                                                                                              3 Transport Layer 105Comp 361 Spring 2005

TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (axis marks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
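
The sawtooth produced by AIMD can be reproduced with a short loop. This is a sketch only: the class and method names are hypothetical, CongWin is counted in whole MSS units, one loop step stands for one RTT, and the floor of 1 MSS after halving is an assumption of the sketch.

```java
import java.util.Arrays;

// Sketch of AIMD: +1 MSS per loss-free RTT, halve CongWin after a loss event.
// Names are hypothetical; CongWin is counted in MSS units.
final class AimdSketch {

    /** Returns CongWin at the end of each RTT, given which RTTs saw a loss event. */
    static int[] run(int initialCongWin, boolean[] lossInRtt) {
        int[] trace = new int[lossInRtt.length];
        int congWin = initialCongWin;
        for (int i = 0; i < lossInRtt.length; i++) {
            if (lossInRtt[i]) {
                congWin = Math.max(1, congWin / 2); // multiplicative decrease
            } else {
                congWin += 1;                       // additive increase
            }
            trace[i] = congWin;
        }
        return trace;
    }

    public static void main(String[] args) {
        boolean[] loss = {false, false, true, false, false, false, true, false};
        System.out.println(Arrays.toString(run(8, loss)));
    }
}
```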

                                                                                                                                              3 Transport Layer 106Comp 361 Spring 2005

                                                                                                                                              TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                                                                                              3 Transport Layer 107Comp 361 Spring 2005

                                                                                                                                              TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. Host A sends one segment in the first RTT, two segments in the second, and four segments in the third.]
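
The per-ACK increment can be checked with a short sketch. This is an illustrative model, not code from the slides: it assumes CongWin is counted in whole segments, that every segment sent in a round is ACKed, and that no loss occurs. The function name `slow_start_rounds` is hypothetical.

```python
def slow_start_rounds(rounds, mss=1):
    """Return CongWin (in MSS units) at the start of each RTT round.

    Assumes no loss and one ACK per segment sent (illustrative only).
    """
    congwin = mss
    history = []
    for _ in range(rounds):
        history.append(congwin)
        acks = congwin // mss       # one ACK comes back per segment sent this round
        for _ in range(acks):
            congwin += mss          # CongWin grows by one MSS per ACK
    return history


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```

Adding one MSS per ACK means a window of n segments produces n increments in one RTT, which is why the window doubles each round.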


So Far
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno)
Introduce new variable: threshold
threshold is initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
Two different types of loss events:

- 3 dup ACKs: cut CongWin in half and set threshold = CongWin, i.e. the new, halved value (now in standard AIMD)

- Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol TCP Tahoe treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start after a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                              The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
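
The event table above can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not an implementation of a real TCP stack: windows are measured in units of one MSS, the class and method names (`RenoSender`, `on_new_ack`, and so on) are hypothetical, and treating the third duplicate ACK as the trigger for the triple-duplicate-ACK row is an assumption about how the two rows are linked.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: windows are measured in units of one MSS


@dataclass
class RenoSender:
    congwin: float = MSS
    threshold: float = float("inf")  # "threshold initially very large"
    state: str = "SS"                # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += MSS * (MSS / self.congwin)

    def on_dup_ack(self):
        """Duplicate ACK: only the counter changes until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_dup_ack()

    def on_triple_dup_ack(self):
        """Loss detected by triple duplicate ACK (fast recovery)."""
        self.threshold = self.congwin / 2
        self.congwin = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        """Timeout: fall back to slow start."""
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender()
for _ in range(7):
    sender.on_new_ack()          # slow start: CongWin grows from 1 to 8
sender.on_triple_dup_ack()       # Threshold = 4, CongWin = 4, state CA
print(sender.congwin, sender.threshold, sender.state)  # 4.0 4.0 CA
```

A Tahoe-style sender could be modelled in the same sketch by having `on_triple_dup_ack` call `on_timeout` instead.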


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
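
The 0.75 factor follows from averaging a window that climbs linearly from W/2 to W. A minimal numeric check, assuming an idealized sawtooth that grows by one segment per RTT and an even W (the function name `average_window` is hypothetical):

```python
def average_window(w_max):
    """Average window over one sawtooth cycle from w_max/2 up to w_max.

    Assumes w_max is even and the window grows by one segment per RTT.
    """
    half = w_max // 2
    cycle = list(range(half, w_max + 1))  # one value per RTT in the cycle
    return sum(cycle) / len(cycle)


print(average_window(100))  # 75.0, i.e. 0.75 * W
```

Dividing the average window by RTT gives the average throughput of 0.75 W/RTT stated above.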


TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This requires L = 2·10^-10. Wow!
New versions of TCP for high-speed networks are needed
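
The two numbers in this example can be reproduced directly. The sketch below solves the throughput formula for L; the function names are hypothetical, and the MSS is converted from bytes to bits so that it matches a throughput given in bits per second.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight: throughput * RTT / MSS."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(int(w))          # 83333
print(f"{loss:.1e}")   # 2.1e-10
```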


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to R, with the equal-bandwidth-share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
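
The convergence toward the equal-share line can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not in the slides: both connections have the same RTT, both see every loss at the same time, and a loss occurs exactly when the combined rate exceeds the capacity. The function name `aimd_two_flows` is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Simulate two AIMD flows sharing one bottleneck (synchronized losses)."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # loss: both decrease by a factor of 2
        else:
            x1, x2 = x1 + 1, x2 + 1   # congestion avoidance: additive increase
    return x1, x2


x1, x2 = aimd_two_flows(10, 80, 100, 1000)
print(abs(x1 - x2) < 1e-6)  # True: the two rates have become nearly equal
```

Additive increase leaves the difference between the two rates unchanged, while each multiplicative decrease halves it, so the gap shrinks toward zero.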


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
  do not want rate throttled by congestion control
Instead use UDP:
  pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
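
The arithmetic behind this example is a simple proportion, assuming every connection on the link receives an equal share. A minimal sketch (the function name `app_share` is hypothetical):

```python
from fractions import Fraction


def app_share(existing, new):
    """Fraction of link rate R obtained by an app opening `new` connections
    on a link that already carries `existing` connections, assuming each
    connection gets an equal share."""
    return Fraction(new, existing + new)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```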


TCP Latency Modeling

Notation, assumptions
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size
First assume fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
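
Both fixed-window cases can be folded into one function. This is a sketch, not part of the slides: the function name is hypothetical, and it assumes K is the number of windows needed to cover the object, computed as O/(WS) rounded up, since K is not defined on these slides.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with MSS S bits, link rate R bits/s,
    round-trip time RTT seconds, and a fixed window of W segments."""
    K = math.ceil(O / (W * S))           # assumption: windows covering the object
    stall = S / R + RTT - W * S / R      # idle time after each window
    if stall <= 0:
        return 2 * RTT + O / R           # case 1: the sender never stalls
    return 2 * RTT + O / R + (K - 1) * stall  # case 2: stall after K-1 windows


# Example values (illustrative): O = 128000 bits, S = 8000 bits, R = 1 Mbps, RTT = 100 ms
print(round(fixed_window_latency(128000, 8000, 1_000_000, 0.1, 20), 3))  # 0.328
print(round(fixed_window_latency(128000, 8000, 1_000_000, 0.1, 4), 3))   # 0.556
```

With W = 20 the window takes longer to send than the ACK takes to return, so only the 2RTT + O/R terms remain. With W = 4 the sender stalls after each of the first three windows.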


                                                                                                                                              TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K - 1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


                                                                                                                                              TCP Latency Modeling Slow Start (2)

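[Figure: slow-start timeline with time at client and time at server. The client initiates the TCP connection and requests the object (one RTT each); the server sends a first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, then transmission completes and the object is delivered.]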

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times
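
The example values K = 4, Q = 2, P = 2 can be reproduced numerically. The sketch below is illustrative: the function name is hypothetical, K and Q are found by direct search from their definitions rather than from a closed form, and the link parameters (S = 1000 bits, R = 10 kbps, RTT = 0.2 s) are assumptions chosen so that Q = 2.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for an object of O bits, MSS of S bits,
    link rate R bits/s and round-trip time RTT seconds (no loss, no threshold)."""
    # K: smallest number of windows of 1, 2, 4, ... segments that cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would idle for an infinite
    # object, i.e. windows k with S/R + RTT - 2^(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


latency, K, Q, P = slow_start_latency(O=15000, S=1000, R=10000, RTT=0.2)
print(K, Q, P)            # 4 2 2
print(round(latency, 3))  # 2.2
```

Here 2 RTT contributes 0.4 s, O/R contributes 1.5 s, and the two idle periods add 0.2 s and 0.1 s.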


                                                                                                                                              TCP Latency Modeling (3)

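$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: slow-start timeline with time at client and time at server, showing TCP connection initiation, the object request, and windows of S/R, 2S/R, 4S/R and 8S/R before transmission completes and the object is delivered.]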


                                                                                                                                              TCP Latency Modeling (4)Recall K = number of windows that cover object

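How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]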

Calculation of Q, the number of idles for an infinite-size object, is similar.
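The calculation of K can be checked numerically. The sketch below is only an illustration: the function names `windows_to_cover` and `windows_closed_form` are hypothetical, O and S are taken in bits, and the first window is assumed to hold one segment with each later window doubling, as in the slow-start model above. It computes K once by direct summation and once from the closed form.

```python
import math


def windows_to_cover(object_bits: int, segment_bits: int) -> int:
    """Smallest k such that S*(2^0 + 2^1 + ... + 2^(k-1)) >= O (direct summation)."""
    if object_bits <= 0 or segment_bits <= 0:
        raise ValueError("object and segment sizes must be positive")
    k = 0
    covered = 0
    window = segment_bits  # first window carries one segment (S bits)
    while covered < object_bits:
        covered += window  # add the bits carried by the next window
        window *= 2        # slow start doubles the window each round
        k += 1
    return k


def windows_closed_form(object_bits: int, segment_bits: int) -> int:
    """Closed form K = ceil(log2(O/S + 1)); uses floating point."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


if __name__ == "__main__":
    S = 1000  # assumed segment size in bits, chosen only for illustration
    for O in (1000, 15000, 15001, 100000):
        print(O, windows_to_cover(O, S), windows_closed_form(O, S))
```

The summation version uses only integer arithmetic. The closed form relies on floating-point `log2`, so for very large O/S ratios the loop is the safer reference.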


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
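The three response-time expressions can be compared side by side with a short script. This is a minimal sketch under stated assumptions: the function names are hypothetical, O is in bits, R in bits per second, RTT in seconds, and the "sum of idle times" term is left out, so the results are lower bounds on the modeled response time.

```python
def non_persistent(O: float, R: float, RTT: float, M: int) -> float:
    """(M+1)O/R + (M+1)*2*RTT, idle times omitted."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT


def persistent(O: float, R: float, RTT: float, M: int) -> float:
    """(M+1)O/R + 3*RTT, idle times omitted."""
    return (M + 1) * O / R + 3 * RTT


def parallel_non_persistent(O: float, R: float, RTT: float, M: int, X: int) -> float:
    """(M+1)O/R + (M/X + 1)*2*RTT, idle times omitted; requires M/X to be an integer."""
    if X <= 0 or M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT


if __name__ == "__main__":
    # Illustrative values only: O = 40,000 bits, RTT = 0.1 s, M = 10, X = 5.
    O, RTT, M, X = 40_000, 0.1, 10, 5
    for R in (28_000, 100_000, 1_000_000, 10_000_000):
        print(
            f"R={R:>10} bps  "
            f"non-persistent={non_persistent(O, R, RTT, M):7.3f} s  "
            f"persistent={persistent(O, R, RTT, M):7.3f} s  "
            f"parallel={parallel_non_persistent(O, R, RTT, M, X):7.3f} s"
        )
```

All three variants share the transmission term (M+1)O/R, so they differ only in how many round trips they pay for.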


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.
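As a rough check of the low-bandwidth case, the response-time formulas from the HTTP Modeling slide can be evaluated at R = 28 Kbps. This worked example assumes 1 Kbyte = 1000 bytes (so O = 40,000 bits) and drops the idle-time term, so the values are lower bounds rather than the plotted results.

```latex
\begin{align*}
\frac{(M+1)\,O}{R} &= \frac{11 \times 40\,000}{28\,000} \approx 15.71\ \text{s} \\
T_{\text{non-persistent}} &\approx 15.71 + 11 \times 2 \times 0.1 = 17.91\ \text{s} \\
T_{\text{persistent}} &\approx 15.71 + 3 \times 0.1 = 16.01\ \text{s} \\
T_{\text{parallel}} &\approx 15.71 + \left(\tfrac{10}{5} + 1\right) \times 2 \times 0.1 = 16.31\ \text{s}
\end{align*}
```

The transmission term of about 15.7 s dominates all three totals, and persistent HTTP is only about 0.3 s ahead of parallel non-persistent connections.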


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 70) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay·bandwidth product.
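The same formulas can be evaluated for the large-RTT case at R = 10 Mbps. As before, this is a sketch that assumes O = 40,000 bits (1 Kbyte = 1000 bytes) and omits the idle-time term. With RTT = 1 sec the slow-start idle times are not negligible, so the plotted values are larger than these lower bounds.

```latex
\begin{align*}
\frac{(M+1)\,O}{R} &= \frac{11 \times 40\,000}{10^{7}} = 0.044\ \text{s} \\
T_{\text{non-persistent}} &\ge 0.044 + 11 \times 2 \times 1 = 22.044\ \text{s} \\
T_{\text{persistent}} &\ge 0.044 + 3 \times 1 = 3.044\ \text{s} \\
T_{\text{parallel}} &\ge 0.044 + \left(\tfrac{10}{5} + 1\right) \times 2 \times 1 = 6.044\ \text{s}
\end{align*}
```

Here the transmission term is negligible and the round-trip terms account for almost all of the response time, which is why reducing the number of round trips matters so much.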


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3: Summary


                                                                                                                                                TCP retransmission scenarios (more)Host A

                                                                                                                                                Seq=92 8 bytes data

                                                                                                                                                ACK=100

                                                                                                                                                loss

                                                                                                                                                tim

                                                                                                                                                eout

                                                                                                                                                Cumulative ACK scenario

                                                                                                                                                Host B

                                                                                                                                                X

                                                                                                                                                Seq=100 20 bytes data

                                                                                                                                                ACK=120

                                                                                                                                                time

                                                                                                                                                SendBase= 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver, followed by the TCP receiver action:

1. Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
    Action: delayed ACK. Wait up to 500 ms for the next segment. If no next segment arrives, send ACK.

2. Arrival of in-order segment with expected seq #. One other segment has an ACK pending.
    Action: immediately send a single cumulative ACK, ACKing both in-order segments.

3. Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
    Action: immediately send a duplicate ACK, indicating the seq # of the next expected byte.

4. Arrival of a segment that partially or completely fills the gap.
    Action: immediately send ACK, provided that the segment starts at the lower end of the gap.
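The four receiver rules above can be read as a small decision function. The following is a minimal Python sketch, not code from any real TCP stack; the function name `receiver_action`, its parameters and the returned strings are illustrative assumptions.

```python
def receiver_action(seg_seq, expected_seq, ack_pending=False, gap_exists=False):
    """Map an arriving segment to one of the four receiver actions.

    seg_seq      -- sequence number of the arriving segment
    expected_seq -- next in-order byte the receiver is waiting for
    ack_pending  -- True if an earlier in-order segment still awaits its delayed ACK
    gap_exists   -- True if out-of-order data is already buffered beyond expected_seq
    """
    if seg_seq > expected_seq:
        # Out-of-order arrival: a gap is detected.
        return "send duplicate ACK for expected_seq"
    if seg_seq < expected_seq:
        # Old data is not one of the four listed events; left out of this sketch.
        raise ValueError("case not covered by the rules")
    # From here on the segment starts exactly at expected_seq.
    if gap_exists:
        return "send ACK immediately"
    if ack_pending:
        return "send one cumulative ACK immediately"
    return "delay ACK up to 500 ms"
```

The order of the checks matters: a segment that starts at the lower end of a gap is ACKed at once even if a delayed ACK was pending.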

More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations.
• If a timeout occurs, then after the retransmission the Timeout Interval is doubled.
• Intervals grow exponentially with each consecutive timeout.
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT.
• This is a limited form of congestion control.
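A short sketch of the doubling rule may help. It assumes the usual formula TimeoutInterval = EstimatedRTT + 4 * DevRTT for the reset value (taken here to be what "as described previously" refers to); the class and method names are hypothetical, and times are whole milliseconds to keep the arithmetic exact.

```python
class RetransmissionTimer:
    """Toy model of the timeout interval only; it does not schedule anything."""

    def __init__(self, estimated_rtt_ms, dev_rtt_ms):
        self.estimated_rtt_ms = estimated_rtt_ms
        self.dev_rtt_ms = dev_rtt_ms
        self.interval_ms = self.base_interval_ms()

    def base_interval_ms(self):
        # Assumed formula: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt_ms + 4 * self.dev_rtt_ms

    def on_timeout(self):
        # After the retransmission, the interval is doubled.
        self.interval_ms *= 2
        return self.interval_ms

    def on_new_data_or_ack(self):
        # Timer restarted for new data or a received ACK: reset the interval.
        self.interval_ms = self.base_interval_ms()
        return self.interval_ms


timer = RetransmissionTimer(estimated_rtt_ms=100, dev_rtt_ms=25)  # base is 200 ms
backoff = [timer.on_timeout() for _ in range(3)]                   # 400, 800, 1600
after_ack = timer.on_new_data_or_ack()                             # back to 200
```

Three consecutive timeouts stretch the interval from 200 ms to 1600 ms, which is why the slide calls this a limited form of congestion control.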

Fast Retransmit

• The time-out period is often relatively long:
    long delay before resending a lost packet.
• Detect lost segments via duplicate ACKs:
    The sender often sends many segments back-to-back.
    If a segment is lost, there will likely be many duplicate ACKs.
• If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    fast retransmit: resend the segment before the timer expires.

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for an already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3)
            resend segment with sequence number y    /* fast retransmit */
    }
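The ACK-handling event above can be turned into a small runnable model. This is a sketch under stated assumptions, not an implementation from the slides: the class `FastRetransmitSender`, the `unacked` bookkeeping, and resetting the duplicate counter whenever SendBase advances are illustrative choices, and "resend" is modelled by recording the sequence number.

```python
class FastRetransmitSender:
    def __init__(self, send_base):
        self.send_base = send_base
        self.unacked = {}          # seq -> length, for segments still in flight
        self.dup_acks = 0          # duplicate ACKs seen for the current send_base
        self.timer_running = False
        self.retransmitted = []    # sequence numbers resent by fast retransmit

    def send(self, seq, length):
        self.unacked[seq] = length
        self.timer_running = True

    def on_ack(self, y):
        if y > self.send_base:
            # New data is acknowledged: advance SendBase.
            self.send_base = y
            self.dup_acks = 0
            self.unacked = {s: n for s, n in self.unacked.items() if s + n > y}
            # (Re)start the timer only if something is still unacknowledged.
            self.timer_running = bool(self.unacked)
        else:
            # Duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                self.retransmitted.append(y)   # fast retransmit


sender = FastRetransmitSender(send_base=92)
sender.send(92, 8)
sender.send(100, 20)
sender.send(120, 15)
sender.on_ack(100)            # ACKs the first segment
for _ in range(4):
    sender.on_ack(100)        # the third duplicate triggers one resend of seq 100
```

Only the third duplicate triggers a resend; the fourth is counted but ignored, so the segment is not retransmitted twice.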

TCP: GBN or Selective Repeat?

• Basic TCP looks a lot like GBN.
• Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range.
    This looks a lot like Selective Repeat.
• TCP is a hybrid.


TCP Flow Control

• The sender should not overwhelm the receiver's capacity to receive data.
• If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate.
• This is different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

• The receive side of a TCP connection has a receive buffer.
• The app process may be slow at reading from the buffer.
• Flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast.
• Speed-matching service: matching the send rate to the receiving app's drain rate.

TCP segment structure

[Figure: TCP segment layout, 32 bits wide, one row per line]
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes from the figure:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
• sequence and acknowledgement numbers: counting by bytes of data (not segments)
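To make the layout concrete, the fixed 20-byte part of the header can be decoded with Python's standard `struct` module. This is a minimal sketch: the function name `parse_tcp_header` and the dictionary keys are assumptions chosen for illustration, options are not parsed, and the checksum is not verified.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit word that also holds head len.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

def parse_tcp_header(segment):
    """Decode the fixed 20-byte TCP header from the start of `segment`."""
    (src_port, dst_port, seq, ack,
     offset_flags, rcv_window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])             # "!" = network (big-endian) order
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_flags >> 12) * 4,  # head len counts 32-bit words
        "flags": {name for name, bit in FLAG_BITS.items() if offset_flags & bit},
        "rcv_window": rcv_window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }

# Example: a SYN segment with no options, followed by some payload bytes.
raw = struct.pack("!HHIIHHHH", 12345, 80, 92, 0, (5 << 12) | 0x02, 4096, 0, 0) + b"payload"
hdr = parse_tcp_header(raw)
payload = raw[hdr["header_len"]:]
```

The head len field is what lets the receiver find where the variable-length options end and the application data begins.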

TCP Flow control: how it works

(Suppose the TCP receiver discards out-of-order segments.)

• Spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]
• The rcvr advertises spare room by including the value of RcvWindow in segments.
• The sender limits unACKed data to RcvWindow:
    guarantees the receive buffer doesn't overflow.
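The window arithmetic can be checked with a few numbers. The sketch below is illustrative only: `LastByteSent` and `LastByteAcked` are sender-side counters assumed here for the "unACKed data" quantity, and the function names are hypothetical.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_allowance(last_byte_sent, last_byte_acked, advertised_window):
    """How many more bytes the sender may put in flight right now."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, advertised_window - unacked)

# Receiver: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(4096, 3000, 1000)           # 2096 bytes of spare room
# Sender: 1000 bytes in flight, so 1096 more bytes may be sent.
allowance = sender_allowance(5000, 4000, window)
```

When the in-flight data already equals or exceeds the advertised window, the allowance is zero and the sender must wait.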

Technical Issue

• Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer.
• The sender does not transmit new packets until it hears RcvWindow > 0.
• The receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
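A toy simulation shows both the deadlock and the effect of one-byte probes. Everything here is a simplifying assumption made for illustration: time advances in rounds, one segment is sent per round, every segment is ACKed at once with the current RcvWindow, and the names `Receiver` and `send_all` are hypothetical.

```python
class Receiver:
    def __init__(self, rcv_buffer):
        self.rcv_buffer = rcv_buffer   # total buffer size in bytes
        self.buffered = 0              # bytes received but not yet read by the app

    def rcv_window(self):
        return self.rcv_buffer - self.buffered

    def on_segment(self, nbytes):
        """Accept what fits; return (accepted bytes, RcvWindow carried in the ACK)."""
        accepted = min(nbytes, self.rcv_window())
        self.buffered += accepted
        return accepted, self.rcv_window()

    def app_read(self, nbytes):
        self.buffered -= min(nbytes, self.buffered)


def send_all(total_bytes, receiver, app_read_schedule, probe=True):
    """Run one round per schedule entry; return (bytes delivered, probes sent)."""
    delivered = 0
    probes = 0
    window = receiver.rcv_window()     # sender's latest knowledge of RcvWindow
    for app_read in app_read_schedule:
        if delivered == total_bytes:
            break
        if window > 0:
            size = min(window, total_bytes - delivered)
        elif probe:
            size = 1                   # one-byte probe while RcvWindow = 0
            probes += 1
        else:
            size = 0                   # sender stays silent, so no ACK comes back
        if size > 0:
            accepted, window = receiver.on_segment(size)
            delivered += accepted
        receiver.app_read(app_read)    # the app drains the buffer after this round
    return delivered, probes


schedule = [0, 0, 4, 0, 0, 4, 0, 0]    # bytes the app reads after each round
with_probes = send_all(10, Receiver(4), schedule)                  # (10, 5)
without_probes = send_all(10, Receiver(4), schedule, probe=False)  # (4, 0): stuck
```

Without probes the sender never learns that the application drained the buffer, so the transfer stalls after the first 4 bytes.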

Note on UDP

• UDP has no flow control.
• UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
• Initialize TCP variables:
    seq #s
    buffers, flow control info (e.g. RcvWindow)
• Client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
• Server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake

Step 1: client end system sends a TCP SYN control segment to the server.
    Specifies client_isn, the initial seq #.
    No application data.

Step 2: server end system receives the SYN and replies with a SYNACK control segment.
    ACKs the received SYN.
    Allocates buffers.
    Replies with client_isn+1 in the ACK field to signal synchronization.
    Specifies server_isn.
    No application data.

TCP Connection Management (cont)

Step 3: client end system receives the SYNACK and replies with SYN=0 and server_isn+1 in the ACK field.
    Allocates buffers.
    Can include application data.
    SYN=0 signals that the connection is established.
    server_isn+1 signals that the client is synchronized.

[Figure: three-way handshake]
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
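The sequence and acknowledgement numbers exchanged in the three steps can be traced with a tiny model. This sketch only builds the three control segments; the `Segment` dataclass and the function `three_way_handshake` are hypothetical names, and no real sockets are involved.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None   # None means the ACK field is not valid

def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and final ACK segments in the order they are sent."""
    # Step 1: client -> server, connection request.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server -> client, connection granted; ACKs the SYN.
    synack = Segment(syn=1, seq=server_isn, ack=syn.seq + 1)
    # The client checks that the server acknowledged its own ISN.
    if synack.ack != client_isn + 1:
        raise ValueError("SYNACK does not acknowledge client_isn")
    # Step 3: client -> server, SYN=0 and server_isn+1 in the ACK field.
    final_ack = Segment(syn=0, seq=client_isn + 1, ack=synack.seq + 1)
    return [syn, synack, final_ack]

segments = three_way_handshake(client_isn=1000, server_isn=5000)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm that they are synchronized.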

TCP Connection Management (cont)

Closing a connection

Client closes socket: clientSocket.close();

Step 1: client end system sends a TCP FIN control segment to the server.

Step 2: server receives the FIN and replies with an ACK. It closes the connection and sends a FIN.

[Figure: connection close. The client sends FIN; the server replies with ACK and then sends its own FIN; the client replies with ACK and enters timed wait before it is closed.]

TCP Connection Management (cont)

Step 3: client receives the FIN and replies with an ACK.
    Enters "timed wait", during which it will respond with an ACK to received FINs (that might arrive if the ACK gets lost).
    Closes down after the timed wait.

Step 4: server receives the ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

[Figure: the same FIN / ACK / FIN / ACK exchange with states. Client: closing, timed wait, closed. Server: closing, closed.]
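The client side of the four closing steps can be sketched as a small state machine. The state names below follow the conventional TCP names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) and are an assumption, since the slides label the states only as closing, timed wait and closed; the class `ClosingClient` is hypothetical and timers are not modelled.

```python
class ClosingClient:
    def __init__(self):
        self.state = "ESTABLISHED"
        self.sent = []                      # control segments sent by the client

    def close(self):
        # Step 1: the client sends FIN.
        self.sent.append("FIN")
        self.state = "FIN_WAIT_1"

    def on_segment(self, kind):
        if self.state == "FIN_WAIT_1" and kind == "ACK":
            # The server ACKed our FIN; now wait for the server's FIN.
            self.state = "FIN_WAIT_2"
        elif self.state == "FIN_WAIT_2" and kind == "FIN":
            # Step 3: reply with ACK and enter timed wait.
            self.sent.append("ACK")
            self.state = "TIME_WAIT"
        elif self.state == "TIME_WAIT" and kind == "FIN":
            # Our ACK was lost and the server resent its FIN: ACK it again.
            self.sent.append("ACK")

    def on_timed_wait_expired(self):
        if self.state == "TIME_WAIT":
            self.state = "CLOSED"


client = ClosingClient()
client.close()
client.on_segment("ACK")
client.on_segment("FIN")
client.on_segment("FIN")        # retransmitted FIN during timed wait
client.on_timed_wait_expired()
```

The timed wait exists so that a lost final ACK does not leave the server waiting forever: the client is still around to answer a retransmitted FIN.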

TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                                                A few special cases

                                                                                                                                                Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                                                It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                                                                                                Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!


Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load: original data plus retransmissions.

"costs" of congestion:
(b) and (c): more work (retransmissions) for given "goodput"
(c): unneeded retransmissions; link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when packet dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                                                Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
  single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
  NI bit: no increase in rate (mild congestion)
  CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
  small exception (see next page)

ABR: available bit rate:
"elastic service"
if sender's path "underloaded":
  sender should use available bandwidth
if sender's path congested:
  sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
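The ER and EFCI/CI rules above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the `RMCell` and `Switch` classes, their field names, and the rate values are hypothetical and do not come from any ATM library, and every switch here lowers ER to its own supportable rate, which is one simple way to model "congested switch may lower ER".

```python
from dataclasses import dataclass

@dataclass
class RMCell:
    er: float          # explicit rate requested by the sender (arbitrary units)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit (unused in this sketch)

@dataclass
class Switch:
    supportable_rate: float   # hypothetical: highest rate this switch can carry
    congested: bool = False   # whether it marks EFCI=1 on data cells

def pass_rm_cell(cell, path):
    """Each switch lowers ER to what it can support, leaving the path minimum."""
    for switch in path:
        cell.er = min(cell.er, switch.supportable_rate)
    return cell

def data_cell_efci(path):
    """EFCI on a data cell ends up 1 if any switch on the path is congested."""
    return any(switch.congested for switch in path)

def destination_returns(cell, preceding_data_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_efci:
        cell.ci = True
    return cell

path = [Switch(100.0), Switch(40.0, congested=True), Switch(70.0)]
cell = pass_rm_cell(RMCell(er=155.0), path)
cell = destination_returns(cell, data_cell_efci(path))
print(cell.er, cell.ci)  # 40.0 True
```

The sender would then send at the returned ER (the minimum along the path) and treat CI=1 as a signal to back off.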


TCP Congestion Control
end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
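As a quick check of the relation throughput = w · MSS / RTT, here is a minimal sketch. The function name and the sample values (10 segments, 1460-byte MSS, 100 ms RTT) are illustrative assumptions, not taken from the slide.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    return w_segments * mss_bytes / rtt_seconds

# Illustrative values only.
rate = window_throughput(w_segments=10, mss_bytes=1460, rtt_seconds=0.1)
print(f"{rate:.0f} bytes/sec")  # 146000 bytes/sec
```

Doubling w doubles the rate, which is why adjusting the window is enough to control the sending rate.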


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin
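The limit LastByteSent - LastByteAcked ≤ CongWin can be read as a send-side check. The sketch below is an assumption about how a sender might apply it: `can_send` is a hypothetical helper that asks whether the inequality would still hold after sending one more segment.

```python
def can_send(last_byte_sent, last_byte_acked, cong_win, segment_size):
    """True if sending segment_size more bytes keeps unacked data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked   # bytes sent but not yet acked
    return in_flight + segment_size <= cong_win

print(can_send(3000, 1000, cong_win=4000, segment_size=1000))  # True
print(can_send(5000, 1000, cong_win=4000, segment_size=1000))  # False
```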

How does sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase, Multiplicative Decrease
slow start = CongWin set to 1 and then grows exponentially
conservative after timeout events


TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing; also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (8, 16, 24 Kbytes) vs. time for a long-lived TCP connection]
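The sawtooth that AIMD produces can be traced with a few lines of code. This is a sketch under simplifying assumptions: the window is counted in whole MSS units, one step is one RTT, and the rounds in which a loss occurs are supplied by hand (`loss_rounds` is a hypothetical parameter, not something TCP knows in advance).

```python
def aimd_trace(rounds, loss_rounds, start=8):
    """CongWin (in MSS units) at the start of each RTT under pure AIMD."""
    cong_win = start
    trace = []
    for rtt in range(rounds):
        trace.append(cong_win)
        if rtt in loss_rounds:
            cong_win = max(1, cong_win // 2)   # multiplicative decrease
        else:
            cong_win += 1                      # additive increase: +1 MSS per RTT
    return trace

print(aimd_trace(rounds=12, loss_rounds={4, 9}))
# [8, 9, 10, 11, 12, 6, 7, 8, 9, 10, 5, 6]
```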


                                                                                                                                                TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
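The 20 kbps figure in the example follows from sending one MSS per RTT; worked out (taking 1 kbps = 1000 bits/s):

```latex
\frac{MSS}{RTT}
  = \frac{500\ \text{bytes} \times 8\ \text{bits/byte}}{0.2\ \text{s}}
  = 20{,}000\ \text{bits/s}
  = 20\ \text{kbps}
```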


                                                                                                                                                TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments, one batch per RTT]
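Why does incrementing CongWin once per ACK double it every RTT? A minimal sketch, assuming no losses, one ACK per segment (no delayed ACKs), and a window counted in MSS units:

```python
def slow_start_rounds(rounds):
    """CongWin (in MSS units) at the start of each RTT during slow start."""
    cong_win = 1
    per_round = []
    for _ in range(rounds):
        per_round.append(cong_win)
        acks = cong_win              # one ACK comes back per segment sent this RTT
        for _ in range(acks):
            cong_win += 1            # +1 MSS for every ACK received
    return per_round

print(slow_start_rounds(4))  # [1, 2, 4, 8]
```

Each RTT delivers as many ACKs as there were segments in the window, so adding one MSS per ACK adds a full window's worth per RTT.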


So far:
Slow-Start ramps up exponentially
Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:
  3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treats both types of loss event the same: it always goes to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


                                                                                                                                                Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
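The difference between Reno and Tahoe in the last two rules comes down to a single branch. The sketch below is illustrative only: `on_loss`, its string arguments, and the use of integer MSS units are assumptions made for the example, and the floor of 1 on the threshold is an added safeguard rather than something stated in the summary.

```python
def on_loss(cong_win, event, variant="reno"):
    """Return (new_cong_win, new_threshold) in MSS units after a loss event."""
    threshold = max(cong_win // 2, 1)          # Threshold = CongWin/2 (floored at 1)
    if event == "triple_dup_ack" and variant == "reno":
        return threshold, threshold            # Reno: CongWin = Threshold, skip slow start
    return 1, threshold                        # timeout, or any loss under Tahoe

print(on_loss(16, "triple_dup_ack", "reno"))   # (8, 8)
print(on_loss(16, "timeout", "reno"))          # (1, 8)
print(on_loss(16, "triple_dup_ack", "tahoe"))  # (1, 8)
```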


                                                                                                                                                The Big Picture


TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
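The event/state/action table above maps naturally onto a small event-driven class. The following is a sketch, not a real TCP implementation: the class name `RenoSender`, the 1000-byte MSS, and the initial threshold are hypothetical, and resetting the duplicate-ACK counter when new data is acked is an assumption the table does not spell out.

```python
MSS = 1000  # bytes; illustrative value

class RenoSender:
    def __init__(self, threshold=64 * MSS):
        self.cong_win = MSS
        self.threshold = threshold
        self.state = "SS"          # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0          # assumption: counter restarts on new data
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * MSS / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)   # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0

s = RenoSender(threshold=4 * MSS)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win)  # CA 5000
```

In congestion avoidance each ACK adds MSS·(MSS/CongWin) bytes, so a full window of ACKs adds roughly one MSS per RTT, which is the additive increase the table describes.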


                                                                                                                                                TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2·RTT)
Average throughput: 0.75 W/RTT
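The factor 0.75 follows if we assume the window climbs linearly from W/2 back to W between loss events (additive increase), so that its time average is the midpoint of the two values:

```latex
\bar{W} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W
\qquad\Longrightarrow\qquad
\text{average throughput} = \frac{\bar{W}}{RTT} = 0.75\,\frac{W}{RTT}
```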


                                                                                                                                                TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$

which requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
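Both numbers on this slide can be reproduced directly. The sketch below assumes the slide's loss-rate relation reads throughput = 1.22 · MSS / (RTT · √L), with MSS in bits and throughput in bits per second; the function names are made up for the example.

```python
import math

def required_window(target_bps, rtt_s, mss_bits):
    """Segments that must be in flight to sustain target_bps."""
    return target_bps * rtt_s / mss_bits

def throughput_bps(mss_bits, rtt_s, loss_rate):
    """Throughput predicted by 1.22 * MSS / (RTT * sqrt(L))."""
    return 1.22 * mss_bits / (rtt_s * math.sqrt(loss_rate))

def required_loss_rate(target_bps, rtt_s, mss_bits):
    """Solve the relation above for L."""
    return (1.22 * mss_bits / (rtt_s * target_bps)) ** 2

mss_bits = 1500 * 8
w = required_window(10e9, 0.100, mss_bits)
loss = required_loss_rate(10e9, 0.100, mss_bits)
print(f"W = {w:.0f} segments, L = {loss:.1e}")  # W = 83333 segments, L = 2.1e-10
```

A loss rate near 2·10⁻¹⁰ means roughly one lost segment in every five billion, which is why the slide calls for new TCP versions on high-speed paths.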


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives a slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
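The convergence argument can be checked numerically. The sketch below is only an illustration under simplifying assumptions that are not on the slide: both sessions see loss at the same instant (whenever their combined rate exceeds the capacity), both increase by the same hypothetical `step` per round, and rates are plain numbers rather than windows. Under those assumptions the gap between the two rates is unchanged by additive increase and halved by every loss, so the sessions drift toward the equal-share line.

```python
def aimd_two_flows(x1, x2, capacity, rounds=500, step=1.0):
    """Simulate two synchronized AIMD flows sharing one bottleneck link.

    x1, x2   -- starting rates of the two flows (same units as capacity)
    capacity -- bottleneck rate R
    Returns the two rates after `rounds` rounds.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss event: multiplicative decrease, both rates are halved.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: additive increase, both rates grow by the same step.
            x1, x2 = x1 + step, x2 + step
    return x1, x2


if __name__ == "__main__":
    a, b = aimd_two_flows(x1=80.0, x2=10.0, capacity=100.0)
    print(f"flow 1: {a:.3f}, flow 2: {b:.3f}, gap: {abs(a - b):.6f}")
```

Starting from a very uneven split (80 versus 10), the gap shrinks by half at each loss event, which is the geometric picture of the trajectory moving toward the equal bandwidth share line.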


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP connection, gets rate R/10
  • new app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections)
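The arithmetic in the example can be reproduced in a few lines. This is a sketch that assumes every TCP connection on the link receives exactly the same share; the function name `app_share` is made up for illustration.

```python
from fractions import Fraction


def app_share(existing, new):
    """Fraction of the link rate R obtained by an app that opens `new`
    connections on a link already carrying `existing` connections,
    assuming all connections get an equal share."""
    return Fraction(new, existing + new)


if __name__ == "__main__":
    print(app_share(9, 1))   # 1/10
    print(app_share(9, 11))  # 11/20
```

With 11 connections the new app holds 11 of the 20 connections, slightly more than half of R, even though it is a single application.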


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size
• First assume a fixed congestion window of W segments
• Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent.
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent.
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object.
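The two cases can be folded into one small function. The following is a sketch, not part of the slides: it assumes K is the object size divided by the window size, rounded up, and the function and parameter names are chosen here for illustration. The sample numbers (8000-bit segments, a 1 Mbps link, a 100 ms RTT) are likewise assumptions.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds to fetch an object of O bits.

    S   -- segment size (MSS) in bits
    R   -- link rate in bits per second
    RTT -- round-trip time in seconds
    W   -- fixed congestion window, in segments
    """
    # Assumption: K windows of W segments each are needed to cover the object.
    K = math.ceil(O / (W * S))
    # Time the server waits after each window before the first ACK arrives.
    stall = S / R + RTT - W * S / R
    latency = 2 * RTT + O / R
    if stall > 0:
        # Case 2: the server stalls after each of the first K-1 windows.
        latency += (K - 1) * stall
    return latency


if __name__ == "__main__":
    # Case 1: a large window keeps the pipe full.
    print(fixed_window_latency(O=80_000, S=8_000, R=1_000_000, RTT=0.1, W=20))
    # Case 2: a small window forces the server to wait for ACKs.
    print(fixed_window_latency(O=80_000, S=8_000, R=1_000_000, RTT=0.1, W=2))
```

With W = 2 the server stalls four times for 92 ms each, which more than doubles the latency compared with the large-window case.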


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server must wait for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K - 1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. Events: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. One RTT is marked.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit the object
• time the server idles due to slow start

Server idles P = min{K-1, Q} times.
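The example can be reproduced with a short calculation. This is a sketch under stated assumptions: the segment size, link rate and RTT below are not given on the slide and were picked so that Q comes out as 2, and Q is found by counting the windows after which the idle time S/R + RTT - 2^(k-1) S/R is still positive.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for an object of O bits under slow start.

    S is the MSS in bits, R the link rate in bits/s, RTT in seconds.
    Assumes no loss and no slow-start threshold, and RTT, S, R > 0.
    """
    # K: number of windows (1, 2, 4, ... segments) that cover the object.
    K = math.ceil(math.log2(O / S + 1))
    # Q: number of stalls for an infinitely large object, i.e. how many
    # k >= 1 satisfy S/R + RTT - 2**(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


if __name__ == "__main__":
    # Hypothetical values: 15 segments of 8000 bits, 1 Mbps link, RTT = 16 ms.
    K, Q, P, latency = slow_start_latency(O=120_000, S=8_000, R=1_000_000, RTT=0.016)
    print(K, Q, P, round(latency, 6))
```

For these values the function returns K = 4, Q = 2 and P = 2, matching the example, and the slow-start stalls add 24 ms on top of 2RTT + O/R.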


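TCP Latency Modeling (3)

$\dfrac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\dfrac{S}{R}$ = time to transmit the kth window

$\left[\dfrac{S}{R} + RTT - 2^{k-1}\dfrac{S}{R}\right]^{+}$ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timing diagram with time at client and time at server. Events: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. One RTT is marked.]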
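As a sanity check on the last step of the derivation, the idle times can be summed term by term and compared with the closed form. The helper names and the numeric values (S/R = 8 ms, RTT = 16 ms, K = 4) are assumptions made for this sketch only.

```python
def idle_times(S, R, RTT, K):
    """Idle time after each of the first K-1 windows:
    [S/R + RTT - 2**(k-1) * S/R]^+ for k = 1 .. K-1."""
    return [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)]


def closed_form_idle(S, R, RTT, P):
    """Closed form for the total idle time: P[RTT + S/R] - (2**P - 1) S/R."""
    return P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    stalls = idle_times(S=8_000, R=1_000_000, RTT=0.016, K=4)
    P = sum(1 for t in stalls if t > 0)  # number of windows followed by a stall
    print(stalls, P)
    print(sum(stalls), closed_form_idle(8_000, 1_000_000, 0.016, P))
```

The third window is long enough that the ACK arrives before it finishes, so its idle time is clipped to zero and only the first P = 2 terms contribute to the sum.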


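TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, the number of idles for an infinite-size object, is similar.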
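The closed form for K can be compared against a direct search over window counts. This is a minimal sketch with hypothetical function names; it assumes O is a whole number of segments so that the floating-point logarithm is well behaved.

```python
import math


def windows_needed(O, S):
    """Smallest k with S * (2**0 + 2**1 + ... + 2**(k-1)) >= O, by direct search."""
    k, sent = 0, 0
    while sent < O:
        sent += 2 ** k * S  # the (k+1)-th window carries 2**k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


if __name__ == "__main__":
    for segments in (1, 3, 4, 15, 16):
        O = segments * 1_000
        print(segments, windows_needed(O, 1_000), windows_closed_form(O, 1_000))
```

An object of 15 segments fits exactly into windows of 1, 2, 4 and 8 segments (K = 4), while a 16th segment forces a fifth window.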


HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive the base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for the base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
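The three response-time formulas are easy to compare side by side. The sketch below treats the slow-start idle time as a caller-supplied number (`idle`, defaulting to 0), which is a simplification since the real idle time differs between the three schemes. The sample values are assumptions: a 28 kbps link, a 100 ms RTT, and 5 Kbytes taken as 40,000 bits.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response time in seconds for a page with one base file and M images.

    O    -- size of each object in bits
    R    -- link rate in bits per second
    RTT  -- round-trip time in seconds
    X    -- number of parallel connections (M/X assumed to be an integer)
    idle -- assumed sum of slow-start idle times, applied to every scheme
    """
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }


if __name__ == "__main__":
    times = http_response_times(O=40_000, R=28_000, RTT=0.1, M=10, X=5)
    for scheme, seconds in times.items():
        print(f"{scheme}: {seconds:.2f} s")
```

At this low link rate the (M+1)O/R term is over 15 seconds, so the differences in RTT overhead between the three schemes are small by comparison.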


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer (last revised 160305)
• Chapter 3 outline
• Transport services and protocols
• Transport vs network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN: Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3: Summary

TCP ACK generation [RFC 1122, RFC 2581]

Each event at the receiver is paired with the TCP receiver action it triggers:

Event: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
Action: Delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK.

Event: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
Action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event: Arrival of out-of-order segment with higher-than-expected seq #. Gap detected.
Action: Immediately send duplicate ACK, indicating seq # of next expected byte.

Event: Arrival of segment that partially or completely fills gap.
Action: Immediately send ACK, provided that segment starts at lower end of gap.
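The four event/action pairs above can be sketched as a small receiver-side decision routine. This is only an illustrative sketch, not code from the slides: the class `AckReceiver`, its field names, and the string return values are hypothetical, and the 500 ms delayed-ACK timer is modeled by an explicit call to `on_delayed_ack_timer` rather than a real clock.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AckReceiver:
    """Toy receiver that picks an ACK action for each arriving segment."""
    expected: int = 0              # next in-order byte the receiver waits for
    ack_pending: bool = False      # one in-order segment is awaiting a delayed ACK
    buffered: Dict[int, int] = field(default_factory=dict)  # out-of-order: start -> end

    def on_segment(self, seq: int, length: int) -> str:
        end = seq + length
        if seq > self.expected:
            # Out-of-order segment, gap detected: duplicate ACK immediately.
            self.buffered[seq] = max(end, self.buffered.get(seq, end))
            self.ack_pending = False   # the duplicate ACK also covers pending data
            return f"dup ACK {self.expected}"
        if end <= self.expected:
            # Old data already received (not a table row): simply re-ACK.
            return f"dup ACK {self.expected}"
        # Segment starts at the expected byte (the lower end of any gap).
        filled_gap = bool(self.buffered)
        self.expected = end
        for start in sorted(self.buffered):
            if start > self.expected:
                break
            # Buffered data is now contiguous, so the cumulative ACK covers it.
            self.expected = max(self.expected, self.buffered.pop(start))
        if filled_gap or self.ack_pending:
            self.ack_pending = False
            return f"ACK {self.expected}"   # immediate cumulative ACK
        self.ack_pending = True
        return "delay ACK"                  # wait up to 500 ms for the next segment

    def on_delayed_ack_timer(self) -> Optional[str]:
        """Called when the delayed-ACK wait expires with no next segment."""
        if self.ack_pending:
            self.ack_pending = False
            return f"ACK {self.expected}"
        return None
```

For example, segments arriving at byte offsets 0, 100, 300, 200 (100 bytes each) produce a delayed ACK, a cumulative ACK, a duplicate ACK, and then an immediate ACK once the gap is filled.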

More on Sender Policies

Doubling the Timeout Interval
  Used by most TCP implementations
  If a timeout occurs, then after retransmission the Timeout Interval is doubled
  Intervals grow exponentially with each consecutive timeout
  When the timer is restarted because of (i) new data from above or (ii) ACK received, the Timeout Interval is reset as described previously using EstimatedRTT and DevRTT
  Limited form of Congestion Control
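A minimal sketch of the doubling policy follows. It assumes the usual rule TimeoutInterval = EstimatedRTT + 4 * DevRTT for the "reset as described previously" step; that formula is not restated on this slide, so treat it as an assumption here. The class and method names are hypothetical.

```python
class RetransmitTimer:
    """Toy model of the timeout-interval policy (names are illustrative)."""

    def __init__(self, estimated_rtt: float, dev_rtt: float):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.interval = self._from_estimates()

    def _from_estimates(self) -> float:
        # Assumed reset rule: EstimatedRTT + 4 * DevRTT.
        return self.estimated_rtt + 4 * self.dev_rtt

    def on_timeout(self) -> float:
        # After retransmitting, double the interval (exponential growth
        # across consecutive timeouts).
        self.interval *= 2
        return self.interval

    def on_restart(self) -> float:
        # Timer restarted by new data from above or by a received ACK:
        # fall back to the value derived from the RTT estimates.
        self.interval = self._from_estimates()
        return self.interval
```

With EstimatedRTT = 1.0 s and DevRTT = 0.25 s the interval starts at 2.0 s, grows to 4.0 s and 8.0 s after two consecutive timeouts, and returns to 2.0 s once an ACK restarts the timer.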

Fast Retransmit

Time-out period often relatively long:
  long delay before resending lost packet

Detect lost segments via duplicate ACKs:
  Sender often sends many segments back-to-back
  If a segment is lost, there will likely be many duplicate ACKs

If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  fast retransmit: resend segment before timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {
        // a duplicate ACK for already ACKed segment
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            // fast retransmit
            resend segment with sequence number y
        }
    }
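The pseudocode above can be turned into a small runnable sketch. The class `FastRetransmitSender` and its attributes are hypothetical names, the timer is reduced to a boolean flag, and a single duplicate-ACK counter is kept for the current `send_base` (ACKs below `send_base` are ignored), which is a simplification of the per-y count on the slide.

```python
from typing import Optional


class FastRetransmitSender:
    """Toy sender-side ACK handler with fast retransmit (illustrative only)."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base: int = 0):
        self.send_base = send_base   # oldest unacknowledged byte
        self.next_seq = send_base    # next byte to be sent
        self.dup_acks = 0            # duplicate ACKs seen for send_base
        self.timer_running = False

    def send(self, length: int) -> None:
        """Record that `length` new bytes were handed to the network."""
        self.next_seq += length
        self.timer_running = True

    def on_ack(self, y: int) -> Optional[int]:
        """Process an ACK; return the seq # to retransmit, or None."""
        if y > self.send_base:
            # New data acknowledged: slide the window and reset the count.
            self.send_base = y
            self.dup_acks = 0
            # Restart the timer only if segments are still outstanding.
            self.timer_running = self.send_base < self.next_seq
            return None
        if y == self.send_base:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == self.DUP_ACK_THRESHOLD:
                return y             # fast retransmit before the timer expires
        return None
```

If five 100-byte segments are sent and the second one is lost, the sender sees ACK 100 once as new data and then three more times as duplicates; the third duplicate triggers a retransmission of the segment starting at byte 100.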

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range
  This looks a lot like Selective Repeat

TCP is a hybrid

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down transmission rate to accommodate receiver's rate
Different from Congestion Control, whose purpose was to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)

TCP Flow Control

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

speed-matching service: matching the send rate to the receiving app's drain rate

                                                                                                                                                  TCP segment structure

[Figure: TCP segment layout, 32 bits wide. Fields in order:]
source port #, dest port #
sequence number
acknowledgement number
head len, not used, flag bits U A P R S F, receive window
checksum, urg data pointer
options (variable length)
application data (variable length)

Notes on the fields:
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection establishment (setup, teardown commands)
receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence and acknowledgement numbers: counting by bytes of data (not segments)

TCP Flow Control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
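
The window arithmetic above can be sketched in a few lines of Java. This is only an illustration, not TCP's actual implementation: the class and method names are hypothetical, and LastByteSent and LastByteAcked are assumed names for the sender-side counters (the slide only says "unACKed data").

```java
// Sketch of the flow-control arithmetic; names are illustrative only.
final class FlowControlSketch {
    // Spare room the receiver advertises:
    // RcvWindow = RcvBuffer - (LastByteRcvd - LastByteRead)
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    // Bytes the sender may still put in flight. It keeps
    // (LastByteSent - LastByteAcked), the unACKed data, at or below RcvWindow.
    // LastByteSent and LastByteAcked are assumed names, not taken from the slide.
    static long sendable(long rcvWindow, long lastByteSent, long lastByteAcked) {
        long unacked = lastByteSent - lastByteAcked;
        return Math.max(0, rcvWindow - unacked);
    }

    public static void main(String[] args) {
        long window = rcvWindow(4096, 3000, 1000);   // 4096 - 2000 = 2096
        System.out.println("RcvWindow = " + window);
        System.out.println("sendable  = " + sendable(window, 5000, 4000)); // 2096 - 1000 = 1096
    }
}
```

With a 4096-byte buffer holding 2000 unread bytes, the receiver advertises 2096 bytes; a sender with 1000 bytes already in flight may send 1096 more.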

                                                                                                                                                  Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACK'd ALL packets in its buffer.
The sender does not transmit new packets until it hears RcvWindow > 0.
The receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiving app will read from the buffer, freeing space, and RcvWindow > 0 will be transmitted back to the sender.
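
The deadlock and its fix can be illustrated with a toy tick-based model. This is a simplified sketch under stated assumptions, not how a real TCP stack is structured: time advances in discrete ticks, the receive buffer starts full, the app drains a fixed number of bytes per tick once it starts reading, and the probe byte itself is ignored. All names and parameters are hypothetical.

```java
// Toy model of zero-window probing; all names and parameters are hypothetical.
final class ZeroWindowProbeSketch {
    // Returns the tick at which the sender first learns RcvWindow > 0,
    // or -1 if it never does within maxTicks.
    static int ticksUntilWindowOpens(int rcvBuffer, int appReadStartTick,
                                     int bytesReadPerTick, boolean senderProbes,
                                     int maxTicks) {
        int buffered = rcvBuffer;      // buffer starts full, so RcvWindow = 0
        int windowKnownToSender = 0;   // last RcvWindow value the sender heard
        for (int tick = 1; tick <= maxTicks; tick++) {
            // The receiving app drains the buffer once it starts reading.
            if (tick >= appReadStartTick) {
                buffered = Math.max(0, buffered - bytesReadPerTick);
            }
            int rcvWindow = rcvBuffer - buffered;
            // The receiver only sends an ACK when a segment arrives.
            // A one-byte probe triggers an ACK carrying the current RcvWindow.
            if (senderProbes) {
                windowKnownToSender = rcvWindow;
            }
            if (windowKnownToSender > 0) {
                return tick;
            }
        }
        return -1; // sender never heard RcvWindow > 0: deadlock
    }

    public static void main(String[] args) {
        System.out.println(ticksUntilWindowOpens(1000, 3, 100, true, 10));
        System.out.println(ticksUntilWindowOpens(1000, 3, 100, false, 10));
    }
}
```

Without probes the sender's view of the window stays at zero even after the app has freed space; with probes it learns the new window on the first tick after the app reads.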

                                                                                                                                                  Note on UDP

                                                                                                                                                  UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.

                                                                                                                                                  TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

                                                                                                                                                  TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  ack=server_isn+1 signals that the client is synchronized

[Figure: message exchange between client and server]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
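
The sequence and acknowledgement numbers in the three segments can be traced with a small Java sketch. It models only the fields the handshake uses; the Segment class and the method names are hypothetical, and the wrap-around at 2^32 reflects the 32-bit sequence number field rather than anything stated on the slide.

```java
// Sketch of the numbers exchanged in the three-way handshake.
// Segment and the method names are illustrative, not a real TCP API.
final class HandshakeSketch {
    static final class Segment {
        final boolean syn;
        final boolean ack;
        final long seq;
        final long ackNum;

        Segment(boolean syn, boolean ack, long seq, long ackNum) {
            this.syn = syn;
            this.ack = ack;
            this.seq = seq;
            this.ackNum = ackNum;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " ACK=" + (ack ? 1 : 0)
                    + " seq=" + seq + " ack=" + ackNum;
        }
    }

    // Sequence numbers are 32-bit values, so n+1 wraps around at 2^32.
    static long next(long n) {
        return (n + 1) & 0xFFFFFFFFL;
    }

    // Step 1: client sends SYN carrying client_isn (ack field unused).
    static Segment connectionRequest(long clientIsn) {
        return new Segment(true, false, clientIsn, 0);
    }

    // Step 2: server replies with SYNACK: its own server_isn, ack = client_isn+1.
    static Segment connectionGranted(Segment request, long serverIsn) {
        return new Segment(true, true, serverIsn, next(request.seq));
    }

    // Step 3: client replies with SYN=0, seq = client_isn+1, ack = server_isn+1.
    static Segment finalAck(Segment granted) {
        return new Segment(false, true, granted.ackNum, next(granted.seq));
    }

    public static void main(String[] args) {
        Segment s1 = connectionRequest(100);
        Segment s2 = connectionGranted(s1, 300);
        Segment s3 = finalAck(s2);
        System.out.println(s1);
        System.out.println(s2);
        System.out.println(s3);
    }
}
```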

                                                                                                                                                  TCP Connection Management (cont)

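Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: timeline. Client closes and sends FIN; server replies with ACK, then closes and sends its own FIN; client replies with ACK and enters timed wait, after which the connection is closed.]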

                                                                                                                                                  TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: timeline. Client sends FIN (closing); server replies with ACK, then sends its own FIN (closing); client replies with ACK and enters timed wait; server is closed on receiving the ACK; client is closed after timed wait.]
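
The client side of the four closing steps can be sketched as a small state machine. The state names FIN_WAIT_1, FIN_WAIT_2 and TIME_WAIT are the standard TCP names and do not appear on the slide, which only labels "closing", "timed wait" and "closed". The class, the event names, and the omission of the simultaneous-FIN case are simplifying assumptions.

```java
// Sketch of the client's states while closing a connection.
// Simultaneous close is deliberately left out; names are illustrative.
final class ClientCloseSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }
    enum Event { APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRES }

    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED:
                // Step 1: app closes the socket, client sends FIN.
                return e == Event.APP_CLOSE ? State.FIN_WAIT_1 : s;
            case FIN_WAIT_1:
                // Step 2: server ACKs the client's FIN.
                return e == Event.RCV_ACK ? State.FIN_WAIT_2 : s;
            case FIN_WAIT_2:
                // Step 3: server's FIN arrives, client replies with ACK.
                return e == Event.RCV_FIN ? State.TIME_WAIT : s;
            case TIME_WAIT:
                // A repeated FIN (our ACK was lost) is ACKed again; state unchanged.
                return e == Event.TIMED_WAIT_EXPIRES ? State.CLOSED : s;
            default:
                return s;
        }
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] events = { Event.APP_CLOSE, Event.RCV_ACK, Event.RCV_FIN,
                           Event.RCV_FIN, Event.TIMED_WAIT_EXPIRES };
        for (Event e : events) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```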

                                                                                                                                                  TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                                                                                                  A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.

                                                                                                                                                  Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always λin = λout (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

Here λin is the rate of original data, λ'in the offered load including retransmissions, and λout the goodput.

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: one plot for each of the cases (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3 (cont)

Another "cost" of congestion:
when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

                                                                                                                                                  Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

Case study: ATM ABR congestion control (cont)

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
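
The two feedback mechanisms above can be sketched as follows. This is a simplified model with hypothetical names, not ATM switch code: each switch is reduced to the single rate it can support, rates are plain integers in an arbitrary unit, and the real encoding of the two-byte ER field is ignored.

```java
// Sketch of ABR feedback: ER lowering along a path and the EFCI -> CI rule.
// All names and the integer rate unit are assumptions for illustration.
final class AbrExplicitRateSketch {
    // Each switch may lower, but never raise, the ER value in the RM cell,
    // so the value that returns is the minimum supportable rate on the path.
    static int erAfterPath(int initialEr, int[] supportableRates) {
        int er = initialEr;
        for (int rate : supportableRates) {
            er = Math.min(er, rate);
        }
        return er;
    }

    // The destination sets CI=1 if the data cell preceding the RM cell
    // had EFCI=1; a CI bit already set by a switch stays set.
    static boolean ciOnReturn(boolean ciSetBySwitch, boolean efciOfPrecedingDataCell) {
        return ciSetBySwitch || efciOfPrecedingDataCell;
    }

    public static void main(String[] args) {
        int er = erAfterPath(1000, new int[] {800, 300, 500});
        System.out.println("ER returned to sender: " + er);
        System.out.println("CI on return: " + ciOnReturn(false, true));
    }
}
```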

                                                                                                                                                  3 Transport Layer 102Comp 361 Spring 2005

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: congestion window of size Congwin]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
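The sending constraint LastByteSent - LastByteAcked ≤ CongWin can be sketched as a small check. This is only an illustration: the function name `bytes_allowed` and the byte counts in the usage line are hypothetical, and the receiver window is ignored, matching the simplifying assumption about RcvBuffer.

```python
def bytes_allowed(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """Return how many more bytes the sender may transmit right now.

    Assumption: only the congestion window limits the sender
    (the receive buffer is taken to be large enough).
    """
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet ACKed
    return max(0, cong_win - in_flight)


# 2000 bytes are in flight against a 4000-byte window.
print(bytes_allowed(last_byte_sent=3000, last_byte_acked=1000, cong_win=4000))
```

When the in-flight data already fills the window, the function returns 0 and the sender must wait for an ACK.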


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
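A minimal sketch of the AIMD rule, in units of MSS. It assumes, purely for illustration, that a loss event occurs whenever the window reaches a fixed size `loss_at_mss`; real loss depends on the network, and the function name and parameters are hypothetical.

```python
def aimd_trace(start_mss: int, loss_at_mss: int, rtts: int) -> list:
    """Congestion window (in MSS) at the start of each RTT under idealized AIMD."""
    cwnd = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        if cwnd >= loss_at_mss:
            cwnd = max(1, cwnd // 2)  # multiplicative decrease after a loss event
        else:
            cwnd += 1                 # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rtts=12))
```

The printed trace climbs by one MSS per RTT and halves at each assumed loss, which is the sawtooth shape of a long-lived connection.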


                                                                                                                                                  TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


                                                                                                                                                  TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: slow-start timeline between Host A and Host B over successive RTTs: one segment, then two segments, then four segments]
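The doubling can be reproduced with a short sketch that adds one segment to the window per ACK. It uses the example values MSS = 500 bytes and RTT = 200 msec as defaults and assumes one ACK per segment and no loss; the function name is hypothetical.

```python
def slow_start_windows(rtts: int, mss_bytes: int = 500, rtt_sec: float = 0.2) -> list:
    """Return (window in segments, send rate in bits/sec) for each RTT of slow start."""
    rows = []
    cwnd = 1  # CongWin starts at 1 MSS
    for _ in range(rtts):
        rate_bps = cwnd * mss_bytes * 8 / rtt_sec
        rows.append((cwnd, rate_bps))
        acks = cwnd   # assumption: every segment sent this RTT is ACKed
        cwnd += acks  # +1 segment per ACK doubles the window each RTT
    return rows


for window, rate in slow_start_windows(4):
    print(window, "segments,", round(rate / 1000), "kbps")
```

The first row corresponds to the 20 kbps initial rate of the example; each later row doubles it.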


So far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


Reason for treating 3 dup ACKs differently than a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol TCP Tahoe treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                                                                                                                                                  The Big Picture


TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
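The event table can be sketched as a small state machine. This is a simplified illustration, not a real TCP implementation: the class name `RenoSender`, the use of MSS as the unit, and the default threshold of 64 (a stand-in for "initially very large") are all assumptions, and retransmission itself is not modeled.

```python
MSS = 1.0  # work in units of one MSS


class RenoSender:
    """Simplified sender following the event table (hypothetical sketch)."""

    def __init__(self, threshold: float = 64.0):
        self.cong_win = MSS
        self.threshold = threshold  # arbitrary stand-in for "very large"
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Adds about 1 MSS per RTT when spread over a window of ACKs.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, MSS)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        self.threshold = max(self.cong_win / 2, MSS)
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```

Driving the object with a sequence of events makes it easy to see when the sender leaves slow start and how each kind of loss event changes CongWin and Threshold.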


                                                                                                                                                  TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2·RTT)
Average throughput: 0.75 W/RTT
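The 0.75 factor can be checked numerically: between losses the window ramps linearly from W/2 to W, so its mean is the midpoint of the two. The sketch below assumes an even W and one window size per RTT; the function name is hypothetical.

```python
def average_throughput(w_segments: int, rtt_sec: float) -> float:
    """Mean of window/RTT over one sawtooth cycle from W/2 up to W (segments/sec)."""
    low = w_segments // 2
    windows = range(low, w_segments + 1)  # one window size per RTT in the cycle
    mean_window = sum(windows) / len(windows)
    return mean_window / rtt_sec


# With W = 100 segments and RTT = 1 s the mean is 75 segments/sec, i.e. 0.75 W/RTT.
print(average_throughput(100, 1.0))
```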


                                                                                                                                                  TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

So $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
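Both numbers in this example can be recomputed from the stated inputs. The sketch below takes W = rate × RTT / MSS and inverts throughput = 1.22·MSS/(RTT·√L) for L; the function names are hypothetical.

```python
def required_window(rate_bps: float, rtt_sec: float, mss_bytes: int) -> float:
    """In-flight segments needed to sustain rate_bps over one RTT."""
    return rate_bps * rtt_sec / (mss_bytes * 8)


def required_loss_rate(rate_bps: float, rtt_sec: float, mss_bytes: int) -> float:
    """Loss rate L that solves rate = 1.22 * MSS / (RTT * sqrt(L))."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_sec * rate_bps)) ** 2


print(int(required_window(10e9, 0.1, 1500)))       # in-flight segments
print(f"{required_loss_rate(10e9, 0.1, 1500):.2e}")  # tolerable loss rate
```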


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal-bandwidth-share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
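The convergence argument can be illustrated with a toy simulation of two AIMD flows. It assumes both flows have the same RTT and see every loss at the same time (whenever their combined rate exceeds the capacity); the function name, the units, and the starting rates are hypothetical.

```python
def two_flow_aimd(x1: float, x2: float, capacity: float, rounds: int):
    """Evolve two AIMD flows sharing one bottleneck; rates in arbitrary units."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both add one unit per RTT, which leaves the gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = two_flow_aimd(x1=10, x2=70, capacity=100, rounds=500)
print(round(a, 2), round(b, 2))
```

Because additive increase preserves the difference between the two rates and each multiplicative decrease halves it, the rates drift toward the equal-share line even from a very unequal start.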


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets more than R/2 (11 of 20 connections)
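The arithmetic of the parallel-connection example can be sketched as follows, assuming every TCP connection on the link ends up with an equal share of R. The function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing: int, new_app: int) -> Fraction:
    """Fraction of link rate R obtained by an app that opens new_app connections.

    Assumption: all connections on the link get an equal share.
    """
    return Fraction(new_app, existing + new_app)


print(app_share(existing=9, new_app=1))   # share with a single connection
print(app_share(existing=9, new_app=11))  # share with 11 parallel connections
```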


TCP Latency Modeling

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
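The two cases can be folded into one function. This sketch assumes K is the number of windows that cover the object, computed here as O/(W·S) rounded up; the function name and the numbers in the usage lines are illustrative only.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for an object of O bits with a fixed window of W segments of S bits."""
    K = math.ceil(O / (W * S))       # assumption: windows needed to cover the object
    stall = S / R + RTT - W * S / R  # server idle time per window when the ACK is late
    latency = 2 * RTT + O / R
    if stall > 0:                    # case 2: window is sent before the first ACK returns
        latency += (K - 1) * stall
    return latency


# S/R = 1 s, O/R = 8 s, RTT = 2 s
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))  # case 1: no stalls
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))  # case 2: stalls between windows
```

In the second call the window of 2 segments takes 2 s to send while the first ACK needs 3 s to return, so the server idles for 1 s after each of the first K-1 windows.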


                                                                                                                                                  Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                                                                                                  Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                                                                  TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


                                                                                                                                                  TCP Latency Modeling Slow Start (2)

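[Figure: timing diagram with client and server timelines. The client initiates the TCP connection and requests the object, one RTT each. The server then sends a first window taking S/R to transmit, a second taking 2S/R, a third taking 4S/R, and a fourth taking 8S/R, after which transmission is complete and the object is delivered.]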

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit the object
• time the server idles due to slow start

Server idles P = min{K-1, Q} times.
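The example numbers can be reproduced with a short Python sketch. It assumes the idealized model on this slide (first window of one segment, each window doubling, no loss). The function names `slow_start_windows` and `idle_count` are illustrative and not part of the course material, and Q is taken as given rather than computed.

```python
def slow_start_windows(num_segments):
    """Return the number of segments sent in each successive slow-start window.

    Idealized model: the window starts at one segment and doubles each round,
    with no loss. The last window may be only partly filled.
    """
    windows = []
    size, remaining = 1, num_segments
    while remaining > 0:
        sent = min(size, remaining)
        windows.append(sent)
        remaining -= sent
        size *= 2
    return windows


def idle_count(num_windows, q):
    """P = min{K-1, Q}: the server cannot idle after the final window."""
    return min(num_windows - 1, q)


windows = slow_start_windows(15)   # O/S = 15 segments
K = len(windows)                   # number of windows covering the object
P = idle_count(K, q=2)             # Q = 2 is taken from the slide's example
print(windows, K, P)
```

With O/S = 15 the windows carry 1, 2, 4, and 8 segments, so K = 4 and P = min{3, 2} = 2, matching the example.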


                                                                                                                                                  TCP Latency Modeling (3)

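$S/R + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1} S/R$ = time to transmit the kth window

$\left[ S/R + RTT - 2^{k-1} S/R \right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right] \\
&= \frac{O}{R} + 2\,RTT + P\left[ RTT + \frac{S}{R} \right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$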

[Figure: the slow-start timing diagram again, with successive windows taking S/R, 2S/R, 4S/R, and 8S/R to transmit.]
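As a sanity check on the delay expression, the sketch below evaluates it two ways: by summing the idle times one by one, and by the closed form 2 RTT + O/R + P(RTT + S/R) - (2^P - 1)S/R. The numeric values (S = 8000 bits, R = 1 Mbps, RTT = 16 ms) are assumptions chosen only so that O/S = 15 and P = 2 as in the earlier example; they do not come from the slides.

```python
def idle_time(k, S, R, RTT):
    """Idle time after the k-th window: [S/R + RTT - 2^(k-1) * S/R]^+."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def latency_sum(O, S, R, RTT, P):
    """Delay = 2 RTT + O/R + the sum of the first P idle times."""
    return 2 * RTT + O / R + sum(idle_time(k, S, R, RTT) for k in range(1, P + 1))


def latency_closed_form(O, S, R, RTT, P):
    """Closed form; equals latency_sum when the first P idle times are all
    positive, which is the case when P = min{K-1, Q}."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical parameters: 15 segments of 8000 bits over a 1 Mbps link.
O, S, R, RTT, P = 15 * 8000, 8000, 1e6, 0.016, 2
print(f"{latency_sum(O, S, R, RTT, P):.3f} s")
print(f"{latency_closed_form(O, S, R, RTT, P):.3f} s")
```

Both functions give the same delay for these inputs, which illustrates that the last line of the derivation is just the sum written out.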


TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

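$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$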

Calculation of Q, the number of idles for an infinite-size object, is similar.
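The calculation of K, and one possible reading of Q, can be sketched in Python. The slide does not spell out Q, so `max_idles` below assumes Q is the largest k for which the idle time after the kth window, S/R + RTT - 2^(k-1) S/R, is still non-negative. That definition and all function names are assumptions made for illustration.

```python
import math


def num_windows(O, S):
    """K = ceil(log2(O/S + 1)): number of windows that cover the object."""
    return math.ceil(math.log2(O / S + 1))


def num_windows_by_search(O, S):
    """The same K found directly as min{k : (2^k - 1) * S >= O}."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def max_idles(S, R, RTT):
    """Q under the assumed definition max{k : S/R + RTT - 2^(k-1) * S/R >= 0}."""
    k = 1  # the first window always leaves a non-negative idle time
    while S / R + RTT - 2 ** k * S / R >= 0:  # check the (k+1)-th window
        k += 1
    return k


# Hypothetical parameters: 8000-bit segments, 1 Mbps link, 16 ms RTT.
O, S, R, RTT = 15 * 8000, 8000, 1e6, 0.016
K = num_windows(O, S)
Q = max_idles(S, R, RTT)
print(K, num_windows_by_search(O, S), Q, min(K - 1, Q))
```

The search version avoids floating-point logarithms, which may be preferable when O/S + 1 is close to a power of two.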


HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive the base HTML file
• 1 RTT to request and receive the M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for the base file
• M/X sets of parallel connections for the images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
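The three response-time formulas can be compared with a small Python sketch. It deliberately leaves out the "sum of idle times" term, so the numbers are lower bounds rather than the full model. The function name `http_response_times` and the conversion 5 Kbytes = 40,000 bits (taking 1 Kbyte as 1000 bytes) are assumptions.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds for the three HTTP variants.

    O: object size in bits, R: link rate in bits/sec, RTT in seconds,
    M: number of images, X: number of parallel connections.
    Slow-start idle times are ignored in this sketch.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# Hypothetical run: RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5.
for rate in (28e3, 100e3, 1e6, 10e6):
    times = http_response_times(O=40000, R=rate, RTT=0.1, M=10, X=5)
    print(f"{rate:>10.0f} bps", {name: round(t, 2) for name, t in times.items()})
```

At low rates the shared transmission term (M+1)O/R dominates all three results, while at high rates the RTT terms separate them.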


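HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]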

For low bandwidth, connection and response time are dominated by transmission time.

Persistent connections give only a minor improvement over parallel connections.


                                                                                                                                                  HTTP Response time (in seconds)

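RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]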

For larger RTT, response time is dominated by TCP establishment and slow start delays.

Persistent connections now give an important improvement, particularly in networks with a high delay•bandwidth product.


Chapter 3: Summary

Principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

Instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer, last revised 160305
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP details
• Even more TCP details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on sender policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
                                                                                                                                                  • TCP segment structure
                                                                                                                                                  • TCP Flow control how it works
                                                                                                                                                  • Technical Issue
                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                  • TCP Connection Management
                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                  • A few special cases
                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                  • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                                                                  • Approaches towards congestion control
                                                                                                                                                  • Case study ATM ABR congestion control
                                                                                                                                                  • Case study ATM ABR congestion control
                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                  • TCP Congestion Control
                                                                                                                                                  • TCP AIMD
                                                                                                                                                  • TCP Slow Start
                                                                                                                                                  • TCP Slow Start (more)
                                                                                                                                                  • Summary TCP Congestion Control
                                                                                                                                                  • The Big Picture
                                                                                                                                                  • TCP sender congestion control
                                                                                                                                                  • TCP throughput
                                                                                                                                                  • TCP Futures
                                                                                                                                                  • TCP Fairness
                                                                                                                                                  • Why is TCP fair
                                                                                                                                                  • Fairness (more)
                                                                                                                                                  • TCP Latency Modeling
                                                                                                                                                  • Fixed Congestion Window (W)
                                                                                                                                                  • Fixed congestion window (1)
                                                                                                                                                  • Fixed congestion window (2)
                                                                                                                                                  • TCP Latency Modeling Slow Start (1)
                                                                                                                                                  • TCP Latency Modeling Slow Start (2)
                                                                                                                                                  • TCP Latency Modeling (3)
                                                                                                                                                  • TCP Latency Modeling (4)
                                                                                                                                                  • HTTP Modeling
                                                                                                                                                  • Chapter 3 Summary

Comp 361, Spring 2005    3: Transport Layer 74

                                                                                                                                                    More on Sender Policies

Doubling the Timeout Interval
• Used by most TCP implementations
• If a timeout occurs, then after the retransmission the Timeout Interval is doubled
• Intervals grow exponentially with each consecutive timeout
• When the timer is restarted because of (i) new data from above or (ii) an ACK received, the Timeout Interval is reset as described previously, using EstimatedRTT and DevRTT
• Limited form of Congestion Control
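The doubling rule can be sketched in a few lines of Python. This is a minimal illustration, not a real TCP stack: the function name `next_timeout` and the event labels are hypothetical, times are in milliseconds, and the reset formula EstimatedRTT + 4·DevRTT is assumed to be the one the earlier RTT slides refer to.

```python
def next_timeout(current_timeout, estimated_rtt, dev_rtt, event):
    """Return the Timeout Interval (ms) to use after `event`.

    event == "timeout": the timer expired and the segment was retransmitted.
    event == "restart": the timer restarts because of new data from above
                        or because an ACK was received.
    """
    if event == "timeout":
        # Consecutive timeouts double the interval: exponential growth.
        return 2 * current_timeout
    if event == "restart":
        # Assumed reset formula from the RTT estimation slides.
        return estimated_rtt + 4 * dev_rtt
    raise ValueError("unknown event: " + event)


# Example: EstimatedRTT = 100 ms, DevRTT = 25 ms, then three timeouts in a row.
timeout = next_timeout(0, 100, 25, "restart")
history = [timeout]
for _ in range(3):
    timeout = next_timeout(timeout, 100, 25, "timeout")
    history.append(timeout)
```

With these numbers `history` grows as 200, 400, 800, 1600 ms; a later "restart" event brings the interval back to 200 ms.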


                                                                                                                                                    Fast Retransmit

• Time-out period often relatively long:
  • long delay before resending lost packet
• Detect lost segments via duplicate ACKs
  • Sender often sends many segments back-to-back
  • If a segment is lost, there will likely be many duplicate ACKs
• If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  • fast retransmit: resend segment before timer expires


                                                                                                                                                    Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
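The ACK handler above can be turned into a small runnable sketch. The class name `FastRetransmitSender` and the fields `next_seq_num`, `timer_running` and `retransmitted` are hypothetical. Two details are assumptions that the slide does not state: the duplicate-ACK counter is reset when new data is ACKed, and the timer is stopped when nothing is outstanding.

```python
class FastRetransmitSender:
    """Toy model of the sender-side ACK handler with fast retransmit."""

    def __init__(self, send_base, next_seq_num):
        self.send_base = send_base        # oldest unACKed byte
        self.next_seq_num = next_seq_num  # next byte to be sent (hypothetical field)
        self.dup_acks = 0                 # duplicate ACKs seen for send_base
        self.timer_running = True
        self.retransmitted = []           # sequence numbers resent early

    def on_ack(self, y):
        if y > self.send_base:
            # New data ACKed: slide the window and restart the count.
            self.send_base = y
            self.dup_acks = 0
            # Timer runs only while some segment is still unACKed.
            self.timer_running = self.send_base < self.next_seq_num
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                # Fast retransmit: resend before the timer expires.
                self.retransmitted.append(y)


sender = FastRetransmitSender(send_base=100, next_seq_num=500)
sender.on_ack(200)            # new ACK, window slides to 200
for _ in range(3):
    sender.on_ack(200)        # three duplicates trigger one retransmission
```

Comparing the counter with `== 3` rather than `>= 3` means a fourth or fifth duplicate does not trigger another resend of the same segment.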


TCP: GBN or Selective Repeat?

                                                                                                                                                    Basic TCP looks a lot like GBN

                                                                                                                                                    Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                                                    This looks a lot like Selective Repeat

                                                                                                                                                    TCP is a hybrid
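The hybrid behaviour can be illustrated with a toy receiver that buffers out-of-order segments (as in Selective Repeat) but answers every arrival with a cumulative ACK (as in GBN). This is a sketch under simplifying assumptions: the class name `BufferingReceiver` is hypothetical, segments never overlap, and the buffer is unbounded.

```python
class BufferingReceiver:
    """Toy receiver: buffers out-of-order segments, sends cumulative ACKs."""

    def __init__(self, expected=0):
        self.expected = expected  # next in-order byte the receiver is waiting for
        self.buffer = {}          # seq number -> length of buffered segment

    def on_segment(self, seq, length):
        """Process one segment and return the cumulative ACK number."""
        if seq == self.expected:
            self.expected += length
            # The gap is filled: absorb every buffered segment that now fits.
            while self.expected in self.buffer:
                self.expected += self.buffer.pop(self.expected)
        elif seq > self.expected:
            # Out of order: keep it instead of discarding it.
            self.buffer[seq] = length
        # Segments below `expected` are old duplicates and are ignored.
        return self.expected


rcv = BufferingReceiver()
acks = [
    rcv.on_segment(0, 100),    # in order
    rcv.on_segment(200, 100),  # segment at 100 is missing: duplicate ACK
    rcv.on_segment(300, 100),  # still missing: duplicate ACK
    rcv.on_segment(100, 100),  # gap filled, everything ACKed at once
]
```

Here `acks` is 100, 100, 100, 400: the two duplicate ACKs are exactly what fast retransmit counts, and the final ACK covers the whole filled range.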


                                                                                                                                                    Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control


                                                                                                                                                    TCP Flow Control

• Sender should not overwhelm receiver's capacity to receive data
• If necessary, sender should slow down transmission rate to accommodate receiver's rate
• Different from Congestion Control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

• speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

Header layout, each row 32 bits wide:

  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• Receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn't overflow
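The two rules above amount to one subtraction on each side. In the sketch below, `rcv_window` follows the slide's formula directly. `sender_may_send` and its parameters `last_byte_sent` and `last_byte_acked` are hypothetical names for the sender's bookkeeping of unACKed data.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if sending nbytes more keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= window


# Receiver: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(4096, 3000, 1000)

# Sender: 1000 bytes already in flight.
ok_small = sender_may_send(5000, 4000, window, 1000)
ok_large = sender_may_send(5000, 4000, window, 1200)
```

In this example the advertised window is 2096 bytes, so a further 1000 bytes may be sent (2000 in flight) while a further 1200 bytes may not (2200 in flight).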


                                                                                                                                                    Technical Issue

• Suppose RcvWindow = 0 and the receiver has already ACK'd ALL packets in its buffer
• Sender does not transmit new packets until it hears RcvWindow > 0
• Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
• DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
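A tiny simulation makes the deadlock and its fix concrete. It follows the slide's simplified model, in which the receiver speaks only when it has something to ACK. The function name `first_tick_sender_unblocked` and the tick-based timing are assumptions made for illustration.

```python
def first_tick_sender_unblocked(rcv_buffer, app_reads, probe):
    """Return the first tick at which the sender hears RcvWindow > 0, or None.

    rcv_buffer: receive buffer size in bytes (starts completely full).
    app_reads:  bytes the receiving application reads at each tick.
    probe:      whether the sender sends a one-byte segment every tick.
    """
    buffered = rcv_buffer  # buffer is full, so RcvWindow == 0
    for tick, nread in enumerate(app_reads):
        buffered -= min(nread, buffered)  # application drains the buffer
        if not probe:
            # Nothing arrives, so the receiver has nothing to ACK and stays silent.
            continue
        if buffered < rcv_buffer:
            buffered += 1                 # the one-byte probe fits and is accepted
        window = rcv_buffer - buffered    # carried back in the ACK of the probe
        if window > 0:
            return tick
    return None


reads = [0, 0, 500, 0]
with_probe = first_tick_sender_unblocked(1000, reads, probe=True)
without_probe = first_tick_sender_unblocked(1000, reads, probe=False)
```

With probing, the sender learns about the freed space at tick 2, the moment the application reads; without probing the function returns `None`, which is the deadlock described above.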


                                                                                                                                                    Note on UDP

                                                                                                                                                    UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.


                                                                                                                                                    TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
• initialize TCP variables:
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket("hostname", portNumber);
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  • specifies client_isn, the initial seq #
  • No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  • ACKs received SYN
  • allocates buffers
  • Replies with client_isn+1 in ACK field to signal synchronization
  • Specifies server_isn
  • No application data


                                                                                                                                                    TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  • Allocates buffers
  • Can include application data
  • SYN=0 signals that the connection is established
  • server_isn+1 signals that the client is synchronized

Message sequence (client on the left, server on the right):

  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
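The three segments of the handshake can be written out as plain data to check the sequence and ACK arithmetic. This is a sketch only: the function name and the dictionary representation are hypothetical, and no real segments are sent. The only fact added to the slides is that sequence numbers are 32-bit values, so the "+1" wraps around modulo 2^32.

```python
SEQ_MOD = 2 ** 32  # sequence and ACK numbers are 32-bit fields


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries."""
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    synack = {
        "from": "server",
        "SYN": 1,
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_MOD,  # signals synchronization with client
    }
    ack = {
        "from": "client",
        "SYN": 0,                           # connection established
        "seq": (client_isn + 1) % SEQ_MOD,
        "ack": (server_isn + 1) % SEQ_MOD,  # signals synchronization with server
    }
    return [syn, synack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
```

For these initial sequence numbers the SYNACK carries ack=1001, and the final ACK carries seq=1001 and ack=5001.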


                                                                                                                                                    TCP Connection Management (cont)

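Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

Message sequence (client on the left, server on the right):

  client -> server: FIN (client: close)
  server -> client: ACK
  server -> client: FIN (server: close)
  client -> server: ACK
  client: timed wait, then closed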


                                                                                                                                                    TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.

Enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost)

Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: the same client/server FIN and ACK exchange, marking the closing state on both sides, the client's timed wait, and the closed state at each end.]
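The four closing steps can be traced as a pair of small state machines. The sketch below is illustrative only: the state names (FIN_WAIT_1, CLOSE_WAIT, and so on) follow the standard TCP state diagram, which these slides show only as figures, and the function `close_connection` is a hypothetical name, not part of any socket API.

```python
def close_connection():
    """Trace the client-initiated close described in Steps 1-4.

    Returns the segments exchanged, in order, plus the sequence of
    states each side passes through (state names are the usual TCP ones).
    """
    client = ["ESTABLISHED"]
    server = ["ESTABLISHED"]
    segments = []

    # Step 1: client sends FIN.
    segments.append(("client", "FIN"))
    client.append("FIN_WAIT_1")

    # Step 2: server receives FIN, replies with ACK, then sends its own FIN.
    segments.append(("server", "ACK"))
    server.append("CLOSE_WAIT")
    client.append("FIN_WAIT_2")
    segments.append(("server", "FIN"))
    server.append("LAST_ACK")

    # Step 3: client receives FIN, replies with ACK, enters timed wait.
    segments.append(("client", "ACK"))
    client.append("TIME_WAIT")

    # Step 4: server receives ACK and closes; client closes after timed wait.
    server.append("CLOSED")
    client.append("CLOSED")
    return segments, client, server


segments, client_states, server_states = close_connection()
print(segments)
print(client_states)
print(server_states)
```

The trace has no timers or losses; it only fixes the order of segments and states, which is enough to see why the client, not the server, sits in timed wait.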


                                                                                                                                                    TCP Connection Management (cont)

Example: TCP server lifecycle

Example: TCP client lifecycle


                                                                                                                                                    A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment


                                                                                                                                                    Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• manifestations:
  • lost packets (buffer overflow at routers)
  • long delays (queuing in router buffers)
• a top-10 problem!


Causes/costs of congestion: scenario 1
• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2

• large delays when congested
• maximum achievable throughput


Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet


In (a), (b) and (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "Perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$, where $\lambda'_{in}$ is the offered load (original data plus retransmissions)
(c) Retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for given "goodput"
• (c): unneeded retransmissions: link carries multiple copies of packet

[Figure: panels (a), (b) and (c)]


Causes/costs of congestion: scenario 3
• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                                                    Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next page

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded":
  • sender should use available bandwidth
• if sender's path congested:
  • sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

Two-byte ER (explicit rate) field in RM cell:
• congested switch may lower ER value in cell
• sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
• signals congestion
• if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
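The way an RM cell accumulates feedback along a path can be sketched in a few lines. This is a toy model, not the ATM specification: the `RMCell` class, the `traverse` and `destination_return` functions, the congestion labels "none", "mild" and "severe", and the rate units are all assumptions made for illustration. It also assumes every switch lowers ER to its own supportable rate, whereas the slide only says a congested switch may do so.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate carried in the cell (arbitrary units)
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit (severe congestion)


def traverse(cell, switches):
    """Pass an RM cell through switches given as (supportable_rate, congestion).

    congestion is one of "none", "mild", "severe" (hypothetical labels).
    """
    for supportable_rate, congestion in switches:
        # Assumption: each switch lowers ER to what it can support.
        cell.er = min(cell.er, supportable_rate)
        if congestion == "mild":
            cell.ni = True
        elif congestion == "severe":
            cell.ci = True
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


path = [(80.0, "none"), (50.0, "mild"), (70.0, "none")]
cell = traverse(RMCell(er=100.0), path)
cell = destination_return(cell, preceding_data_cell_efci=False)
print(cell)  # ER ends at the smallest supportable rate on the path
```

Because each switch only ever lowers ER, the value that returns to the sender is the minimum over the path, which is the point the slide makes.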


TCP Congestion Control
• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion

[Figure: a window of Congwin segments]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
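As a quick numerical check of the throughput formula, the sketch below evaluates it for one segment per RTT. The function name `throughput_bytes_per_sec` is hypothetical; the values MSS = 500 bytes and RTT = 200 msec are the ones used in the slow-start example later in these slides.

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_sec):
    """Throughput when w segments of mss_bytes each are sent per RTT."""
    return w * mss_bytes / rtt_sec


rate = throughput_bytes_per_sec(w=1, mss_bytes=500, rtt_sec=0.2)
print(rate, "bytes/sec")
print(rate * 8 / 1000, "kbps")  # about 20 kbps for a one-segment window
```

Scaling w scales the rate linearly, which is why the sender controls its rate by adjusting the window rather than the segment size or the RTT.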


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events
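The sender-side limit LastByteSent - LastByteAcked ≤ CongWin can be read as a budget check before each transmission. A minimal sketch, assuming byte counters as plain integers and a hypothetical helper name `usable_window`:

```python
def usable_window(last_byte_sent, last_byte_acked, cong_win):
    """Bytes the sender may still transmit without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked  # unacknowledged bytes
    return max(0, cong_win - in_flight)


print(usable_window(last_byte_sent=3000, last_byte_acked=1000, cong_win=4000))
print(usable_window(last_byte_sent=5000, last_byte_acked=1000, cong_win=4000))
```

In the first call 2000 bytes are in flight, leaving room for 2000 more; in the second the window is already full, so the sender must wait for an ACK.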


TCP AIMD
• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
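The AIMD rule can be traced round by round. The sketch below is a simplification under stated assumptions: time advances in whole RTTs, the caller supplies the rounds in which a loss event is detected (the `loss_rounds` parameter is hypothetical), and the window is floored at 1 MSS.

```python
def aimd(cong_win, mss, rounds, loss_rounds):
    """Return CongWin (bytes) at the start of each RTT round."""
    trace = []
    for r in range(rounds):
        trace.append(cong_win)
        if r in loss_rounds:
            cong_win = max(mss, cong_win // 2)  # multiplicative decrease
        else:
            cong_win += mss                     # additive increase
    return trace


print(aimd(cong_win=8000, mss=1000, rounds=8, loss_rounds={3}))
# [8000, 9000, 10000, 11000, 5500, 6500, 7500, 8500]
```

The slow linear climb followed by a sharp halving is what produces the sawtooth shape for a long-lived connection.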


                                                                                                                                                    TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes and RTT = 200 msec
• initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


                                                                                                                                                    TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment to Host B in the first RTT, two segments in the second, and four segments in the third; time runs downward.]
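Why a per-ACK increment doubles the window every RTT is easiest to see by counting ACKs. The sketch below assumes no losses, one ACK per segment (no delayed ACKs), and a full window sent each RTT; `slow_start` is a hypothetical helper name.

```python
def slow_start(mss, rtts):
    """Return CongWin (bytes) after each RTT, starting from 1 MSS."""
    cong_win = mss
    trace = [cong_win]
    for _ in range(rtts):
        segments_sent = cong_win // mss
        for _ in range(segments_sent):  # one ACK arrives per segment sent
            cong_win += mss             # increment CongWin by 1 MSS per ACK
        trace.append(cong_win)
    return trace


print([w // 500 for w in slow_start(mss=500, rtts=3)])
# [1, 2, 4, 8]
```

A window of n segments yields n ACKs, each adding one MSS, so the window grows from n to 2n within a single RTT.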


So far:
• Slow-Start ramps up exponentially
• Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


Reason for treating 3 dup ACKs differently from a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with Congwin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


                                                                                                                                                    Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold (only in TCP Reno)

When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)
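The four summary rules fit in a small event-driven class. This is a sketch of the simplified Reno behaviour described in these slides, not a real TCP stack: the class `RenoSender` and its method names are hypothetical, window sizes are in bytes, and the per-ACK updates are the ones listed in the sender table of these slides.

```python
class RenoSender:
    """Simplified TCP Reno congestion state, following the slide summary."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss         # slow start begins at 1 MSS
        self.threshold = threshold
        self.state = "SS"           # "SS" = slow start, "CA" = congestion avoidance

    def on_new_ack(self):
        """ACK for previously unacked data."""
        if self.state == "SS":
            self.cong_win += self.mss  # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # About +1 MSS per RTT, spread over the ACKs of one window.
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_triple_dup_ack(self):
        """Loss detected by triple duplicate ACK: halve and stay in CA."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
        self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and restart slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"


s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win)   # window has passed the threshold
s.on_triple_dup_ack()
print(s.state, s.cong_win, s.threshold)
s.on_timeout()
print(s.state, s.cong_win, s.threshold)
```

A Tahoe-style sender could be modelled by having `on_triple_dup_ack` call `on_timeout`, since Tahoe handles both loss events identically.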


                                                                                                                                                    The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

                                                                                                                                                    Duplicate ACK

                                                                                                                                                    SS or CA Increment duplicate ACK count for segment being acked

                                                                                                                                                    CongWin and Threshold not changed
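The event/action table can be sketched as a small state machine. This is only an illustration: the class name `RenoSender`, the MSS value, and the initial threshold are assumptions, and the duplicate-ACK counter is folded into the triple-duplicate-ACK rule for brevity.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed value, chosen only for illustration


@dataclass
class RenoSender:
    """Hypothetical sender that follows the event table (Reno-style)."""
    cong_win: float = MSS
    threshold: float = 64 * 1024  # assumed initial threshold
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            # CongWin will not drop below 1 MSS.
            self.cong_win = max(self.threshold, MSS)
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and restart from 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Starting from `RenoSender(cong_win=MSS, threshold=4 * MSS)`, four new ACKs move the sender into congestion avoidance, three duplicate ACKs halve the window, and a timeout returns it to slow start with a window of 1 MSS.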

                                                                                                                                                    TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after loss, the window drops to W/2 and throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
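The 0.75 factor follows from averaging a window that ramps linearly from W/2 back up to W between losses. A minimal numerical sketch of that averaging, in which the function name and the sampling approach are assumptions made for illustration:

```python
def average_sawtooth_throughput(w_segments, rtt_s, steps=1000):
    """Average of window/RTT while the window ramps linearly from W/2 to W."""
    samples = [
        w_segments / 2 + (w_segments / 2) * i / steps
        for i in range(steps + 1)
    ]
    return sum(w / rtt_s for w in samples) / len(samples)


if __name__ == "__main__":
    W, RTT = 100, 0.1  # segments, seconds (illustrative values)
    print(average_sawtooth_throughput(W, RTT), 0.75 * W / RTT)
```

Both printed values should agree up to floating-point rounding, since the mean of a linear ramp is its midpoint, 0.75 W.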

                                                                                                                                                    TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{Throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
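The two numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes the loss-rate relation Throughput = 1.22 MSS / (RTT sqrt(L)) and simply solves it for L; the function names are illustrative.

```python
MSS_BITS = 1500 * 8   # 1500-byte segments
RTT_S = 0.100         # 100 ms
TARGET_BPS = 10e9     # 10 Gbps


def required_window_segments(target_bps, rtt_s, mss_bits):
    """Segments that must be in flight: W = throughput * RTT / MSS."""
    return target_bps * rtt_s / mss_bits


def required_loss_rate(target_bps, rtt_s, mss_bits):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bits / (rtt_s * target_bps)) ** 2


if __name__ == "__main__":
    print(int(required_window_segments(TARGET_BPS, RTT_S, MSS_BITS)))
    print(required_loss_rate(TARGET_BPS, RTT_S, MSS_BITS))
```

The window comes out at about 83,333 segments and the tolerable loss rate at roughly 2.1e-10, matching the rounded figures on the slide.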

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 pass through a shared bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease reduces throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
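The convergence argument can be illustrated with a toy simulation. This sketch assumes both sessions see loss at the same time (whenever their combined rate exceeds the link capacity) and that both increase by the same amount per round; the function name and parameters are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds=200, increase=1.0):
    """Synchronized AIMD for two flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow equally, so the gap is unchanged.
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


if __name__ == "__main__":
    print(aimd_two_flows(80.0, 10.0, capacity=100.0))
```

Each loss halves the difference between the two rates while additive increase leaves it unchanged, so the rates drift toward the equal-share line even from a very uneven start.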

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
- new app asks for 1 TCP, gets rate R/10
- new app asks for 11 TCPs, gets roughly R/2
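The parallel-connection example is a one-line calculation if each TCP connection is assumed to receive an equal share of the link. A minimal sketch, with an illustrative function name:

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming every connection on the link gets an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


if __name__ == "__main__":
    print(app_share(9, 1))   # one new connection among ten
    print(app_share(9, 11))  # eleven new connections among twenty
```

With 9 existing connections, one new connection yields 1/10 of R, while eleven yield 11/20 of R, slightly more than half.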

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

                                                                                                                                                    Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2RTT + O/R

                                                                                                                                                    Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
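Both fixed-window cases can be folded into one function by clamping the per-window stall time at zero. The sketch assumes K is the number of windows of W segments needed to cover the object, K = ceil(O / (W S)), which the slides do not restate here; the function name is illustrative.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency with a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT in seconds.
    """
    # Assumption: K = number of windows that cover the object.
    K = math.ceil(O / (W * S))
    # Stall after each window; zero or negative in the first case.
    stall = S / R + RTT - W * S / R
    return 2 * RTT + O / R + (K - 1) * max(stall, 0.0)


if __name__ == "__main__":
    # Illustrative values: 100 kbit object, 10 kbit segments, 1 Mbps, 100 ms RTT.
    print(fixed_window_latency(100_000, 10_000, 1e6, 0.1, W=20))  # first case
    print(fixed_window_latency(100_000, 10_000, 1e6, 0.1, W=2))   # second case
```

With W = 20 the ACKs return before the window is exhausted, so only 2RTT + O/R remains; with W = 2 the server stalls after each of the first K - 1 windows.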

                                                                                                                                                    TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

                                                                                                                                                    TCP Latency Modeling Slow Start (2)

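[Figure: client/server timing diagram. The client initiates the TCP connection (one RTT) and requests the object. The server sends a first window (S/R), a second window (2S/R), a third window (4S/R), and a fourth window (8S/R), after which transmission is complete and the object is delivered. Vertical axes: time at client, time at server.]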

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times.

                                                                                                                                                    TCP Latency Modeling (3)

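$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

where $[x]^{+} = \max(x, 0)$.

$$\begin{aligned}
\text{delay} &= 2RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$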

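[Figure: client/server timing diagram. The client initiates the TCP connection (one RTT) and requests the object. The server sends a first window (S/R), a second window (2S/R), a third window (4S/R), and a fourth window (8S/R), after which transmission is complete and the object is delivered. Vertical axes: time at client, time at server.]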
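The last step of the derivation, which replaces the sum of idle times with a closed form, can be checked numerically. The sketch below compares the two expressions for a few values of P; the function names and parameter values are illustrative assumptions.

```python
def idle_time_sum(S, R, RTT, P):
    """Sum over k = 1..P of S/R + RTT - 2^(k-1) * S/R."""
    return sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))


def idle_time_closed_form(S, R, RTT, P):
    """Closed form: P * (RTT + S/R) - (2^P - 1) * S/R."""
    return P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    S, R, RTT = 8000, 1e6, 0.1  # bits, bits/s, seconds (illustrative)
    for P in range(5):
        print(P, idle_time_sum(S, R, RTT, P), idle_time_closed_form(S, R, RTT, P))
```

The agreement comes from the geometric series 2^0 + 2^1 + ... + 2^(P-1) = 2^P - 1.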

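TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

The calculation of Q, the number of idles for an infinite-size object, is similar.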
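Putting K, Q, and P together gives a small calculator for the slow-start latency. This is a sketch under the model's assumptions (no loss, no threshold). K and Q are found by direct search rather than by a closed form: K as the smallest k with 2^k - 1 >= O/S, and Q as the number of windows whose idle time S/R + RTT - 2^(k-1) S/R is still positive. All function names are illustrative.

```python
def windows_to_cover(O, S):
    """K: smallest k such that (2^k - 1) * S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def idles_for_infinite_object(S, R, RTT):
    """Q: number of windows k with positive idle time S/R + RTT - 2^(k-1) * S/R."""
    q = 0
    while S / R + RTT - 2 ** q * S / R > 0:
        q += 1
    return q


def slow_start_latency(O, S, R, RTT):
    """Latency = 2RTT + O/R + P(RTT + S/R) - (2^P - 1) S/R with P = min(Q, K-1)."""
    K = windows_to_cover(O, S)
    Q = idles_for_infinite_object(S, R, RTT)
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    # Illustrative values: O/S = 15 segments and RTT equal to twice S/R.
    print(windows_to_cover(15_000, 1_000))
    print(idles_for_infinite_object(1_000, 1_000, 2.0))
    print(slow_start_latency(15_000, 1_000, 1_000, 2.0))
```

With O/S = 15 and RTT = 2 S/R, the search gives K = 4 and Q = 2, hence P = 2, the same values as the 15-segment example in these slides.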

HTTP Modeling

Assume Web page consists of:
    1 base HTML page (of size O bits)
    M images (each of size O bits)

Non-persistent HTTP:
    M+1 TCP connections in series
    Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
    2 RTT to request and receive base HTML file
    1 RTT to request and receive M images
    Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
    Suppose M/X is an integer
    1 TCP connection for base file
    M/X sets of parallel connections for images
    Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
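
The three response-time formulas can be compared numerically. The sketch below is a minimal illustration that deliberately leaves out the "sum of idle times" term, so it only evaluates the transmission and RTT terms. The function name is hypothetical, R is taken to be the link transmission rate in bits per second, and 1 Kbyte is assumed to be 1000 bytes.

```python
def http_response_times(o_bits: float, r_bps: float, rtt_s: float,
                        m: int, x: int) -> dict:
    """Evaluate the three response-time formulas WITHOUT idle times.

    o_bits: size O of each object in bits
    r_bps:  link rate R in bits per second (assumed meaning of R)
    rtt_s:  round-trip time in seconds
    m:      number of images M
    x:      number of parallel connections X (M/X must be an integer)
    """
    if x <= 0 or m % x != 0:
        raise ValueError("M/X must be an integer")
    transmit = (m + 1) * o_bits / r_bps  # (M+1) O/R
    return {
        "non_persistent": transmit + (m + 1) * 2 * rtt_s,
        "persistent": transmit + 3 * rtt_s,
        "parallel_non_persistent": transmit + (m // x + 1) * 2 * rtt_s,
    }


if __name__ == "__main__":
    # RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, on a 1 Mbps link.
    for name, seconds in http_response_times(5 * 1000 * 8, 1e6, 0.1, 10, 5).items():
        print(f"{name}: {seconds:.2f} s")
```

Because the idle-time term is omitted, these values understate the response time whenever slow start stalls the sender.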

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.

Chapter 3 Summary

principles behind transport layer services:
    multiplexing, demultiplexing
    reliable data transfer
    flow control
    congestion control

instantiation and implementation in the Internet:
    UDP
    TCP

Next:
    leaving the network "edge" (application, transport layers)
    into the network "core"

Fast Retransmit

Time-out period is often relatively long:
    long delay before resending a lost packet

Detect lost segments via duplicate ACKs:
    Sender often sends many segments back-to-back
    If a segment is lost, there will likely be many duplicate ACKs

If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
    fast retransmit: resend the segment before the timer expires

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {
        /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            /* fast retransmit */
            resend segment with sequence number y
        }
    }
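
The ACK-handling event above can be sketched as a small runnable class. This is only an illustration under stated assumptions: the class name, the `next_seq_num` field, and the `retransmitted` list are hypothetical, the timer is modeled as a boolean flag, and no real sockets or clocks are involved.

```python
from collections import Counter
from typing import Optional


class FastRetransmitSender:
    """Toy model of the sender's ACK event; not a real TCP stack."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base: int, next_seq_num: int) -> None:
        self.send_base = send_base        # oldest unacknowledged byte
        self.next_seq_num = next_seq_num  # next byte to be sent (assumed field)
        self.dup_acks = Counter()         # dup ACK count per ACK value y
        self.timer_running = False
        self.retransmitted = []           # sequence numbers resent so far

    def on_ack(self, y: int) -> Optional[int]:
        """Handle an ACK with field value y.

        Returns the sequence number that was fast-retransmitted, or None.
        """
        if y > self.send_base:
            # New data acknowledged: advance the window base.
            self.send_base = y
            # Restart the timer only if unacknowledged data remains.
            self.timer_running = self.send_base < self.next_seq_num
            return None
        # Otherwise: a duplicate ACK for an already ACKed segment.
        self.dup_acks[y] += 1
        if self.dup_acks[y] == self.DUP_ACK_THRESHOLD:
            self.retransmitted.append(y)  # fast retransmit
            return y
        return None


if __name__ == "__main__":
    sender = FastRetransmitSender(send_base=100, next_seq_num=500)
    for ack in (200, 200, 200, 200):
        print(ack, "->", sender.on_ack(ack))
```

The first ACK for 200 advances SendBase; the next three are duplicates, and the third duplicate triggers the resend. Further duplicates do not trigger another resend because the comparison is an equality test, as in the pseudocode.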

TCP: GBN or Selective Repeat?

                                                                                                                                                      Basic TCP looks a lot like GBN

                                                                                                                                                      Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range

                                                                                                                                                      This looks a lot like Selective Repeat

                                                                                                                                                      TCP is a hybrid

                                                                                                                                                      TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data.
If necessary, sender should slow down transmission rate to accommodate receiver's rate.
Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
speed-matching service: matching the send rate to the receiving app's drain rate
app process may be slow at reading from buffer

TCP segment structure

Header layout (each row is 32 bits wide):

    source port #                      | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F  | Receive window
    checksum                           | Urg data pointer
    Options (variable length)
    application data (variable length)

Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
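
To make the layout concrete, here is a small sketch that decodes the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name `parse_tcp_header` and the dictionary keys are hypothetical names chosen for this example. The flag bit values follow the standard TCP header, with FIN as the lowest bit. Options are skipped rather than interpreted.

```python
import struct

# The six flag bits in the order drawn on the slide (U A P R S F); FIN is the lowest bit.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (network byte order) and split off the data."""
    (src_port, dst_port, seq, ack,
     len_and_flags, rcv_window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    head_len = (len_and_flags >> 12) * 4   # header length is stored in 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if len_and_flags & bit}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "head_len": head_len, "flags": flags,
        "rcv_window": rcv_window, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[head_len:],        # everything after header and options
    }


# Build a SYN segment by hand (header length 5 words, no options, no data).
syn = struct.pack("!HHIIHHHH", 5000, 80, 1000, 0, (5 << 12) | 0x02, 4096, 0, 0)
header = parse_tcp_header(syn)
print(header["head_len"], sorted(header["flags"]), header["rcv_window"])
```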

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
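
The window arithmetic above can be checked with a short sketch. The function names are hypothetical, and the sender-side variables `last_byte_sent` and `last_byte_acked` are assumptions added here to express "unACKed data"; they do not appear on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sendable_bytes(window, last_byte_sent, last_byte_acked):
    """Bytes the sender may still transmit while keeping unACKed data <= RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, window - unacked)


# Receiver: 4096-byte buffer, 3000 bytes received so far, app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
# Sender: 1500 bytes in flight against that advertised window.
print(window, sendable_bytes(window, last_byte_sent=2500, last_byte_acked=1000))
```

With 2000 bytes still sitting in the buffer, the receiver advertises 2096 bytes, and a sender with 1500 bytes outstanding may send 596 more.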

                                                                                                                                                      Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
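
A toy simulation of the one-byte probe, under simplifying assumptions: time advances in ticks, the application drains a fixed number of bytes per tick, and every segment is ACKed immediately with the current window. The names `Receiver` and `send_with_probes` are hypothetical.

```python
class Receiver:
    """Receive buffer that advertises its spare room in every ACK."""

    def __init__(self, rcv_buffer):
        self.rcv_buffer = rcv_buffer
        self.buffered = 0                 # bytes received but not yet read by the app

    def window(self):
        return self.rcv_buffer - self.buffered

    def receive(self, nbytes):
        """Accept what fits; return (bytes accepted, window advertised in the ACK)."""
        accepted = min(nbytes, self.window())
        self.buffered += accepted
        return accepted, self.window()

    def app_read(self, nbytes):
        self.buffered -= min(nbytes, self.buffered)


def send_with_probes(receiver, total, app_reads_per_tick):
    """Send `total` bytes; return how many 1-byte probes were needed.

    Assumes app_reads_per_tick > 0, otherwise the buffer never drains.
    """
    sent, probes = 0, 0
    window = receiver.window()            # sender's last known RcvWindow
    while sent < total:
        if window == 0:
            probes += 1
            size = 1                      # probe: one data byte, just to elicit an ACK
        else:
            size = min(window, total - sent)
        accepted, window = receiver.receive(size)
        sent += accepted
        receiver.app_read(app_reads_per_tick)   # app drains the buffer between ticks
    return probes


print(send_with_probes(Receiver(rcv_buffer=4), total=10, app_reads_per_tick=2))
```

Without the probe branch, the sender would keep its stale `window == 0` forever, even though the application has since freed space: the deadlock described above.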

                                                                                                                                                      Note on UDP

                                                                                                                                                      UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.
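
For contrast with TCP's advertised window, here is a minimal model of a UDP receive buffer that silently drops what does not fit. The class `UdpSocketBuffer` and its byte-based capacity are assumptions made for illustration; real socket buffers are managed by the operating system.

```python
from collections import deque


class UdpSocketBuffer:
    """Toy receive buffer: datagrams that do not fit are dropped without feedback."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.queue = deque()
        self.dropped = 0

    def deliver(self, datagram: bytes) -> bool:
        """Called when a datagram arrives from the network."""
        if self.used + len(datagram) > self.capacity:
            self.dropped += 1             # no flow control: the sender is never told
            return False
        self.queue.append(datagram)
        self.used += len(datagram)
        return True

    def recv(self) -> bytes:
        """Called by the application to read the next datagram."""
        datagram = self.queue.popleft()
        self.used -= len(datagram)
        return datagram


buf = UdpSocketBuffer(capacity_bytes=10)
for payload in (b"hello", b"world", b"!"):
    buf.deliver(payload)
print(len(buf.queue), buf.dropped)
```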

                                                                                                                                                      TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    replies with client_isn+1 in ACK field to signal synchronization
    specifies server_isn
    no application data

                                                                                                                                                      TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
    allocates buffers
    can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

Message sequence:
    client -> server: Connection request (SYN=1, seq=client_isn)
    server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
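
The sequence and acknowledgement numbers of the three handshake segments can be traced with a few lines of Python. The function name `three_way_handshake` and the dictionary field names are illustrative only; the initial sequence numbers in the example are arbitrary.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dicts, in the order they are sent."""
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1.
    synack = {"from": "server", "SYN": 1, "seq": server_isn,
              "ack": syn["seq"] + 1}
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1.
    ack = {"from": "client", "SYN": 0, "seq": client_isn + 1,
           "ack": synack["seq"] + 1}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```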

                                                                                                                                                      TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

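[Figure: timeline of closing a connection. client (close) sends FIN; server replies with ACK, then (close) sends FIN; client replies with ACK and enters timed wait before closed.]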

                                                                                                                                                      TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
    Enters "timed wait", during which it will respond with an ACK to received FINs (which might arrive if its ACK gets lost)
    Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

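[Figure: closing timeline with states. client: closing, timed wait, closed. server: closing, closed. Segments in order: FIN (client), ACK (server), FIN (server), ACK (client).]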
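
The client side of the four closing steps can be written as a small table-driven state machine. This is a sketch under assumptions: the state names (`FIN_WAIT_1`, `FIN_WAIT_2`, `TIME_WAIT`) follow the standard TCP state diagram rather than the slide's labels, and the event strings and the helper `run` are hypothetical.

```python
# (current state, event) -> (next state, segment to send or None)
CLIENT_CLOSE = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "send FIN"),   # Step 1
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),           # server's ACK (Step 2)
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "send ACK"),      # Step 3
    ("TIME_WAIT", "recv FIN"): ("TIME_WAIT", "send ACK"),       # our ACK was lost: re-ACK
    ("TIME_WAIT", "timeout"): ("CLOSED", None),                 # closes after timed wait
}


def run(transitions, state, events):
    """Feed events through the table; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, action = transitions[(state, event)]
        if action is not None:
            sent.append(action)
    return state, sent


# The server's FIN arrives twice because the client's first ACK was lost.
events = ["app_close", "recv ACK", "recv FIN", "recv FIN", "timeout"]
print(run(CLIENT_CLOSE, "ESTABLISHED", events))
```

The repeated `recv FIN` entry in `TIME_WAIT` is the reason for the timed wait: the client must stay around long enough to re-ACK a retransmitted FIN.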

                                                                                                                                                      TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                                                                                                      A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.

                                                                                                                                                      Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for network to handle"
    different from flow control
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(λ'in: offered load, original data plus retransmissions)

(a), (b) & (c): always λin = λout (goodput)
(a) "magic" transmission, only send when there's space in buffer: λ'in = λin
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:
    (b) and (c): more work (retrans) for given "goodput"
    (c): unneeded retransmissions, link carries multiple copies of pkt

[Figure: three panels, (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted.

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
    "elastic service"
    if sender's path is "underloaded": sender should use available bandwidth
    if sender's path is congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact (small exception: see next page)

Case study: ATM ABR congestion control (continued)

two-byte ER (explicit rate) field in RM cell:
    congested switch may lower ER value in cell
    sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by a congested switch
    signals congestion
    if the data cell preceding an RM cell has EFCI=1, the destination sets CI bit=1 before returning the RM cell to the source
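The way the RM-cell fields combine along a path can be sketched as follows. This is a minimal illustration, not the ATM specification: the `RMCell` class, the per-switch dictionaries, and their keys (`supportable_rate`, `mild_congestion`, `severe_congestion`) are hypothetical names, rates are in arbitrary units, and every switch is assumed to cap ER at its own supportable rate.

```python
from dataclasses import dataclass

@dataclass
class RMCell:
    er: float          # explicit rate field (arbitrary rate units)
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit

def forward_rm_cell(cell, switches, last_data_cell_efci=False):
    """Pass an RM cell through each switch, then through the destination."""
    for sw in switches:
        # A switch may lower ER but never raises it, so ER ends up
        # as the minimum supportable rate on the path.
        cell.er = min(cell.er, sw["supportable_rate"])
        if sw.get("mild_congestion"):
            cell.ni = True
        if sw.get("severe_congestion"):
            cell.ci = True
    # The destination sets CI if the data cell preceding the RM cell had EFCI=1.
    if last_data_cell_efci:
        cell.ci = True
    return cell

path = [
    {"supportable_rate": 100.0},
    {"supportable_rate": 40.0, "mild_congestion": True},
    {"supportable_rate": 70.0},
]
cell = forward_rm_cell(RMCell(er=150.0), path, last_data_cell_efci=True)
print(cell)
```

With these assumed values the returned cell carries the rate of the slowest switch (40.0), with both NI and CI set.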

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: the congestion window, labeled CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
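As a quick numeric check of the window-based throughput formula, here is a small sketch. The function name and the sample values (10 segments, 1460-byte MSS, 100 ms RTT) are illustrative assumptions, not figures from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_sec):
    """Bytes per second when w segments of MSS bytes each are sent every RTT."""
    return w_segments * mss_bytes / rtt_sec

# Assumed example values: w = 10 segments, MSS = 1460 bytes, RTT = 0.1 s.
rate = window_throughput(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec")
```

Doubling the window or halving the RTT doubles the rate, which is why TCP controls its sending rate by adjusting CongWin.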

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 MSS and then grows exponentially
    conservative after timeout events
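The sender-side window check and the definition of a loss event can be written as two small predicates. This is a sketch only: the function names and the `nbytes` parameter are hypothetical, and, as on the slide, the receive buffer is assumed never to be the limiting factor.

```python
def can_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """True if sending nbytes more keeps unacknowledged data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= cong_win

def is_loss_event(timeout, dup_ack_count):
    """A loss event is a timeout or three duplicate ACKs."""
    return timeout or dup_ack_count >= 3

# 2000 bytes in flight, 4000-byte window: a 1000-byte segment fits.
print(can_send(3000, 1000, 4000, 1000))
# Two duplicate ACKs and no timeout do not yet count as a loss event.
print(is_loss_event(False, 2))
```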

TCP AIMD

multiplicative decrease: cut CongWin in half after a loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
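The AIMD rule can be traced per RTT with a short simulation. This is a simplified sketch under stated assumptions: the window is counted in whole MSS units, a loss is assumed to occur whenever the window reaches a fixed size (`loss_at_mss`, a hypothetical parameter), and halving uses integer division.

```python
def aimd_trace(start_mss, loss_at_mss, rtts):
    """Per-RTT CongWin in MSS: +1 MSS per RTT, halved after reaching loss_at_mss."""
    win = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(win)
        if win >= loss_at_mss:       # assumed loss model: loss at a fixed window size
            win = max(win // 2, 1)   # multiplicative decrease
        else:
            win += 1                 # additive increase
    return trace

print(aimd_trace(start_mss=4, loss_at_mss=8, rtts=12))
```

The printed values climb by one each RTT and drop back to half after each loss, which is the shape the figure shows for a long-lived connection.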

TCP Slow Start

When connection begins, CongWin = 1 MSS
    Example: MSS = 500 bytes and RTT = 200 msec
    initial rate = 20 kbps
    available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event
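The initial-rate figure in the example follows from sending one MSS per RTT. A minimal sketch (the function name is an assumption):

```python
def initial_rate_bps(mss_bytes, rtt_sec):
    """Sending rate in bits/sec with CongWin = 1 MSS: one segment per RTT."""
    return mss_bytes * 8 / rtt_sec

# The slide's example: MSS = 500 bytes, RTT = 200 msec.
print(initial_rate_bps(500, 0.2) / 1000, "kbps")
```

With 500 bytes (4000 bits) every 0.2 seconds, the result matches the 20 kbps quoted above.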

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B, with RTT marked; Host A sends one segment, then two segments, then four segments]
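Why a per-ACK increment doubles the window each RTT can be seen in a short sketch. It assumes one ACK per segment (no delayed ACKs), no loss, and counts the window in MSS units; the function name is hypothetical.

```python
def slow_start_rounds(rounds, mss=1):
    """CongWin (in MSS) at the start of each RTT when every ACK adds one MSS."""
    cong_win = mss
    sizes = []
    for _ in range(rounds):
        sizes.append(cong_win)
        acks = cong_win // mss       # assumed: one ACK returns per segment sent
        for _ in range(acks):
            cong_win += mss          # increment CongWin for every ACK received
    return sizes

print(slow_start_rounds(5))
```

Each round sends CongWin segments and gets CongWin ACKs back, so the window grows by its own size: one, two, four segments, as in the figure.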

So far:
    Slow-Start ramps up exponentially
    Followed by AIMD sawtooth pattern

Reality (TCP Reno):
    Introduce new variable: threshold
    threshold initially very large
    Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
    Two different types of loss events:
        3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
        Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
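The difference between the two variants comes down to how CongWin is reset after each kind of loss event. The sketch below is an illustration under assumptions: windows are in whole MSS units, halving uses integer division with a floor of 1 MSS, and the function name and string labels are hypothetical.

```python
def react_to_loss(cong_win, event, variant="reno"):
    """Return (new_cong_win, new_threshold) in MSS after a loss event.

    event is "3dupack" or "timeout"; variant is "reno" or "tahoe".
    """
    threshold = max(cong_win // 2, 1)
    if event == "3dupack" and variant == "reno":
        return threshold, threshold   # fast recovery: continue from half the window
    return 1, threshold               # timeout, or any loss event in Tahoe: slow start

for variant in ("tahoe", "reno"):
    for event in ("3dupack", "timeout"):
        print(variant, event, react_to_loss(16, event, variant))
```

Only the Reno/3-dup-ACK combination keeps the window at the new threshold; every other case restarts from 1 MSS.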

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                      The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: enter slow start

Event: duplicate ACK
State: SS or CA
TCP sender action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
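The event table maps directly onto a small state machine. The class below is a toy sketch, not a TCP implementation: `RenoSender` and its method names are hypothetical, window sizes are in bytes, the initial threshold of 64,000 bytes is an assumed stand-in for "very large", and retransmission and timers are left out.

```python
class RenoSender:
    """Toy sender that follows the event table; window sizes are in bytes."""

    def __init__(self, mss=1000, threshold=64_000):
        self.mss = mss
        self.cong_win = mss          # slow start begins at 1 MSS
        self.threshold = threshold   # assumed initial value ("very large")
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Adds about 1 MSS per RTT, spread across the ACKs of one window.
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold   # does not drop below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0

s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win)
```

In this run the sender leaves slow start on the fourth ACK, when CongWin first exceeds the threshold. Calling `on_dup_ack()` three times would then halve the window and keep the sender in congestion avoidance, while `on_timeout()` would send it back to slow start.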

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
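The 0.75 factor is the mean of a window that climbs linearly from W/2 back up to W between losses. A small numeric sketch, with hypothetical function names and an assumed sample count:

```python
def average_window(w, steps=1000):
    """Mean of a window that ramps linearly from w/2 up to w (one sawtooth period)."""
    samples = [w / 2 + (w / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)

def average_throughput(w, rtt_sec):
    """Average rate over one sawtooth period, in window units per second."""
    return average_window(w) / rtt_sec

print(average_window(100))
```

Because the ramp is linear, the mean sits halfway between W/2 and W, that is, at 0.75 W.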

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$\text{throughput} = \dfrac{1.22 \cdot \text{MSS}}{\text{RTT} \sqrt{L}}$

$\Rightarrow L = 2 \cdot 10^{-10}$ (Wow!)

New versions of TCP for high-speed networks are needed
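Both numbers in this example can be reproduced from the formulas on the slide. The sketch below uses hypothetical function names and converts the MSS to bits so that rates are in bits per second.

```python
import math

def required_window_segments(rate_bps, rtt_sec, mss_bytes):
    """In-flight segments needed to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt_sec / (mss_bytes * 8)

def throughput_bps(mss_bytes, rtt_sec, loss_rate):
    """throughput = 1.22 * MSS / (RTT * sqrt(L)), with MSS converted to bits."""
    return 1.22 * mss_bytes * 8 / (rtt_sec * math.sqrt(loss_rate))

def required_loss_rate(rate_bps, rtt_sec, mss_bytes):
    """Solve the throughput formula for L."""
    return (1.22 * mss_bytes * 8 / (rtt_sec * rate_bps)) ** 2

print(round(required_window_segments(10e9, 0.1, 1500)))
print(required_loss_rate(10e9, 0.1, 1500))
```

The loss rate comes out at roughly 2 in 10 billion segments, which is the reason the slide calls for new high-speed TCP variants.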

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
    Additive increase gives slope of 1, as throughput increases
    Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
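The convergence argument can be checked with a simplified simulation of two AIMD flows. This is a sketch under strong assumptions: both flows have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and the function name and starting rates are hypothetical.

```python
def share_bottleneck(x1, x2, capacity, rounds):
    """Two AIMD flows: each adds 1 per round; both halve when their sum exceeds capacity."""
    for _ in range(rounds):
        if x1 + x2 > capacity:       # assumed: both flows detect the loss together
            x1, x2 = x1 / 2, x2 / 2  # multiplicative decrease halves the gap as well
        else:
            x1, x2 = x1 + 1, x2 + 1  # additive increase leaves the gap unchanged
    return x1, x2

x1, x2 = share_bottleneck(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(x1, x2, abs(x1 - x2))
```

Additive increase moves the pair parallel to the equal-share line and each halving cuts the difference between the flows in half, so the gap shrinks toward zero over repeated loss events.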

Fairness (more)

Fairness and UDP
    Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
    Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
    Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
    nothing prevents an app from opening parallel connections between 2 hosts
    Web browsers do this
    Example: link of rate R supporting 9 connections
        new app asks for 1 TCP connection, gets rate R/10
        new app asks for 11 TCP connections, gets more than R/2 (11R/20)
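The parallel-connection example is simple share arithmetic, assuming every TCP connection on the bottleneck ends up with an equal share. The function name below is hypothetical.

```python
from fractions import Fraction

def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app that opens new_connections,
    assuming all connections share the bottleneck equally."""
    return Fraction(new_connections, existing_connections + new_connections)

print(app_share(9, 1))    # one extra connection among ten
print(app_share(9, 11))   # eleven connections among twenty
```

Fairness is per connection, not per application, so an app can increase its share just by opening more connections.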

TCP Latency Modeling

Notation, assumptions:
    Assume one link between client and server of rate R
    S: MSS (bits)
    O: object size (bits)
    no retransmissions (no loss, no corruption)

Window size:
    First assume: fixed congestion window, W segments
    Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
    TCP connection establishment
    data transmission delay
    slow start

Fixed Congestion Window (W)

Two cases:

1. $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent
   $\text{latency} = 2\,RTT + O/R$

2. $WS/R < RTT + S/R$: ACK for first segment in window returns after window's worth of data sent
   $\text{latency} = 2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$

Fixed congestion window (1)

First case: $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent

$\text{latency} = 2\,RTT + O/R$


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
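
The two fixed-window cases can be folded into one small function. This is a minimal sketch, not part of the original slides: the function name and the test values are illustrative, and it assumes K = ceil(O / (W·S)), i.e. the number of W-segment windows needed to cover the object, with O and S in bits, R in bits per second and RTT in seconds.

```python
import math


def static_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window of W segments."""
    # Assumption: K is the number of W-segment windows that cover the object.
    K = math.ceil(O / (W * S))
    base = 2 * RTT + O / R
    # Idle time per window: the ACK for the first segment returns S/R + RTT
    # after the window starts, while the window itself takes W*S/R to send.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        return base                    # case 1: ACK is back before the window is sent
    return base + (K - 1) * stall      # case 2: server idles after each window but the last


# Hypothetical values: 16,000-bit object, 1,000-bit segments, 1 Mbps link, 100 ms RTT.
small_window = static_window_latency(16_000, 1_000, 1_000_000, 0.1, W=4)
large_window = static_window_latency(16_000, 1_000, 1_000_000, 0.1, W=200)
```

With W = 4 the server stalls after each of the first three windows; with W = 200 the window never empties before the first ACK returns, so only the 2RTT + O/R terms remain.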


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

\[
\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\]

where P is the number of times TCP idles at server: P = min{Q, K-1}

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object
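
The closed form can be evaluated directly once K and Q are known. The sketch below is an illustration under stated assumptions, not the course's own code: the function name is made up, K is found as the smallest k with (2^k - 1)·S ≥ O (windows of 1, 2, 4, ... segments), and Q is counted as the number of windows k for which S/R + RTT - 2^(k-1)·S/R is still positive.

```python
def slow_start_latency(O, S, R, RTT):
    """Closed-form latency (seconds) for an O-bit object under idealized slow start.

    O: object size in bits, S: segment size in bits,
    R: link rate in bits/second, RTT: round-trip time in seconds.
    Returns (latency, K, Q, P).
    """
    # K: smallest k whose windows of 1, 2, 4, ..., 2^(k-1) segments cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1

    # Q: number of windows after which the server would idle if the object
    # were infinitely large, i.e. while S/R + RTT - 2^(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# Hypothetical values chosen so that O/S = 15 segments, S/R = 1 s, RTT = 2 s.
result = slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2)
print(result)  # (22.0, 4, 2, 2)
```

The chosen values give K = 4, Q = 2 and P = 2; the absolute times are deliberately unrealistic so the arithmetic is easy to follow by hand.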


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. The client initiates the TCP connection (one RTT) and requests the object; the server sends a first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R), after which transmission is complete and the object is delivered. Axes: time at client, time at server.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times


TCP Latency Modeling (3)

\[
\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}
\]

\[
2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}
\]

\[
\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}
\]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: client/server timeline with first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R. Axes: time at client, time at server.]
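
The derivation can be cross-checked by stepping through the server timeline one window at a time instead of using the closed form. This is a rough sketch under the same idealized assumptions as the slides (no loss, no threshold, window doubling every round); the function name and the numbers are illustrative only.

```python
def simulate_slow_start(O, S, R, RTT):
    """Walk the server timeline window by window.

    Returns (latency in seconds, number of idle periods at the server).
    """
    remaining = O          # bits still to send
    window_segments = 1    # slow start: the first window is one segment
    server_time = 0.0      # busy plus idle time at the server
    idle_periods = 0

    while remaining > 0:
        window_bits = min(window_segments * S, remaining)
        transmit = window_bits / R
        remaining -= window_bits
        if remaining > 0:
            # The next window cannot start before the ACK for the first
            # segment of this window returns, S/R + RTT after it began.
            ack_arrival = S / R + RTT
            if ack_arrival > transmit:
                idle_periods += 1
            server_time += max(transmit, ack_arrival)
            window_segments *= 2
        else:
            server_time += transmit  # last window: nothing left to wait for

    # 2 RTT covers connection establishment plus the request and final delivery.
    return 2 * RTT + server_time, idle_periods


# Same hypothetical setting as the slide example: O/S = 15, with S/R = 1 s and RTT = 2 s.
latency, idles = simulate_slow_start(O=15_000, S=1_000, R=1_000, RTT=2)
print(latency, idles)  # 22.0 2
```

Here the server idles after the first and second windows (2 s and 1 s), which matches P = 2 and the idle-time sum in the second line of the derivation.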


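TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, number of idles for infinite-size object, is similar.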
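
As a quick sanity check on the last step, the ceiling-of-log formula can be compared with a direct search for the smallest k. Both helper names below are hypothetical, and the object sizes are arbitrary test values.

```python
import math


def windows_by_search(O, S):
    """Smallest k such that windows of 1, 2, ..., 2^(k-1) segments cover O bits."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def windows_by_formula(O, S):
    """Closed form: K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


S = 1_000  # segment size in bits (illustrative)
for segments in (1, 2, 3, 4, 7, 8, 15, 16, 100):
    O = segments * S
    assert windows_by_search(O, S) == windows_by_formula(O, S)

print(windows_by_formula(15 * S, S))  # 4
```

The search version uses only integer arithmetic, so it is the safer choice if O/S + 1 could land very close to a power of two after floating-point rounding.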


HTTP Modeling

Assume Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X integer
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
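
The three response-time expressions can be compared numerically. The sketch below deliberately omits the "sum of idle times" term, so it only shows the transmission and RTT components; the function name is hypothetical, and treating 5 Kbytes as 40,000 bits is an assumption made for round numbers.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds for the three HTTP models, ignoring slow-start idle times.

    O: object size in bits, R: link rate in bits/second, RTT: seconds,
    M: number of images, X: number of parallel connections (M/X must be an integer).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# Illustrative inputs: RTT = 100 ms, O = 40,000 bits, M = 10, X = 5, 1 Mbps link.
times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

All three variants share the same (M+1)O/R transmission term, so the difference between them comes entirely from how many round trips each one spends on connection setup and requests.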


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
- multiplexing, demultiplexing
- reliable data transfer
- flow control
- congestion control

instantiation and implementation in the Internet:
- UDP
- TCP

Next:
- leaving the network "edge" (application, transport layers)
- into the network "core"

                                                                                                                                                      • Go-Back-N
                                                                                                                                                      • GBN Sender
                                                                                                                                                      • GBN sender extended FSM
                                                                                                                                                      • GBN receiver extended FSM
                                                                                                                                                      • More on receiver
                                                                                                                                                      • GBN inaction
                                                                                                                                                      • Selective Repeat
                                                                                                                                                      • Selective repeat sender receiver windows
                                                                                                                                                      • Selective repeat
                                                                                                                                                      • Selective repeat in action
                                                                                                                                                      • Selective repeat dilemma
                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                      • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                      • More TCP Details
                                                                                                                                                      • Even More TCP Details
                                                                                                                                                      • TCP segment structure
                                                                                                                                                      • TCP seq rsquos and ACKs
                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                      • Example RTT estimation
                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                      • TCP reliable data transfer
                                                                                                                                                      • TCP sender events
                                                                                                                                                      • TCP sender(simplified)
                                                                                                                                                      • TCP retransmission scenarios
                                                                                                                                                      • TCP retransmission scenarios (more)
                                                                                                                                                      • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                      • More on Sender Policies
                                                                                                                                                      • Fast Retransmit
                                                                                                                                                      • Fast retransmit algorithm
                                                                                                                                                      • TCP GBN or Selective Repeat
                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                      • TCP Flow Control
                                                                                                                                                      • TCP Flow Control
                                                                                                                                                      • TCP segment structure
                                                                                                                                                      • TCP Flow control how it works
                                                                                                                                                      • Technical Issue
                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                      • TCP Connection Management
                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                      • A few special cases
                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                      • Principles of Congestion Control
                                                                                                                                                      • Causescosts of congestion scenario 1
                                                                                                                                                      • Causescosts of congestion scenario 2
                                                                                                                                                      • Causescosts of congestion scenario 3
                                                                                                                                                      • Causescosts of congestion scenario 3
                                                                                                                                                      • Approaches towards congestion control
                                                                                                                                                      • Case study ATM ABR congestion control
                                                                                                                                                      • Case study ATM ABR congestion control
                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                      • TCP Congestion Control
                                                                                                                                                      • TCP AIMD
                                                                                                                                                      • TCP Slow Start
                                                                                                                                                      • TCP Slow Start (more)
                                                                                                                                                      • Summary TCP Congestion Control
                                                                                                                                                      • The Big Picture
                                                                                                                                                      • TCP sender congestion control
                                                                                                                                                      • TCP throughput
                                                                                                                                                      • TCP Futures
                                                                                                                                                      • TCP Fairness
                                                                                                                                                      • Why is TCP fair
                                                                                                                                                      • Fairness (more)
                                                                                                                                                      • TCP Latency Modeling
                                                                                                                                                      • Fixed Congestion Window (W)
                                                                                                                                                      • Fixed congestion window (1)
                                                                                                                                                      • Fixed congestion window (2)
                                                                                                                                                      • TCP Latency Modeling Slow Start (1)
                                                                                                                                                      • TCP Latency Modeling Slow Start (2)
                                                                                                                                                      • TCP Latency Modeling (3)
                                                                                                                                                      • TCP Latency Modeling (4)
                                                                                                                                                      • HTTP Modeling
                                                                                                                                                      • Chapter 3 Summary

Fast retransmit algorithm

event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {
        /* a duplicate ACK for an already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            /* fast retransmit */
            resend segment with sequence number y
        }
    }
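
The pseudocode above can be turned into a small runnable sketch. The Java class below is an illustration only, not part of the original slides: the class name, `nextSeqNum`, `send` and the `retransmitted` list are hypothetical helpers, and the sketch assumes that ACKs below SendBase are stale and can be ignored. Nothing is actually transmitted; only the sender's bookkeeping is modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the slide's ACK handler; only the bookkeeping is modelled, nothing is sent. */
final class FastRetransmitSender {
    private long sendBase;                 // SendBase: oldest unacknowledged byte
    private long nextSeqNum;               // assumption: next byte to send, used to detect outstanding data
    private int dupAckCount = 0;           // duplicate ACKs seen for the current SendBase
    private boolean timerRunning = false;
    final List<Long> retransmitted = new ArrayList<>(); // sequence numbers that were resent

    FastRetransmitSender(long initialSeq) {
        sendBase = initialSeq;
        nextSeqNum = initialSeq;
    }

    /** Pretend to send a segment of the given length and start the timer. */
    void send(int length) {
        nextSeqNum += length;
        timerRunning = true;
    }

    /** Event: ACK received, with ACK field value y. */
    void onAck(long y) {
        if (y > sendBase) {
            sendBase = y;
            dupAckCount = 0;
            // keep the timer running only if some data is still unacknowledged
            timerRunning = sendBase < nextSeqNum;
        } else if (y == sendBase) {
            // a duplicate ACK for an already ACKed segment
            dupAckCount++;
            if (dupAckCount == 3) {
                retransmitted.add(y);      // fast retransmit of the segment starting at y
            }
        }
        // y < sendBase: stale ACK, ignored in this sketch (assumption)
    }

    long sendBase() { return sendBase; }
    boolean timerRunning() { return timerRunning; }

    public static void main(String[] args) {
        FastRetransmitSender sender = new FastRetransmitSender(0);
        for (int i = 0; i < 3; i++) sender.send(100);   // bytes 0..299 outstanding
        sender.onAck(100);                              // new ACK: SendBase moves to 100
        for (int i = 0; i < 3; i++) sender.onAck(100);  // three duplicates trigger the resend
        System.out.println("resent: " + sender.retransmitted);
        sender.onAck(300);                              // everything acknowledged
        System.out.println("SendBase=" + sender.sendBase() + " timerRunning=" + sender.timerRunning());
    }
}
```

The resend fires exactly once, on the third duplicate, so a fourth or fifth duplicate ACK for the same y does not cause further retransmissions in this sketch.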

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN.

Many TCP implementations will buffer received out-of-order segments and then ACK them all after filling in the range.

This looks a lot like Selective Repeat.

TCP is a hybrid.

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data.
If necessary, sender should slow down transmission rate to accommodate receiver's rate.
Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
speed-matching service: matching the send rate to the receiving app's drain rate
app process may be slow at reading from buffer

TCP segment structure

[Figure: TCP segment layout, each row 32 bits wide]
    source port #, dest port #
    sequence number
    acknowledgement number
    head len, not used, flag bits U A P R S F, Receive window
    checksum, Urg data pointer
    Options (variable length)
    application data (variable length)

Notes on the fields:
    sequence number, acknowledgement number: counting by bytes of data (not segments)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)
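
To make the field layout concrete, the following Java sketch reads the fixed 20-byte part of a TCP header from a byte array. It is an illustration only, not part of the original slides: the class and accessor names are hypothetical, options and application data are not parsed, and the flag masks assume the usual bit order with FIN as the lowest bit and URG as the highest of the six.

```java
import java.nio.ByteBuffer;

/** Sketch: decode the fixed 20-byte TCP header (options and payload are not parsed). */
final class TcpHeaderSketch {
    final int sourcePort, destPort;   // 16 bits each
    final long seq, ack;              // 32 bits each, counted in bytes of data
    final int headerLenBytes;         // head len field is in 32-bit words, converted to bytes here
    final int flags;                  // low six bits: URG ACK PSH RST SYN FIN
    final int receiveWindow;          // bytes the receiver is willing to accept
    final int checksum, urgentPointer;

    private TcpHeaderSketch(ByteBuffer b) {
        sourcePort = b.getShort() & 0xFFFF;
        destPort = b.getShort() & 0xFFFF;
        seq = b.getInt() & 0xFFFFFFFFL;
        ack = b.getInt() & 0xFFFFFFFFL;
        int lenAndFlags = b.getShort() & 0xFFFF;
        headerLenBytes = (lenAndFlags >>> 12) * 4;
        flags = lenAndFlags & 0x3F;
        receiveWindow = b.getShort() & 0xFFFF;
        checksum = b.getShort() & 0xFFFF;
        urgentPointer = b.getShort() & 0xFFFF;
    }

    static TcpHeaderSketch parse(byte[] segment) {
        if (segment.length < 20) {
            throw new IllegalArgumentException("a TCP header is at least 20 bytes");
        }
        // ByteBuffer is big-endian by default, which matches network byte order
        return new TcpHeaderSketch(ByteBuffer.wrap(segment));
    }

    boolean fin() { return (flags & 0x01) != 0; }
    boolean syn() { return (flags & 0x02) != 0; }
    boolean rst() { return (flags & 0x04) != 0; }
    boolean psh() { return (flags & 0x08) != 0; }
    boolean ackFlag() { return (flags & 0x10) != 0; }
    boolean urg() { return (flags & 0x20) != 0; }

    public static void main(String[] args) {
        // Hand-built SYN segment: ports 12345 -> 80, seq 1000, head len 5 words, window 65535
        byte[] raw = ByteBuffer.allocate(20)
                .putShort((short) 12345).putShort((short) 80)
                .putInt(1000).putInt(0)
                .putShort((short) 0x5002).putShort((short) 65535)
                .putShort((short) 0).putShort((short) 0)
                .array();
        TcpHeaderSketch h = parse(raw);
        System.out.println(h.sourcePort + " -> " + h.destPort + " seq=" + h.seq
                + " SYN=" + h.syn() + " window=" + h.receiveWindow);
    }
}
```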

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
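
The window arithmetic above can be checked with a small sketch. The Java class below is an illustration only, not part of the original slides: it tracks RcvBuffer, LastByteRcvd and LastByteRead on the receive side, and the method names plus the sender-side check `senderMaySend` (which assumes the sender tracks LastByteSent and LastByteAcked) are hypothetical.

```java
/** Sketch of receive-window bookkeeping; in-order delivery is assumed, as on the slide. */
final class ReceiveWindow {
    private final long rcvBuffer;      // RcvBuffer: total buffer size in bytes
    private long lastByteRcvd = 0;     // LastByteRcvd
    private long lastByteRead = 0;     // LastByteRead

    ReceiveWindow(long rcvBuffer) {
        this.rcvBuffer = rcvBuffer;
    }

    /** Spare room advertised to the sender: RcvBuffer - (LastByteRcvd - LastByteRead). */
    long rcvWindow() {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Data arrives from the network; returns how many bytes fit in the buffer. */
    long receive(long bytes) {
        long accepted = Math.min(bytes, rcvWindow());
        lastByteRcvd += accepted;
        return accepted;
    }

    /** The application reads from the buffer; returns how many bytes were available. */
    long read(long bytes) {
        long consumed = Math.min(bytes, lastByteRcvd - lastByteRead);
        lastByteRead += consumed;
        return consumed;
    }

    /** Sender-side rule: unACKed data plus the new bytes must not exceed RcvWindow. */
    static boolean senderMaySend(long lastByteSent, long lastByteAcked, long bytes, long rcvWindow) {
        return (lastByteSent - lastByteAcked) + bytes <= rcvWindow;
    }

    public static void main(String[] args) {
        ReceiveWindow w = new ReceiveWindow(4096);
        w.receive(3000);
        System.out.println("after receiving 3000 bytes: RcvWindow=" + w.rcvWindow());
        w.read(1000);
        System.out.println("after app reads 1000 bytes: RcvWindow=" + w.rcvWindow());
    }
}
```

As long as the sender respects `senderMaySend`, `receive` never has to drop bytes, which is the no-overflow guarantee stated above.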

Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed all packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
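
The probing rule can be illustrated with a toy simulation. The Java sketch below is an illustration only, not part of the original slides: the class, the method `probesUntilWindowOpens` and the `appReads` schedule are hypothetical, and time is simplified to one probe per round with the application reading between probes.

```java
/** Toy model of zero-window probing: one probe per round, the app reads between rounds. */
final class ZeroWindowProbeDemo {
    /**
     * The receive buffer starts full (RcvWindow = 0). appReads[i] is the number of bytes
     * the application reads after probe i+1 has been answered.
     * Returns how many one-byte probes are sent until an ACK advertises RcvWindow > 0,
     * or -1 if the window never opens within the schedule.
     */
    static int probesUntilWindowOpens(int rcvBuffer, int[] appReads) {
        int buffered = rcvBuffer;                    // full buffer, so RcvWindow = 0
        for (int i = 0; i <= appReads.length; i++) {
            int probes = i + 1;                      // sender transmits a one-byte probe
            int window = rcvBuffer - buffered;
            if (window > 0) {                        // the probe byte fits and is buffered
                buffered += 1;
                window -= 1;
            }
            if (window > 0) {                        // the ACK for the probe advertises spare room
                return probes;
            }
            if (i < appReads.length) {               // the application drains some data
                buffered = Math.max(0, buffered - appReads[i]);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The app is idle for two rounds, then reads 2 of the 4 buffered bytes.
        int probes = probesUntilWindowOpens(4, new int[] {0, 0, 2});
        System.out.println("probes sent before the window reopened: " + probes);
    }
}
```

Without the probes, no segment would flow in either direction after the window closed, so the sender would never learn that the application had freed buffer space.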

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:
Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq. #
    No application data
Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data


TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with an ACK segment carrying SYN=0 and ack=server_isn+1
    allocates buffers
    can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

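[Figure: three-way handshake between client and server]
    client -> server: connection request (SYN=1, seq=client_isn)
    server -> client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)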
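The numbering in the three handshake steps can be traced with a small sketch. This is only a toy model under stated assumptions: the class and method names (HandshakeSketch, Segment, handshake) are made up for illustration, sequence numbers are kept as plain long values with no 32-bit wraparound, and no real sockets are involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy trace of the three-way handshake: flags and numbers only, no real networking. */
final class HandshakeSketch {

    /** A control segment reduced to the fields named on the slides (hypothetical type). */
    static final class Segment {
        final String from;
        final boolean syn;
        final long seq;
        final long ack;   // -1 means "this segment acknowledges nothing"

        Segment(String from, boolean syn, long seq, long ack) {
            this.from = from;
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return from + ": SYN=" + (syn ? 1 : 0) + " seq=" + seq
                    + (ack >= 0 ? " ack=" + ack : "");
        }
    }

    /** Returns the three segments exchanged for the given initial sequence numbers. */
    static List<Segment> handshake(long clientIsn, long serverIsn) {
        List<Segment> trace = new ArrayList<>();
        // Step 1: SYN carries client_isn and has nothing to acknowledge yet.
        trace.add(new Segment("client", true, clientIsn, -1));
        // Step 2: SYNACK carries server_isn and acknowledges client_isn + 1.
        trace.add(new Segment("server", true, serverIsn, clientIsn + 1));
        // Step 3: final ACK has SYN=0 and acknowledges server_isn + 1.
        trace.add(new Segment("client", false, clientIsn + 1, serverIsn + 1));
        return trace;
    }

    public static void main(String[] args) {
        // The initial sequence numbers 100 and 300 are arbitrary illustrative values.
        for (Segment s : handshake(100, 300)) {
            System.out.println(s);
        }
    }
}
```

In a real program none of this is written by hand: the operating system performs the exchange when the client constructs its Socket and the server's accept() returns.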


                                                                                                                                                        TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

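[Figure: closing a connection. Client sends FIN; server replies with ACK, then sends its own FIN; client replies with ACK and enters timed wait before the connection is closed.]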


                                                                                                                                                        TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
    enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
    closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

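[Figure: closing a connection (cont). Same FIN/ACK exchange between client and server; both sides are marked closing, the client then passes through timed wait, and both end in closed.]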
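The client's side of the four closing steps can be read as a small state machine. The sketch below is one way to write it down, under assumptions: the state names FIN_WAIT_1, FIN_WAIT_2 and TIME_WAIT are the conventional TCP names rather than labels from the slide text, the event names are invented for illustration, and the simultaneous-close and combined ACK+FIN cases are left out.

```java
/** Client-side view of the four-step close as a tiny state machine (illustrative only). */
final class CloseSketch {

    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }

    /** Hypothetical event names for the things the client can observe. */
    enum Event { APP_CLOSE, RECV_ACK, RECV_FIN, TIMED_WAIT_EXPIRES }

    /** Returns the next state; events that do not apply leave the state unchanged. */
    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED:
                // Step 1: the application closes the socket, so the client sends FIN.
                return e == Event.APP_CLOSE ? State.FIN_WAIT_1 : s;
            case FIN_WAIT_1:
                // Step 2: the server's ACK of our FIN arrives.
                return e == Event.RECV_ACK ? State.FIN_WAIT_2 : s;
            case FIN_WAIT_2:
                // Step 3: the server's FIN arrives; the client ACKs it and starts timed wait.
                return e == Event.RECV_FIN ? State.TIME_WAIT : s;
            case TIME_WAIT:
                // A repeated FIN is ACKed again but does not end timed wait.
                return e == Event.TIMED_WAIT_EXPIRES ? State.CLOSED : s;
            default:
                return s;
        }
    }

    /** Feeds a sequence of events to a connection that starts out established. */
    static State run(Event... events) {
        State s = State.ESTABLISHED;
        for (Event e : events) {
            s = next(s, e);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(run(Event.APP_CLOSE, Event.RECV_ACK,
                Event.RECV_FIN, Event.TIMED_WAIT_EXPIRES));
    }
}
```

The TIME_WAIT case shows why timed wait exists: if the client's last ACK is lost, the server resends its FIN, and the client must still be around to acknowledge it.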


                                                                                                                                                        TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                                                        A few special cases

                                                                                                                                                        Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                                                        It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                                                                                                        Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control


                                                                                                                                                        Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for network to handle"
    different from flow control!
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queueing in router buffers)
    a top-10 problem!


Causes/costs of congestion: scenario 1
    two senders, two receivers
    one router, infinite buffers
    no retransmission
    send rate: 0 to C/2
    large delays when congested
    maximum achievable throughput


Causes/costs of congestion: scenario 2
    one router, finite buffers
    sender retransmission of lost packet


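(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
    (a) "magic" transmission: only send when there's space in buffer
    (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
    (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load: original data plus retransmissions.

"Costs" of congestion:
    (b) and (c): more work (retransmissions) for given "goodput"
    (c): unneeded retransmissions, so the link carries multiple copies of a packet

[Figure: panels (a), (b), (c)]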


Causes/costs of congestion: scenario 3
    four senders
    multihop paths
    timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
    when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                                                        Approaches towards congestion control

                                                                                                                                                        Two broad approaches towards congestion control

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
    "elastic service"
    if sender's path "underloaded":
        sender should use available bandwidth
    if sender's path congested:
        sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
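The way an RM cell collects feedback along a path can be sketched as follows. This is a simplified model under assumptions, not the ATM specification: the class and method names are hypothetical, the ER field is held as a double rather than a two-byte encoding, and the sender's reaction to NI and CI is not modelled because the slides do not spell it out.

```java
/** Simplified model of an RM cell picking up congestion feedback along a path. */
final class AbrSketch {

    /** The RM cell fields named on the slides (hypothetical in-memory form). */
    static final class RmCell {
        double explicitRate;   // ER field
        boolean ni;            // NI bit: no increase (mild congestion)
        boolean ci;            // CI bit: congestion indication

        RmCell(double initialExplicitRate) {
            this.explicitRate = initialExplicitRate;
        }
    }

    /** A switch may lower ER to what it can support and may set NI or CI. */
    static void passThroughSwitch(RmCell cell, double supportableRate,
                                  boolean mildCongestion, boolean severeCongestion) {
        // ER only ever goes down, so it ends up as the minimum over the path.
        cell.explicitRate = Math.min(cell.explicitRate, supportableRate);
        if (mildCongestion) {
            cell.ni = true;
        }
        if (severeCongestion) {
            cell.ci = true;
        }
    }

    /** The destination sets CI if the data cell preceding the RM cell had EFCI=1. */
    static void turnAroundAtDestination(RmCell cell, boolean precedingDataCellEfci) {
        if (precedingDataCellEfci) {
            cell.ci = true;
        }
    }

    public static void main(String[] args) {
        RmCell cell = new RmCell(100.0);                  // illustrative rate units
        passThroughSwitch(cell, 80.0, false, false);
        passThroughSwitch(cell, 50.0, true, false);       // the bottleneck switch
        passThroughSwitch(cell, 70.0, false, false);
        turnAroundAtDestination(cell, true);
        System.out.println("ER=" + cell.explicitRate + " NI=" + cell.ni + " CI=" + cell.ci);
    }
}
```

Because each switch can only lower ER, the value that comes back to the sender is the minimum supportable rate on the path, which is the point made above.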


                                                                                                                                                        Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP Congestion Control
    end-end control (no network assistance)
    transmission rate limited by congestion window size, CongWin, over segments
    CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window, CongWin]

w segments, each with MSS bytes, sent in one RTT:

    $\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
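The throughput relation is easy to check numerically. A minimal sketch, assuming a hypothetical helper throughputBytesPerSec and illustrative input values that are not taken from the slides:

```java
/** Evaluates throughput = w * MSS / RTT for a window of w segments. */
final class ThroughputSketch {

    /** Returns bytes per second for w segments of mssBytes each sent per RTT. */
    static double throughputBytesPerSec(int windowSegments, int mssBytes, double rttSeconds) {
        if (rttSeconds <= 0) {
            throw new IllegalArgumentException("RTT must be positive");
        }
        return windowSegments * (double) mssBytes / rttSeconds;
    }

    public static void main(String[] args) {
        // Illustrative values: w = 10 segments, MSS = 1000 bytes, RTT = 0.25 s.
        System.out.println(throughputBytesPerSec(10, 1000, 0.25) + " bytes/sec");
    }
}
```

Doubling the window doubles the rate and doubling the RTT halves it, which is why the window is the quantity TCP adjusts.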


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

    LastByteSent - LastByteAcked <= CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 and then grows exponentially
    conservative after timeout events
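The window constraint and the duplicate-ACK form of a loss event can be sketched like this. The names maySend and DupAckCounter are hypothetical, timeouts are not modelled, and the counter assumes cumulative ACK numbers arrive one at a time.

```java
/** Sender-side checks: the CongWin limit and detection of three duplicate ACKs. */
final class SenderWindowSketch {

    /** True if sending 'bytes' more keeps LastByteSent - LastByteAcked <= CongWin. */
    static boolean maySend(long lastByteSent, long lastByteAcked, long congWin, long bytes) {
        return (lastByteSent + bytes) - lastByteAcked <= congWin;
    }

    /** Counts repeats of the same ACK number and flags the third duplicate. */
    static final class DupAckCounter {
        private long lastAck = -1;
        private int duplicates = 0;

        /** Returns true exactly when this ACK is the third duplicate in a row. */
        boolean onAck(long ackNumber) {
            if (ackNumber == lastAck) {
                duplicates++;
                return duplicates == 3;
            }
            // A new ACK number resets the count.
            lastAck = ackNumber;
            duplicates = 0;
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(maySend(1000, 500, 1000, 500));   // exactly fills the window
        DupAckCounter counter = new DupAckCounter();
        for (int i = 0; i < 4; i++) {
            System.out.println(counter.onAck(100));          // original ACK, then 3 duplicates
        }
    }
}
```

Note that three duplicate ACKs means four ACKs carrying the same number: the original plus three repeats.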


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
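A few RTTs of AIMD can be simulated to see the sawtooth emerge. This sketch rests on assumptions that are not in the slides: a loss event is assumed to occur whenever the window reaches a made-up capacity, the window never drops below one MSS, and the names AimdSketch, nextCongWin and trace are illustrative.

```java
/** One-RTT-at-a-time simulation of additive increase, multiplicative decrease. */
final class AimdSketch {

    /** Returns the window (in bytes) for the next RTT. */
    static int nextCongWin(int congWin, int mss, boolean lossEvent) {
        if (lossEvent) {
            // Multiplicative decrease; the floor of one MSS is an assumption.
            return Math.max(mss, congWin / 2);
        }
        // Additive increase: one MSS per RTT without loss.
        return congWin + mss;
    }

    /** Records the window at the start of each RTT under a toy loss model. */
    static int[] trace(int startWin, int mss, int capacity, int rtts) {
        int[] windows = new int[rtts];
        int congWin = startWin;
        for (int i = 0; i < rtts; i++) {
            windows[i] = congWin;
            boolean loss = congWin >= capacity;   // hypothetical loss model
            congWin = nextCongWin(congWin, mss, loss);
        }
        return windows;
    }

    public static void main(String[] args) {
        // Illustrative numbers: 1000-byte MSS, loss once the window hits 12000 bytes.
        for (int w : trace(8000, 1000, 12000, 12)) {
            System.out.print(w + " ");
        }
        System.out.println();
    }
}
```

The window climbs linearly, is cut in half at each loss, and climbs again, which gives the sawtooth shape associated with a long-lived connection.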


                                                                                                                                                        TCP Slow Start

When connection begins, CongWin = 1 MSS
    Example: MSS = 500 bytes & RTT = 200 msec
    initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
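The 20 kbps figure follows from one MSS per RTT: 500 bytes is 4000 bits, and 4000 bits every 0.2 s is 20000 bits/sec. The sketch below reproduces that arithmetic and, as an added illustration, compares how many RTTs doubling and additive increase need to reach a given window. The method names and the 64-segment target are assumptions chosen for the example.

```java
/** Arithmetic behind the slow-start example and a ramp-up comparison. */
final class SlowStartRateSketch {

    /** Rate in bits/sec when one MSS is sent per RTT. */
    static long initialRateBitsPerSec(int mssBytes, int rttMillis) {
        return 8L * mssBytes * 1000L / rttMillis;
    }

    /** RTTs needed to grow from 1 segment to at least targetSegments by doubling. */
    static int rttsWithDoubling(int targetSegments) {
        int rtts = 0;
        for (long w = 1; w < targetSegments; w *= 2) {
            rtts++;
        }
        return rtts;
    }

    /** RTTs needed to grow from 1 segment to targetSegments adding one per RTT. */
    static int rttsWithAdditiveIncrease(int targetSegments) {
        return Math.max(0, targetSegments - 1);
    }

    public static void main(String[] args) {
        System.out.println(initialRateBitsPerSec(500, 200) + " bits/sec");
        System.out.println(rttsWithDoubling(64) + " RTTs by doubling");
        System.out.println(rttsWithAdditiveIncrease(64) + " RTTs by additive increase");
    }
}
```

With a 200 msec RTT, the difference between a handful of RTTs and several dozen is the difference between about a second and more than ten seconds of ramp-up.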


                                                                                                                                                        TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B: one segment in the first RTT, then two segments, then four segments]
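Why a per-ACK increment doubles the window each RTT can be seen by counting ACKs. In this sketch the window is measured in segments, every segment is assumed to be ACKed individually within the same RTT, and no loss or threshold is modelled; the names are illustrative.

```java
/** Shows that adding one segment per ACK doubles CongWin every RTT. */
final class SlowStartSketch {

    /** Returns CongWin (in segments) at the start of each of the first 'rtts' rounds. */
    static int[] windowPerRtt(int rtts) {
        int[] windows = new int[rtts];
        int congWin = 1;                      // connection begins with 1 MSS
        for (int r = 0; r < rtts; r++) {
            windows[r] = congWin;
            int acksThisRound = congWin;      // assumption: one ACK per segment sent
            for (int a = 0; a < acksThisRound; a++) {
                congWin += 1;                 // increment CongWin for every ACK received
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        for (int w : windowPerRtt(5)) {
            System.out.print(w + " ");
        }
        System.out.println();
    }
}
```

A window of w segments produces w ACKs, each adding one segment, so the next round starts at 2w: one, two, four segments, as in the Host A and Host B timeline.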


So far
    Slow-Start ramps up exponentially
    followed by AIMD sawtooth pattern

Reality (TCP Reno)
    introduce new variable threshold
    threshold initially very large
    Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
    two different types of loss events:
        3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
        Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss event the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start after a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, the sender is in the slow-start phase; the window grows exponentially.

When CongWin is above Threshold, the sender is in the congestion-avoidance phase; the window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
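The two loss reactions, and the Tahoe/Reno difference, can be written as one small function. This is a sketch rather than a reference implementation: it works in units of one MSS, the name `react_to_loss` is hypothetical, and the floor of 1 MSS on Threshold is an assumption added so the window never falls below one segment.

```python
MSS = 1  # work in units of one maximum segment size


def react_to_loss(cong_win, event, variant="reno"):
    """Return (threshold, cong_win) after a loss event.

    event   -- "triple_dup_ack" or "timeout"
    variant -- "reno" or "tahoe" (Tahoe treats both events like a timeout)
    """
    if event not in ("triple_dup_ack", "timeout"):
        raise ValueError(f"unknown loss event: {event}")
    # Threshold = CongWin / 2; the 1-MSS floor is an assumption of this sketch.
    threshold = max(cong_win / 2, MSS)
    if event == "triple_dup_ack" and variant == "reno":
        return threshold, threshold  # Reno: skip slow start, continue in AIMD
    return threshold, MSS            # timeout (or Tahoe): back to slow start


print(react_to_loss(16, "triple_dup_ack", "reno"))   # (8.0, 8.0)
print(react_to_loss(16, "triple_dup_ack", "tahoe"))  # (8.0, 1)
print(react_to_loss(16, "timeout", "reno"))          # (8.0, 1)
```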


                                                                                                                                                        The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
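The event table maps naturally onto a small state machine. The class below is a toy sketch, not a real TCP stack: `RenoSender` and its method names are hypothetical, windows are kept in bytes, the initial threshold of 64 KB merely stands in for "very large", and resetting the duplicate-ACK counter after fast recovery is an assumption of this sketch.

```python
class RenoSender:
    """Toy TCP Reno sender following the event table; windows are in bytes."""

    def __init__(self, mss=1460, threshold=64 * 1024):
        self.mss = mss
        self.cong_win = mss         # start with one segment
        self.threshold = threshold  # assumed stand-in for "initially very large"
        self.state = "SS"           # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Adds up to roughly one MSS per RTT.
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0               # assumption: restart the count

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)  # CA 5000
```

Feeding three calls to `on_dup_ack()` after this halves the window to 2500 bytes and keeps the sender in congestion avoidance, while `on_timeout()` drops it to one MSS and returns to slow start.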


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after loss, the window drops to W/2 and throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT.
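The 0.75 factor follows because the window climbs linearly from W/2 to W between losses, so its mean is 3W/4. A quick numerical check, assuming an even W measured in segments and one window sent per RTT (the function name is hypothetical):

```python
def average_sawtooth_throughput(w, rtt):
    """Mean throughput over one AIMD cycle, in segments per second.

    The window starts at w/2 just after a loss and grows by one segment
    per RTT until it reaches w, where the next loss occurs.
    Assumes w is an even number of segments.
    """
    windows = range(w // 2, w + 1)          # one window value per RTT
    mean_window = sum(windows) / len(windows)
    return mean_window / rtt


w, rtt = 100, 0.1
print(average_sawtooth_throughput(w, rtt))  # close to 0.75 * w / rtt
print(0.75 * w / rtt)
```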


TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed networks are needed.
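Both numbers in this example can be reproduced directly. The sketch below assumes W = throughput × RTT / MSS and inverts the relation throughput = 1.22·MSS / (RTT·√L) to solve for L; the function names are hypothetical.

```python
def required_window(target_bps, rtt, mss_bits):
    """In-flight segments needed to sustain target_bps (W = rate * RTT / MSS)."""
    return target_bps * rtt / mss_bits


def required_loss_rate(target_bps, rtt, mss_bits):
    """Loss rate L that satisfies target_bps = 1.22 * MSS / (RTT * sqrt(L))."""
    return (1.22 * mss_bits / (rtt * target_bps)) ** 2


mss_bits = 1500 * 8   # 1500-byte segments
rtt = 0.100           # 100 ms
target_bps = 10e9     # 10 Gbps

print(f"W = {required_window(target_bps, rtt, mss_bits):.0f} segments")
print(f"L = {required_loss_rate(target_bps, rtt, mss_bits):.2e}")
```

The loss rate comes out near 2·10^-10, that is, roughly one lost segment in every five billion.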


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]


Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the equal-bandwidth-share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
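The convergence toward the equal-share line can be checked numerically. The sketch below is a simplification: it assumes both sessions increase by one unit per step and that both see a loss in the same step whenever their combined throughput exceeds R. The name `aimd_two_flows` is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Simulate two synchronized AIMD flows sharing a link of given capacity.

    Additive increase keeps the difference x1 - x2 unchanged (slope 1);
    multiplicative decrease halves both rates and therefore halves the gap.
    """
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2  # loss: both cut their rate by a factor of 2
        else:
            x1, x2 = x1 + 1, x2 + 1  # congestion avoidance: additive increase
    return x1, x2


x1, x2 = aimd_two_flows(10, 80, capacity=100, steps=1000)
print(abs(x1 - x2))  # the initial gap of 70 has shrunk to nearly zero
```

Each loss halves the gap between the two flows while the increase phase leaves it untouched, so the operating point drifts toward equal shares regardless of the starting rates.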


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections:
new app asks for 1 TCP connection, gets rate R/10
new app asks for 11 TCP connections, gets more than R/2 (11R/20)
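The arithmetic behind the parallel-connection example is just a ratio of connection counts. A minimal sketch, assuming every TCP connection on the link gets an equal share (the function name is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    new_connections on a link already carrying existing_connections,
    assuming all connections share the bottleneck equally."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```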


TCP Latency Modeling

Notation, assumptions
Assume one link between client and server, of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent.
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent.
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
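The two cases can be folded into one function. This is a sketch under stated assumptions: K is taken to be the number of windows that cover the object, computed here as O/(WS) rounded up, and the function and parameter names are hypothetical.

```python
import math


def fixed_window_latency(obj_bits, seg_bits, rate_bps, rtt, window):
    """Latency for a fixed congestion window of `window` segments."""
    s_over_r = seg_bits / rate_bps
    base = 2 * rtt + obj_bits / rate_bps           # 2RTT + O/R
    stall = s_over_r + rtt - window * s_over_r     # S/R + RTT - WS/R
    if stall <= 0:
        return base                                # case 1: server never waits
    # Assumed definition: K = number of windows covering the object.
    k = math.ceil(obj_bits / (window * seg_bits))
    return base + (k - 1) * stall                  # case 2: one stall per window but the last


# Hypothetical numbers: 100 kbit object, 10 kbit segments, 1 Mbps link, 100 ms RTT.
print(fixed_window_latency(100_000, 10_000, 1e6, 0.1, window=4))
print(fixed_window_latency(100_000, 10_000, 1e6, 0.1, window=20))
```

With W = 4 the server stalls after each of the first two windows; with W = 20 the ACKs return before the window is exhausted, so only 2RTT + O/R remains.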


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object
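The formula can be evaluated mechanically once K and Q are known. The sketch below computes both by direct counting rather than by closed form, under two assumptions: slow-start windows are 1, 2, 4, ... segments, and the server idles after window k whenever S/R + RTT exceeds that window's transmission time 2^(k-1)·S/R. The function name is hypothetical.

```python
import math


def slow_start_latency(obj_bits, seg_bits, rate_bps, rtt):
    """Return (latency, K, Q, P) for the slow-start latency model."""
    s_over_r = seg_bits / rate_bps
    segments = math.ceil(obj_bits / seg_bits)

    # K: smallest number of windows (1 + 2 + ... + 2^(K-1) segments) covering the object.
    k = 0
    while 2 ** k - 1 < segments:
        k += 1

    # Q: how many windows would be followed by idle time for an infinite object.
    q = 0
    while s_over_r + rtt - (2 ** q) * s_over_r > 0:
        q += 1

    p = min(q, k - 1)
    latency = (2 * rtt + obj_bits / rate_bps
               + p * (rtt + s_over_r) - (2 ** p - 1) * s_over_r)
    return latency, k, q, p


# Hypothetical numbers: 1 Mbit object, 10 kbit segments, 1 Mbps link, 100 ms RTT.
latency, k, q, p = slow_start_latency(1_000_000, 10_000, 1e6, 0.1)
print(f"K={k} Q={q} P={p} latency={latency:.3f} s")
```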


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. The client initiates the TCP connection (one RTT) and requests the object; the server sends the first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and the object is delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit the object
• time the server idles due to slow start

Server idles P = min{K-1, Q} times.
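To make the delay components concrete, the example can be worked through with numbers. The slide gives no RTT, so the value RTT = 2·S/R below is a hypothetical choice, picked only because it is consistent with Q = 2 (the server idles after the first and second windows but not after the third).

```latex
% Assumption for illustration only: RTT = 2 S/R, with O/S = 15, K = 4, Q = 2, P = 2.
\begin{align*}
\text{idle after window 1} &= \frac{S}{R} + RTT - \frac{S}{R} = 2\,\frac{S}{R} \\
\text{idle after window 2} &= \frac{S}{R} + RTT - 2\,\frac{S}{R} = \frac{S}{R} \\
\text{idle after window 3} &= \frac{S}{R} + RTT - 4\,\frac{S}{R} < 0
  \quad\text{(no idle time)} \\[4pt]
\text{Latency} &= \underbrace{2\,RTT}_{4\,S/R}
  + \underbrace{\frac{O}{R}}_{15\,S/R}
  + \underbrace{2\,\frac{S}{R} + \frac{S}{R}}_{\text{idle time}}
  = 22\,\frac{S}{R}
\end{align*}
```

The idle-time total of 3·S/R matches the closed-form term P[RTT + S/R] − (2^P − 1)·S/R = 2·3·S/R − 3·S/R.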


                                                                                                                                                        TCP Latency Modeling (3)

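$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$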

[Figure: slow-start timing diagram with "time at client" and "time at server" axes. Labelled events: initiate TCP connection, RTT, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]
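As a rough illustration of the window timings labelled in the diagram, the following Python sketch computes the time to transmit the kth slow-start window, 2^(k-1) S/R, and the server idle time that follows it. The idle-time expression [S/R + RTT - 2^(k-1) S/R]^+ is the form usually used with this latency model and is assumed here; the function names and the numeric values of S, R and RTT are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def window_transmit_time(k, S, R):
    """Seconds needed to transmit the kth slow-start window (k = 1, 2, ...).

    The kth window holds 2^(k-1) segments of S bits each, sent at R bits/sec.
    """
    return (2 ** (k - 1)) * S / R


def idle_time_after_window(k, S, R, RTT):
    """Assumed idle time after the kth window: [S/R + RTT - 2^(k-1) S/R]^+."""
    return max(0.0, S / R + RTT - window_transmit_time(k, S, R))


if __name__ == "__main__":
    # Hypothetical values: 1000-bit segments, 10 kbps link, 250 ms RTT.
    S, R, RTT = 1000, 10_000, 0.25
    for k in range(1, 5):
        print(k, window_transmit_time(k, S, R), idle_time_after_window(k, S, R, RTT))
```

With these assumed values the window transmit time doubles at each step, and the server no longer idles from the third window onward, because transmitting that window already takes longer than S/R + RTT.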


TCP Latency Modeling (4)
Recall K = number of windows that cover the object.

How do we calculate K?

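$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$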

Calculation of Q, the number of idle periods for an infinite-size object, is similar.
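The closed form for K can be checked numerically. The sketch below is a minimal illustration, not part of the original slides: it computes K = ceil(log2(O/S + 1)) and compares it with a direct count of how many doubling windows (S, 2S, 4S, ...) are needed to cover O bits. The function names and the sample sizes are hypothetical.

```python
import math


def num_windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), with O and S in the same units (e.g. bits)."""
    return math.ceil(math.log2(O / S + 1))


def num_windows_by_summing(O, S):
    """Smallest k such that S + 2S + ... + 2^(k-1) S >= O, found by direct summation."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S  # the (k+1)th window carries 2^k segments
        k += 1
    return k


if __name__ == "__main__":
    S = 1000  # hypothetical segment size in bits
    for O in (500, 1000, 7000, 15000, 16000):
        print(O, num_windows_closed_form(O, S), num_windows_by_summing(O, S))
```

The closed form relies on floating-point logarithms, so for very large O/S ratios the summation version is the safer reference.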


HTTP Modeling
Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
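The three response-time expressions can be compared side by side with a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the slow-start idle times are passed in as a single optional total (defaulting to zero, so the printed values omit that term), and 1 Kbyte is taken as 1000 bytes when converting the object size to bits.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """(M+1)O/R + (M+1)2RTT + sum of idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """(M+1)O/R + 3RTT + sum of idle times."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """(M+1)O/R + (M/X + 1)2RTT + sum of idle times, assuming M/X is an integer."""
    if M % X != 0:
        raise ValueError("this model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


if __name__ == "__main__":
    O = 5 * 1000 * 8          # 5 Kbytes in bits (assuming 1 Kbyte = 1000 bytes)
    M, X, RTT = 10, 5, 0.1    # 10 images, 5 parallel connections, 100 msec RTT
    for R in (28e3, 100e3, 1e6, 10e6):  # link rates in bits/sec
        print(R,
              non_persistent(M, O, R, RTT),
              persistent(M, O, R, RTT),
              parallel_non_persistent(M, X, O, R, RTT))
```

Because the idle-time term is left out by default, the numbers are a lower bound on each model's response time; a caller who has computed the slow-start idle periods can pass their total through the `idle` argument.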


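HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]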

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give an important improvement, particularly in high delay-bandwidth networks.


Chapter 3 Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                        • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • Transport services and protocols
                                                                                                                                                        • Transport vs network layer
                                                                                                                                                        • Transport-layer protocols
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • Multiplexingdemultiplexing
                                                                                                                                                        • Multiplexingdemultiplexing
                                                                                                                                                        • How demultiplexing works
                                                                                                                                                        • Connectionless demultiplexing
                                                                                                                                                        • Connectionless demux (cont)
                                                                                                                                                        • Connection-oriented demux
                                                                                                                                                        • Connection-oriented demux (cont)
                                                                                                                                                        • Connection-oriented demux Threaded Web Server
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                        • UDP more
                                                                                                                                                        • UDP checksum
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • Principles of Reliable data transfer
                                                                                                                                                        • Reliable data transfer getting started
                                                                                                                                                        • Reliable data transfer getting started
                                                                                                                                                        • Incremental Improvements
                                                                                                                                                        • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                        • Rdt20 channel with bit errors
                                                                                                                                                        • rdt20 FSM specification
                                                                                                                                                        • rdt20 operation with no errors
                                                                                                                                                        • rdt20 error scenario
                                                                                                                                                        • rdt20 has a fatal flaw
                                                                                                                                                        • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                        • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                        • rdt21 discussion
                                                                                                                                                        • rdt22 a NAK-free protocol
                                                                                                                                                        • rdt22 sender receiver fragments
                                                                                                                                                        • rdt30 channels with errors and loss
                                                                                                                                                        • rdt30 sender
                                                                                                                                                        • rdt30 in action
                                                                                                                                                        • rdt30 in action
                                                                                                                                                        • Performance of rdt30
                                                                                                                                                        • rdt30 stop-and-wait operation
                                                                                                                                                        • Pipelined protocols
                                                                                                                                                        • Pipelined protocols
                                                                                                                                                        • Pipelining increased utilization
                                                                                                                                                        • Go-Back-N
                                                                                                                                                        • GBN Sender
                                                                                                                                                        • GBN sender extended FSM
                                                                                                                                                        • GBN receiver extended FSM
                                                                                                                                                        • More on receiver
                                                                                                                                                        • GBN inaction
                                                                                                                                                        • Selective Repeat
                                                                                                                                                        • Selective repeat sender receiver windows
                                                                                                                                                        • Selective repeat
                                                                                                                                                        • Selective repeat in action
                                                                                                                                                        • Selective repeat dilemma
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                        • More TCP Details
                                                                                                                                                        • Even More TCP Details
                                                                                                                                                        • TCP segment structure
                                                                                                                                                        • TCP seq rsquos and ACKs
                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                        • Example RTT estimation
                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • TCP reliable data transfer
                                                                                                                                                        • TCP sender events
                                                                                                                                                        • TCP sender(simplified)
                                                                                                                                                        • TCP retransmission scenarios
                                                                                                                                                        • TCP retransmission scenarios (more)
                                                                                                                                                        • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                        • More on Sender Policies
                                                                                                                                                        • Fast Retransmit
                                                                                                                                                        • Fast retransmit algorithm
                                                                                                                                                        • TCP GBN or Selective Repeat
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • TCP Flow Control
                                                                                                                                                        • TCP Flow Control
                                                                                                                                                        • TCP segment structure
                                                                                                                                                        • TCP Flow control how it works
                                                                                                                                                        • Technical Issue
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • TCP Connection Management
                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                        • A few special cases
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • Principles of Congestion Control
                                                                                                                                                        • Causescosts of congestion scenario 1
                                                                                                                                                        • Causescosts of congestion scenario 2
                                                                                                                                                        • Causescosts of congestion scenario 3
                                                                                                                                                        • Causescosts of congestion scenario 3
                                                                                                                                                        • Approaches towards congestion control
                                                                                                                                                        • Case study ATM ABR congestion control
                                                                                                                                                        • Case study ATM ABR congestion control
                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                        • TCP Congestion Control
                                                                                                                                                        • TCP AIMD
                                                                                                                                                        • TCP Slow Start
                                                                                                                                                        • TCP Slow Start (more)
                                                                                                                                                        • Summary TCP Congestion Control
                                                                                                                                                        • The Big Picture
                                                                                                                                                        • TCP sender congestion control
                                                                                                                                                        • TCP throughput
                                                                                                                                                        • TCP Futures
                                                                                                                                                        • TCP Fairness
                                                                                                                                                        • Why is TCP fair?
                                                                                                                                                        • Fairness (more)
                                                                                                                                                        • TCP Latency Modeling
                                                                                                                                                        • Fixed Congestion Window (W)
                                                                                                                                                        • Fixed congestion window (1)
                                                                                                                                                        • Fixed congestion window (2)
                                                                                                                                                        • TCP Latency Modeling: Slow Start (1)
                                                                                                                                                        • TCP Latency Modeling: Slow Start (2)
                                                                                                                                                        • TCP Latency Modeling (3)
                                                                                                                                                        • TCP Latency Modeling (4)
                                                                                                                                                        • HTTP Modeling
                                                                                                                                                        • Chapter 3 Summary

TCP: GBN or Selective Repeat?

Basic TCP looks a lot like GBN: ACKs are cumulative.
Many TCP implementations buffer out-of-order segments and, once the gap is filled, acknowledge the whole range with a single cumulative ACK.
This looks a lot like Selective Repeat.
TCP is a hybrid of the two.
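
The hybrid behaviour described above can be sketched as a toy receiver. This is only an illustration under simplifying assumptions, not how any particular TCP stack is written: the class name HybridReceiver and its method receive are hypothetical, sequence numbers count bytes and never wrap, and buffer limits are ignored.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy receiver: buffers out-of-order segments and ACKs cumulatively. */
final class HybridReceiver {
    /** Next in-order byte expected; this is also the cumulative ACK value. */
    private long nextExpected;
    /** Buffered out-of-order segments: first byte number -> length in bytes. */
    private final TreeMap<Long, Integer> outOfOrder = new TreeMap<>();

    HybridReceiver(long firstByte) {
        this.nextExpected = firstByte;
    }

    /** Processes one arriving segment and returns the ACK number to send back. */
    long receive(long seq, int length) {
        if (seq > nextExpected) {
            // Gap before this segment: buffer it (the Selective Repeat flavour).
            outOfOrder.merge(seq, length, Integer::max);
        } else if (seq + length > nextExpected) {
            // Segment covers the start of the gap: advance the in-order point.
            nextExpected = seq + length;
            // Absorb every buffered segment that is now contiguous.
            while (!outOfOrder.isEmpty() && outOfOrder.firstKey() <= nextExpected) {
                Map.Entry<Long, Integer> e = outOfOrder.pollFirstEntry();
                nextExpected = Math.max(nextExpected, e.getKey() + e.getValue());
            }
        }
        // Otherwise the segment is an old duplicate and the ACK is unchanged.
        return nextExpected;
    }
}
```

While a gap is open the receiver keeps returning the same ACK number (the GBN-like part); when the missing segment arrives, one ACK covers everything that was buffered.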

TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data.
If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate.
Different from congestion control, whose purpose is to handle congestion in the network.
(But both congestion control and flow control work by slowing down data transmission.)

TCP Flow Control

Flow control: sender won't overflow the receiver's buffer by transmitting too much, too fast.
Receive side of a TCP connection has a receive buffer.
App process may be slow at reading from the buffer.
Speed-matching service: matching the send rate to the receiving app's drain rate.

TCP segment structure

Header rows are 32 bits wide:

  source port | dest port
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | urg data pointer
  options (variable length)
  application data (variable length)

Field notes:
  URG: urgent data (generally not used)
  ACK: ACK number valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  receive window: bytes rcvr is willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
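
As a rough illustration of the layout above, the fixed part of the header can be decoded from raw bytes. The field widths used here (16-bit ports, 32-bit sequence and acknowledgement numbers, a 4-bit header length counted in 32-bit words, 16-bit window, checksum and urgent pointer) come from the standard TCP header layout rather than from the slide, and the class name TcpHeader is hypothetical.

```java
import java.nio.ByteBuffer;

/** Decodes the fixed 20-byte part of a TCP header (network byte order). */
final class TcpHeader {
    final int sourcePort, destPort;
    final long sequenceNumber, ackNumber;
    final int headerLengthBytes;
    final boolean urg, ack, psh, rst, syn, fin;
    final int receiveWindow, checksum, urgentPointer;

    /** Throws BufferUnderflowException if fewer than 20 bytes are supplied. */
    TcpHeader(byte[] segment) {
        ByteBuffer buf = ByteBuffer.wrap(segment); // ByteBuffer is big-endian by default
        sourcePort = buf.getShort() & 0xFFFF;      // mask to read as unsigned
        destPort = buf.getShort() & 0xFFFF;
        sequenceNumber = buf.getInt() & 0xFFFFFFFFL;
        ackNumber = buf.getInt() & 0xFFFFFFFFL;
        int lenAndFlags = buf.getShort() & 0xFFFF;
        headerLengthBytes = (lenAndFlags >>> 12) * 4; // head len counts 32-bit words
        urg = (lenAndFlags & 0x20) != 0;
        ack = (lenAndFlags & 0x10) != 0;
        psh = (lenAndFlags & 0x08) != 0;
        rst = (lenAndFlags & 0x04) != 0;
        syn = (lenAndFlags & 0x02) != 0;
        fin = (lenAndFlags & 0x01) != 0;
        receiveWindow = buf.getShort() & 0xFFFF;
        checksum = buf.getShort() & 0xFFFF;
        urgentPointer = buf.getShort() & 0xFFFF;
    }
}
```

Options and application data, if present, start at byte 20 and at byte headerLengthBytes respectively; the sketch does not parse them.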

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments.)
Spare room in buffer:
  RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including the value of RcvWindow in segments.
Sender limits unACKed data to RcvWindow; this guarantees the receive buffer doesn't overflow.
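
The bookkeeping above is plain arithmetic, sketched here in Java. The receiver-side names follow the slide; the sender-side names lastByteSent and lastByteAcked, and the class name FlowControl, are assumptions introduced only to express "unACKed data".

```java
/** Receive-window arithmetic, using the slide's variable names. */
final class FlowControl {
    private FlowControl() {}

    /** Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]. */
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Bytes the sender may still transmit without exceeding the advertised window. */
    static long sendable(long rcvWindow, long lastByteSent, long lastByteAcked) {
        long unacked = lastByteSent - lastByteAcked; // data in flight, not yet ACKed
        return Math.max(0, rcvWindow - unacked);
    }
}
```

For example, with a 4096-byte buffer, 3000 bytes received and 1000 bytes read by the app, 2000 bytes are still buffered, so the receiver advertises 2096 bytes.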

Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will drain, and an ACK carrying RcvWindow > 0 will be transmitted back to the sender.
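
A minimal simulation of the probing idea follows. The timing model is an assumption made for illustration: the sender probes once per "tick", every probe is answered by an ACK carrying the current RcvWindow, and the receiving app performs a single read at a given tick. The class and parameter names are hypothetical, and the probe byte itself is not added to the buffer.

```java
/** Toy model of one-byte probing while the advertised window is zero. */
final class ZeroWindowProbe {
    private ZeroWindowProbe() {}

    /**
     * Returns how many one-byte probes the sender transmits before an ACK
     * advertises RcvWindow > 0. The receive buffer starts full, and the app
     * reads bytesRead bytes at tick appReadsAtTick (ticks start at 1).
     */
    static int probesUntilWindowOpens(int rcvBuffer, int appReadsAtTick, int bytesRead) {
        if (rcvBuffer < 1 || appReadsAtTick < 1 || bytesRead < 1) {
            throw new IllegalArgumentException("all arguments must be at least 1");
        }
        int buffered = rcvBuffer; // buffer full, so RcvWindow = 0
        int probes = 0;
        for (int tick = 1; ; tick++) {
            if (tick == appReadsAtTick) {
                buffered -= Math.min(bytesRead, buffered); // app drains the buffer
            }
            probes++;                             // sender sends a one-byte probe
            int rcvWindow = rcvBuffer - buffered; // value carried back in the ACK
            if (rcvWindow > 0) {
                return probes;                    // sender learns it may resume
            }
        }
    }
}
```

Without the probes, no segment would ever carry the updated RcvWindow back to the sender, which is the deadlock described above.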

Note on UDP

UDP has no flow control.
UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.

TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
Initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in the ACK field
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that it is synchronized

Message sequence:
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
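
The three segments of the handshake can be written out as data to make the numbering explicit. This is a sketch only: the names Handshake, Segment, threeWay and NO_ACK are hypothetical, and real sequence numbers are 32-bit values that wrap around, which is ignored here.

```java
/** Builds the three control segments of the handshake from the two ISNs. */
final class Handshake {
    private Handshake() {}

    /** Marker meaning "ACK field not meaningful" (the first SYN acknowledges nothing). */
    static final long NO_ACK = -1;

    /** One control segment: SYN bit, sequence number, acknowledgement number. */
    static final class Segment {
        final boolean syn;
        final long seq;
        final long ack;

        Segment(boolean syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }
    }

    static Segment[] threeWay(long clientIsn, long serverIsn) {
        return new Segment[] {
            new Segment(true, clientIsn, NO_ACK),             // Step 1: SYN
            new Segment(true, serverIsn, clientIsn + 1),      // Step 2: SYNACK
            new Segment(false, clientIsn + 1, serverIsn + 1)  // Step 3: ACK, SYN=0
        };
    }
}
```

With client_isn = 100 and server_isn = 300, the SYNACK carries ack=101 and the final ACK carries seq=101, ack=301.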

TCP Connection Management (cont.)

Closing a connection:
client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server.
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

Message sequence (figure):
  client -> server: FIN (client: close)
  server -> client: ACK
  server -> client: FIN (server: close)
  client -> server: ACK (client: timed wait, then closed)

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost).
  Closes down after timed wait.
Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

Message sequence (figure):
  client -> server: FIN (client: closing)
  server -> client: ACK
  server -> client: FIN (server: closing)
  client -> server: ACK (client: timed wait, then closed; server: closed)
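
The client side of the close sequence can be sketched as a small state machine. The state names (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) are the standard TCP state names, not names given in this transcript, and the event names and the class ClientClose are hypothetical. Simultaneous close is deliberately left out.

```java
/** Client-side states while closing a TCP connection (simplified). */
final class ClientClose {
    private ClientClose() {}

    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }

    enum Event { APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRES }

    /** Returns the state reached from s on event e; inapplicable events are ignored. */
    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED: // Step 1: app closes the socket, client sends FIN
                if (e == Event.APP_CLOSE) return State.FIN_WAIT_1;
                break;
            case FIN_WAIT_1:  // Step 2: server ACKs the client's FIN
                if (e == Event.RCV_ACK) return State.FIN_WAIT_2;
                break;
            case FIN_WAIT_2:  // Step 3: server's FIN arrives, client replies with ACK
                if (e == Event.RCV_FIN) return State.TIME_WAIT;
                break;
            case TIME_WAIT:
                // A repeated FIN means the ACK was lost: ACK again and keep waiting.
                if (e == Event.RCV_FIN) return State.TIME_WAIT;
                if (e == Event.TIMED_WAIT_EXPIRES) return State.CLOSED;
                break;
            default:
                break;
        }
        return s;
    }
}
```

The timed wait exists so that the client is still around to re-ACK a retransmitted FIN; only after the wait expires does it move to CLOSED.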

TCP Connection Management (cont.)

Example: TCP server lifecycle [figure]
Example: TCP client lifecycle [figure]

A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.
It is possible for the first ACK (from server) and the second FIN (also from server) to be sent in the same segment.

Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queueing in router buffers)
  a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout
(here λin is the rate of original data and λ'in is the offered load: original data plus retransmissions)

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as λin and λ'in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                                                          Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception - see next page

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded":
    sender should use available bandwidth
  if sender's path congested:
    sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
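The RM-cell handling described on these two slides can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the ATM specification: the RMCell class, the switch_process and destination_return helpers, and the rate values are hypothetical names and numbers chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate carried in the cell (hypothetical units)
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit (severe congestion)


def switch_process(cell, supportable_rate, mild=False, severe=False):
    """A switch may lower ER and set NI/CI; it never raises ER."""
    cell.er = min(cell.er, supportable_rate)
    cell.ni = cell.ni or mild
    cell.ci = cell.ci or severe
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


# An RM cell crosses three switches that can support 80, 50 and 120.
cell = RMCell(er=100.0)
for rate in (80.0, 50.0, 120.0):
    switch_process(cell, rate)
destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)  # 50.0 True
```

The returned ER ends up as the minimum supportable rate along the path, which is what the sender then uses as its send rate.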



TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

w segments, each with MSS bytes, sent in one RTT:

  throughput = (w × MSS) / RTT bytes/sec
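As a quick check of the formula, the calculation can be written out in Python. The function name and the sample values (10 segments, 1460-byte MSS, 250 ms RTT) are assumptions made for this example and do not come from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical values: 10 segments of 1460 bytes per 0.25 s round trip.
rate = window_throughput(w_segments=10, mss_bytes=1460, rtt_seconds=0.25)
print(rate)  # 58400.0 bytes/sec
```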


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
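The sending rule LastByteSent - LastByteAcked ≤ CongWin can be read as "the sender may only put CongWin bytes in flight". A minimal sketch follows, assuming (as the slide does) that the receive buffer never limits the sender; the function name and byte counts are hypothetical.

```python
def usable_window(last_byte_sent, last_byte_acked, cong_win):
    """Bytes the sender may still transmit without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


# 4000 bytes are unacknowledged and CongWin is 6000 bytes.
print(usable_window(last_byte_sent=9000, last_byte_acked=5000, cong_win=6000))  # 2000
# After a loss event shrinks CongWin to 3000, nothing more may be sent yet.
print(usable_window(last_byte_sent=9000, last_byte_acked=5000, cong_win=3000))  # 0
```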


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing. Also known as congestion avoidance.

[Figure: congestion window (axis marked 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
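The sawtooth behaviour of AIMD can be reproduced with a small simulation. This is a sketch under simplifying assumptions: the window is counted in whole MSS units, one step is one RTT, and the rounds in which loss occurs are supplied by hand. The function names are made up for this example.

```python
def aimd_step(cong_win, loss):
    """One RTT of AIMD, with the window measured in MSS units."""
    if loss:
        return max(1, cong_win // 2)   # multiplicative decrease
    return cong_win + 1                # additive increase


def aimd_trace(start, loss_rounds, rounds):
    """Window size at the start of each RTT."""
    win, history = start, []
    for r in range(rounds):
        history.append(win)
        win = aimd_step(win, loss=(r in loss_rounds))
    return history


print(aimd_trace(start=8, loss_rounds={4}, rounds=8))
# [8, 9, 10, 11, 12, 6, 7, 8]
```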


                                                                                                                                                          TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
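The 20 kbps figure in the example follows from sending one MSS per RTT:

```latex
\[
\text{initial rate} = \frac{\text{MSS}}{\text{RTT}}
  = \frac{500 \times 8\ \text{bits}}{0.2\ \text{s}}
  = 20{,}000\ \text{bits/s} = 20\ \text{kbps}
\]
```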


                                                                                                                                                          TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments in successive RTTs]
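Why incrementing CongWin on every ACK doubles it each RTT can be seen in a short simulation. This is a sketch that assumes no loss and that every segment in the window is ACKed within the same RTT; the function name is hypothetical.

```python
def slow_start_round(cong_win_mss):
    """One RTT of slow start: each ACK received grows the window by 1 MSS."""
    acks_this_round = cong_win_mss      # one ACK per segment sent
    for _ in range(acks_this_round):
        cong_win_mss += 1
    return cong_win_mss


win = 1
sizes = [win]
for _ in range(3):
    win = slow_start_round(win)
    sizes.append(win)
print(sizes)  # [1, 2, 4, 8]
```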


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable: threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
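The difference between the two variants comes down to how CongWin is reset after a loss. A minimal sketch, assuming the window is counted in whole MSS units; the on_loss function and its string arguments are hypothetical names for this example.

```python
def on_loss(cong_win, event, variant="reno"):
    """Return (threshold, cong_win) in MSS units after a loss event.

    event is "timeout" or "3dupack"; variant is "reno" or "tahoe".
    """
    threshold = max(cong_win // 2, 1)
    if event == "3dupack" and variant == "reno":
        return threshold, threshold   # fast recovery: skip slow start
    return threshold, 1               # timeout, or any loss in Tahoe


print(on_loss(16, "3dupack", "reno"))    # (8, 8)
print(on_loss(16, "timeout", "reno"))    # (8, 1)
print(on_loss(16, "3dupack", "tahoe"))   # (8, 1)
```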


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                          The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
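The event table maps naturally onto a small state machine. The sketch below is one possible reading of it, not an actual TCP implementation: the TcpSender class and its method names are hypothetical, the window is measured in units of one MSS, and resetting the duplicate-ACK counter on a new ACK is an assumption the table does not spell out.

```python
MSS = 1.0  # assumption: measure the window in units of one MSS


class TcpSender:
    def __init__(self, threshold):
        self.cong_win = MSS
        self.threshold = threshold
        self.state = "SS"       # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0       # assumption: the count restarts on a new ACK
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


s = TcpSender(threshold=4)
for _ in range(4):
    s.on_new_ack()
after_acks = (s.state, s.cong_win)                 # ("CA", 5.0)
for _ in range(3):
    s.on_dup_ack()
after_dups = (s.state, s.cong_win, s.threshold)    # ("CA", 2.5, 2.5)
s.on_timeout()
after_timeout = (s.state, s.cong_win, s.threshold) # ("SS", 1.0, 1.25)
print(after_acks, after_dups, after_timeout)
```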


                                                                                                                                                          TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
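One way to see where the 0.75 factor comes from, assuming the idealized sawtooth in which the window climbs linearly from W/2 back to W between loss events, is to average the two endpoints of the ramp:

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,\text{RTT}} + \frac{W}{\text{RTT}}\right)
  = \frac{3}{4}\cdot\frac{W}{\text{RTT}}
\]
```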


                                                                                                                                                          TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

  throughput = 1.22 × MSS / (RTT × √L)

→ L = 2 × 10^-10. Wow!
New versions of TCP for high-speed needed!
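Both numbers in this example can be reproduced by plugging the stated values into the window formula and by solving the loss-rate formula for L. A short sketch in Python follows; the function names are made up for this example.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(round(w))          # 83333
print(f"{loss:.1e}")     # 2.1e-10
```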


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
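The convergence argument can be checked numerically. The sketch below is a simplified model, not real TCP: both flows are assumed to have the same RTT, to increase by one unit per round, and to halve together whenever their combined rate exceeds the link capacity. The function name and the starting rates are hypothetical.

```python
def share_link(x1, x2, capacity, rounds):
    """Two AIMD flows on one bottleneck; both see loss when it is overloaded."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # loss: both windows cut by factor of 2
        else:
            x1, x2 = x1 + 1, x2 + 1    # additive increase, slope 1
    return x1, x2


# Start far from the fair share: 80 versus 10 on a link of capacity 100.
x1, x2 = share_link(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(abs(x1 - x2) < 1.0)  # True
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts the gap in half, so the rates drift toward the equal bandwidth share line.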


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2
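The arithmetic in the parallel-connection example can be checked with a short sketch. It assumes, as an idealization, that the link rate R is split equally among all TCP connections sharing it; the function name `app_share` is hypothetical and used only for illustration.

```python
from fractions import Fraction


def app_share(existing: int, opened: int) -> Fraction:
    """Fraction of the link rate R obtained by an app that opens `opened`
    connections on a link already carrying `existing` connections.

    Assumption: R is divided equally per TCP connection (idealized fairness).
    """
    return Fraction(opened, existing + opened)


print(app_share(9, 1))    # 1/10 of R
print(app_share(9, 11))   # 11/20 of R, slightly more than R/2
```

Under this assumption the share grows with the number of connections the app opens, which is why parallel connections undermine per-connection fairness.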


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. $WS/R > \mathrm{RTT} + S/R$: ACK for first segment in window returns before window's worth of data sent

   Latency = $2\,\mathrm{RTT} + O/R$

2. $WS/R < \mathrm{RTT} + S/R$: ACK for first segment in window returns after window's worth of data sent

   Latency = $2\,\mathrm{RTT} + O/R + (K-1)[S/R + \mathrm{RTT} - WS/R]$


Fixed congestion window (1)

First case: $WS/R > \mathrm{RTT} + S/R$: ACK for first segment in window returns before window's worth of data sent

latency = $2\,\mathrm{RTT} + O/R$


Fixed congestion window (2)

Second case: $WS/R < \mathrm{RTT} + S/R$: server waits for ACK after sending window's worth of data

latency = $2\,\mathrm{RTT} + O/R + (K-1)[S/R + \mathrm{RTT} - WS/R]$
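The two fixed-window cases can be folded into one small function. This is a minimal sketch, not a definitive implementation: the function name is hypothetical, and it assumes K (the number of windows that cover the object) equals O/(WS) rounded up, which is not stated explicitly on these slides.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency (seconds) under the fixed-congestion-window model.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    """
    base = 2 * RTT + O / R                 # connection setup + request + transmission
    stall = S / R + RTT - W * S / R        # server idle time after each window
    if stall <= 0:
        return base                        # case 1: ACK returns before window is sent
    K = math.ceil(O / (W * S))             # assumption: K = ceil(O / (W * S))
    return base + (K - 1) * stall          # case 2: server stalls after K-1 windows


# Illustrative numbers (assumed): 16 segments of 1000 bits, S/R = 1 s, RTT = 2 s
print(fixed_window_latency(O=16_000, S=1_000, R=1_000, RTT=2.0, W=4))  # case 1
print(fixed_window_latency(O=16_000, S=1_000, R=1_000, RTT=2.0, W=2))  # case 2
```

With W = 4 the window takes longer to send than the ACK takes to return, so only the base term remains; with W = 2 the server stalls after each of the first K - 1 windows.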


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$
\text{Latency} = 2\,\mathrm{RTT} + \frac{O}{R} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
$$

where P is the number of times TCP idles at server:

$$
P = \min\{Q,\ K-1\}
$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram, time at client vs. time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
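The example values can be reproduced numerically. The sketch below is illustrative only: the helper names are hypothetical, K and Q are found by direct search rather than by closed form, and the concrete R and RTT are assumptions chosen so that Q = 2 (the slide gives only O/S, K, Q and P).

```python
def windows_to_cover(O: int, S: int) -> int:
    """K: smallest k such that windows of 1, 2, 4, ... segments cover O bits."""
    k = 1
    while (2**k - 1) * S < O:      # first k windows carry (2^k - 1) segments
        k += 1
    return k


def idles_for_infinite_object(S: float, R: float, RTT: float) -> int:
    """Q: number of windows after which the server would idle
    if the object were of infinite size."""
    q = 0
    # idle time after window q+1 is S/R + RTT - 2^q * S/R
    while S / R + RTT - 2**q * S / R > 0:
        q += 1
    return q


def slow_start_latency(O: int, S: int, R: float, RTT: float):
    """Return (K, Q, P, latency) for the slow-start model (no threshold, no loss)."""
    K = windows_to_cover(O, S)
    Q = idles_for_infinite_object(S, R, RTT)
    P = min(K - 1, Q)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2**P - 1) * S / R
    return K, Q, P, latency


# Assumed numbers: O/S = 15 segments, S/R = 1 s, RTT = 2 s
print(slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2.0))  # (4, 2, 2, 22.0)
```

The 22 seconds break down as 4 s for the two RTTs, 15 s of transmission, and 3 s of idle time (2 s after the first window, 1 s after the second).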


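TCP Latency Modeling (3)

$\frac{S}{R} + \mathrm{RTT}$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{k=1}^{P}\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}
$$

[Figure: timing diagram, time at client vs. time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]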


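TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2(O/S + 1) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.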
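The slides leave the calculation of Q out. As a sketch of how it might go, assume Q counts the windows k whose stall term $S/R + \mathrm{RTT} - 2^{k-1}S/R$ is nonnegative (a window where the term is exactly zero adds nothing to the latency, so counting it is harmless):

$$
\begin{aligned}
Q &= \max\{k : \mathrm{RTT} + S/R - 2^{k-1}S/R \ge 0\} \\
&= \max\{k : 2^{k-1} \le 1 + \mathrm{RTT}/(S/R)\} \\
&= \max\{k : k \le \log_2(1 + \mathrm{RTT}/(S/R)) + 1\} \\
&= \left\lfloor \log_2\!\left(1 + \frac{\mathrm{RTT}}{S/R}\right) \right\rfloor + 1
\end{aligned}
$$

For instance, if RTT is twice S/R, this gives $Q = \lfloor \log_2 3 \rfloor + 1 = 2$.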


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
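To compare the three formulas side by side, the sketch below evaluates them for one set of parameters. It is a simplification under stated assumptions: the function name is hypothetical, the "sum of idle times" term is passed in as a single value and defaults to 0 (in the full model it differs per variant), and 5 Kbytes is taken as 40,000 bits.

```python
def http_response_times(O: float, R: float, RTT: float, M: int, X: int,
                        idle: float = 0.0) -> dict:
    """Response times (seconds) for the three HTTP variants.

    O: size of each object (bits), R: link rate (bits/s), RTT: round-trip time (s),
    M: number of images, X: number of parallel connections (M/X assumed integer),
    idle: the 'sum of idle times' term, assumed 0 here for simplicity.
    """
    transmit = (M + 1) * O / R   # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }


# Assumed parameters: RTT = 100 msec, O = 5 Kbytes (40,000 bits), M = 10, X = 5, R = 1 Mbps
for name, seconds in http_response_times(O=40_000, R=1e6, RTT=0.1, M=10, X=5).items():
    print(f"{name}: {seconds:.2f} s")
```

Because the transmission term (M+1)O/R is the same in all three, the variants differ only in how many round trips they pay for, which matters more as R or RTT grows.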


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control


                                                                                                                                                            TCP Flow Control

Sender should not overwhelm receiver's capacity to receive data
If necessary, sender should slow down its transmission rate to accommodate receiver's rate
Different from congestion control, whose purpose is to handle congestion in the network (but both congestion control and flow control work by slowing down data transmission)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
receive side of TCP connection has a receive buffer
speed-matching service: matching the send rate to the receiving app's drain rate
app process may be slow at reading from buffer


TCP segment structure

Header layout, one 32-bit row per line:

source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | Receive window
checksum | Urg data pnter
Options (variable length)
application data (variable length)

Field notes:

URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
sequence number, acknowledgement number: counting by bytes of data (not segments)
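As a rough illustration of the layout above, the fixed 20-byte part of the header can be unpacked with Python's standard `struct` module. This is only a sketch: the names `TcpHeader` and `parse_tcp_header` are made up for this example, the sample port and sequence values are arbitrary, and the flag mask keeps just the six flag bits shown on the slide.

```python
import struct
from dataclasses import dataclass

# Flag bit masks (low 6 bits of the 16-bit "head len / not used / flags" word).
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


@dataclass
class TcpHeader:  # hypothetical container, not part of any library
    src_port: int
    dst_port: int
    seq: int          # sequence number, counted in bytes
    ack: int          # acknowledgement number, counted in bytes
    head_len: int     # header length in bytes
    flags: int
    rcv_window: int   # bytes the receiver is willing to accept
    checksum: int
    urg_ptr: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Unpack the fixed 20-byte TCP header (options are not parsed)."""
    src, dst, seq, ack, off_flags, win, csum, urg = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )
    head_len = (off_flags >> 12) * 4   # field is in 32-bit words
    flags = off_flags & 0x3F           # keep the six flag bits from the slide
    return TcpHeader(src, dst, seq, ack, head_len, flags, win, csum, urg)


# Build a sample SYN segment header with arbitrary values and parse it back.
sample = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (5 << 12) | SYN, 4096, 0, 0)
hdr = parse_tcp_header(sample)
print(hdr.src_port, hdr.dst_port, hdr.head_len, bool(hdr.flags & SYN))
```

The `!` prefix selects network (big-endian) byte order, which is how the header fields are carried on the wire.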


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
    guarantees receive buffer doesn't overflow
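The window arithmetic above can be sketched in a few lines of Python. The receiver-side formula is taken directly from the slide; the sender-side check uses two variables, `last_byte_sent` and `last_byte_acked`, that are assumptions introduced here to express "unACKed data", and the numbers are arbitrary.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent: int, last_byte_acked: int,
                    n_new_bytes: int, window: int) -> bool:
    """True if sending n_new_bytes keeps unACKed data within the window.

    last_byte_sent and last_byte_acked are illustrative sender-side
    variables (an assumption of this sketch, not named on the slide).
    """
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_new_bytes <= window


# Receiver: 4096-byte buffer, 3000 bytes received, app has read 1000 of them.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(window)  # 2096

# Sender: 1500 bytes in flight, so at most 596 more bytes fit in the window.
print(sender_may_send(3000, 1500, 596, window))  # True
print(sender_may_send(3000, 1500, 597, window))  # False
```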


                                                                                                                                                            Technical Issue

Suppose RcvWindow = 0 and the receiver has already ACKed all packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will drain, and RcvWindow > 0 will be transmitted back to the sender.
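The deadlock and its fix can be seen in a toy, tick-based simulation. Everything here is an assumption made for illustration: the 4-byte buffer, the application that starts reading 2 bytes per tick at tick 2, and the rule that the sender transmits exactly the advertised window each tick. It is not a model of a real TCP stack.

```python
def simulate(probe_when_zero: bool, ticks: int = 10) -> int:
    """Return how many data bytes the receiver accepted after `ticks` steps."""
    RCV_BUFFER = 4            # toy receive-buffer size in bytes
    buffered = RCV_BUFFER     # buffer starts full ...
    advertised = 0            # ... so the sender last heard RcvWindow = 0
    delivered = 0

    for tick in range(ticks):
        # The receiving application is slow: it only starts reading at tick 2.
        if tick >= 2:
            buffered -= min(2, buffered)

        if advertised > 0:
            to_send = advertised      # normal case: fill the advertised window
        elif probe_when_zero:
            to_send = 1               # one-byte segment sent while window is 0
        else:
            continue                  # sender is silent, so no ACK comes back

        room = RCV_BUFFER - buffered
        accepted = min(to_send, room)  # bytes that do not fit are discarded
        buffered += accepted
        delivered += accepted
        advertised = RCV_BUFFER - buffered  # the ACK carries the current RcvWindow

    return delivered


print(simulate(probe_when_zero=False))  # 0: the sender never learns the window opened
print(simulate(probe_when_zero=True) > 0)  # True: probes elicit ACKs with RcvWindow > 0
```

Without the one-byte segments the sender's view of the window is never refreshed, even though the buffer drains; with them, the first ACK after the application reads reopens the window.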


                                                                                                                                                            Note on UDP

                                                                                                                                                            UDP has no flow control

UDP appends packets to the receiving socket's buffer
If the buffer is full, packets are lost
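A minimal sketch of this drop-on-full behaviour, assuming a buffer whose capacity is counted in datagrams (a simplification chosen here to keep the example short). The class name `UdpSocketBuffer` is hypothetical.

```python
from collections import deque


class UdpSocketBuffer:  # hypothetical model, not a real socket API
    def __init__(self, capacity: int):
        self.capacity = capacity   # max datagrams held (simplifying assumption)
        self.queue = deque()
        self.dropped = 0

    def on_datagram(self, data: bytes) -> bool:
        """Append an arriving datagram, or drop it if the buffer is full."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1      # no flow control: the sender is never told
            return False
        self.queue.append(data)
        return True

    def recv(self):
        """Application reads the oldest datagram, or None if the buffer is empty."""
        return self.queue.popleft() if self.queue else None


buf = UdpSocketBuffer(capacity=2)
results = [buf.on_datagram(d) for d in (b"a", b"b", b"c")]
print(results, buf.dropped)  # [True, True, False] 1
```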



                                                                                                                                                            3 Transport Layer 86Comp 361 Spring 2005

                                                                                                                                                            TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq. #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

                                                                                                                                                            3 Transport Layer 87Comp 361 Spring 2005

                                                                                                                                                            TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake timeline]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
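The sequence and acknowledgment numbers in the three handshake segments can be traced with a small model. This is only a sketch: the `Segment` class, the method names, and the ISN values are assumptions, and it models header fields only, not real sockets. It also wraps at 2^32, since TCP sequence numbers are 32-bit.

```java
// Minimal model of the three-way handshake fields described above.
// The Segment class and the ISN values are illustrative assumptions.
class HandshakeSketch {
    static final long MOD = 1L << 32; // TCP sequence numbers are 32-bit and wrap

    static final class Segment {
        final boolean syn;
        final long seq;
        final long ack; // value of the ACK field; -1 means "no ACK carried"

        Segment(boolean syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " seq=" + seq
                    + (ack >= 0 ? " ack=" + ack : "");
        }
    }

    // Step 1: client sends SYN carrying client_isn.
    static Segment syn(long clientIsn) {
        return new Segment(true, clientIsn % MOD, -1);
    }

    // Step 2: server replies with SYNACK: its own ISN and ack = client_isn + 1.
    static Segment synAck(Segment syn, long serverIsn) {
        return new Segment(true, serverIsn % MOD, (syn.seq + 1) % MOD);
    }

    // Step 3: client replies with SYN=0, seq = client_isn + 1, ack = server_isn + 1.
    static Segment finalAck(Segment synAck) {
        return new Segment(false, synAck.ack, (synAck.seq + 1) % MOD);
    }

    public static void main(String[] args) {
        Segment s1 = syn(1000);
        Segment s2 = synAck(s1, 5000);
        Segment s3 = finalAck(s2);
        System.out.println("client -> server: " + s1);
        System.out.println("server -> client: " + s2);
        System.out.println("client -> server: " + s3);
    }
}
```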


                                                                                                                                                            TCP Connection Management (cont)

                                                                                                                                                            Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

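[Figure: connection-close timeline between client and server. Client closes and sends FIN; server replies with ACK, closes, and sends its own FIN; client replies with ACK and enters timed wait before the connection is closed.]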


                                                                                                                                                            TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK
  enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost)
  closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs
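The client side of the four steps above can be sketched as a small state machine. The state names (`FIN_WAIT_1`, `FIN_WAIT_2`, `TIME_WAIT`) follow the conventional TCP state diagram and are an assumption here, since the slides name only "timed wait" and "closed"; the event names are hypothetical. Simultaneous FINs are deliberately not modelled.

```java
// Client-side connection teardown as a tiny state machine.
// State and event names are assumptions borrowed from the usual TCP state
// diagram; this sketch ignores simultaneous close.
class CloseSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }
    enum Event { APP_CLOSE, RCV_ACK, RCV_FIN, TIMEOUT }

    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED:
                // Step 1: application calls close(), client sends FIN.
                if (e == Event.APP_CLOSE) return State.FIN_WAIT_1;
                break;
            case FIN_WAIT_1:
                // Step 2 (first half): server ACKs the client's FIN.
                if (e == Event.RCV_ACK) return State.FIN_WAIT_2;
                break;
            case FIN_WAIT_2:
                // Step 3: server's FIN arrives, client ACKs and starts timed wait.
                if (e == Event.RCV_FIN) return State.TIME_WAIT;
                break;
            case TIME_WAIT:
                // A retransmitted FIN is ACKed again; the state does not change.
                if (e == Event.RCV_FIN) return State.TIME_WAIT;
                if (e == Event.TIMEOUT) return State.CLOSED;
                break;
            default:
                break;
        }
        return s; // events that do not apply are ignored in this sketch
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] events = {
            Event.APP_CLOSE, Event.RCV_ACK, Event.RCV_FIN, Event.RCV_FIN, Event.TIMEOUT
        };
        for (Event e : events) {
            State t = next(s, e);
            System.out.println(s + " --" + e + "--> " + t);
            s = t;
        }
    }
}
```

The repeated `RCV_FIN` in `main` corresponds to the case where the client's ACK was lost and the server retransmits its FIN during the timed wait.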

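[Figure: connection-close timeline between client and server, continued. Both sides are closing while FIN and ACK segments are exchanged; the client passes through timed wait before it is closed, and the server is closed once it receives the final ACK.]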


                                                                                                                                                            TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                                                            A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.


                                                                                                                                                            Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!


Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.

"costs" of congestion:
(b) and (c): more work (retransmissions) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels, (a), (b) and (c)]


Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                                                            Approaches towards congestion control

                                                                                                                                                            Two broad approaches towards congestion control

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
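Why the ER field ends up holding the minimum supportable rate on the path can be seen in a few lines: each switch may only lower the value. The sketch below is a simplified model under assumed names and rates, not ATM cell processing as specified.

```java
// Hypothetical sketch of how the ER (explicit rate) field comes to hold the
// minimum supportable rate on the path. Rates and names are assumptions.
class ErFieldSketch {
    // A switch may lower, but never raise, the ER value carried in the RM cell.
    static double atSwitch(double erInCell, double supportableRate) {
        return Math.min(erInCell, supportableRate);
    }

    // Pass an RM cell through every switch on the path and return the final ER.
    static double erAfterPath(double requestedRate, double[] supportableRates) {
        double er = requestedRate;
        for (double r : supportableRates) {
            er = atSwitch(er, r);
        }
        return er;
    }

    public static void main(String[] args) {
        double[] path = {8.0, 3.0, 6.0}; // per-switch supportable rates, arbitrary units
        System.out.println("ER returned to sender = " + erAfterPath(10.0, path));
    }
}
```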


TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window of size CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
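As a quick numeric check of the throughput formula, the following sketch evaluates it for one set of values. The window size, MSS, and RTT in `main` are made-up inputs, and the method name is hypothetical.

```java
// Evaluates throughput = w * MSS / RTT for illustrative (assumed) inputs.
class ThroughputSketch {
    // RTT is passed in milliseconds, so multiply by 1000 to get bytes per second.
    static double bytesPerSecond(int w, int mssBytes, int rttMillis) {
        return w * mssBytes * 1000.0 / rttMillis;
    }

    public static void main(String[] args) {
        // Assumed example: w = 10 segments, MSS = 500 bytes, RTT = 200 msec.
        System.out.println(bytesPerSecond(10, 500, 200) + " bytes/sec");
    }
}
```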


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
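The window constraint and the loss-event rule translate directly into two predicates. This is a sketch with hypothetical method names; byte counters are plain `long` values and sequence-number wraparound is ignored.

```java
// Sender-side checks described above, written as two predicates.
// Method names are assumptions; sequence-number wraparound is ignored.
class SendWindowSketch {
    // True if sending `bytes` more bytes keeps
    // LastByteSent - LastByteAcked <= CongWin.
    static boolean maySend(long lastByteSent, long lastByteAcked,
                           long congWin, long bytes) {
        return (lastByteSent + bytes) - lastByteAcked <= congWin;
    }

    // Loss event = timeout or 3 duplicate ACKs.
    static boolean isLossEvent(boolean timeout, int duplicateAcks) {
        return timeout || duplicateAcks >= 3;
    }

    public static void main(String[] args) {
        // 3000 bytes in flight, CongWin = 4000: one more 1000-byte segment fits.
        System.out.println(maySend(5000, 2000, 4000, 1000));
        // A second one would exceed the window.
        System.out.println(maySend(6000, 2000, 4000, 1000));
        System.out.println(isLossEvent(false, 3));
    }
}
```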


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing; also known as congestion avoidance

[Figure: congestion window of a long-lived TCP connection plotted against time; window axis marked at 8, 16 and 24 Kbytes]
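The sawtooth produced by AIMD can be reproduced with two update rules. The sketch below is a toy trace, not a TCP implementation: the loss schedule in `main` is invented, and the floor of one MSS after halving is an added assumption so the window never reaches zero.

```java
// Toy AIMD trace. The loss schedule and the one-MSS floor are assumptions.
class AimdSketch {
    // Additive increase: +1 MSS per RTT without loss.
    static int onRttWithoutLoss(int congWin, int mss) {
        return congWin + mss;
    }

    // Multiplicative decrease: halve CongWin on a loss event
    // (never below one MSS in this sketch).
    static int onLossEvent(int congWin, int mss) {
        return Math.max(congWin / 2, mss);
    }

    public static void main(String[] args) {
        int mss = 1000;      // bytes, illustrative
        int congWin = 8000;  // bytes, illustrative starting point
        for (int rtt = 1; rtt <= 20; rtt++) {
            boolean loss = (rtt % 8 == 0); // invented: a loss event every 8th RTT
            congWin = loss ? onLossEvent(congWin, mss) : onRttWithoutLoss(congWin, mss);
            System.out.println("RTT " + rtt + ": CongWin = " + congWin
                    + (loss ? " (loss)" : ""));
        }
    }
}
```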


                                                                                                                                                            TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event


                                                                                                                                                            TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: slow-start timeline between Host A and Host B. Host A sends one segment, then two segments, then four segments, one RTT apart.]
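The doubling can be checked with a few lines of Python. This is only a sketch under simplifying assumptions: no loss, every segment is ACKed within one RTT, and each ACK grows CongWin by 1 MSS. The function name `slow_start_rounds` and its parameters are made up for illustration; the defaults reuse the MSS = 500 bytes, RTT = 200 msec example.

```python
def slow_start_rounds(mss_bytes=500, rtt_s=0.2, rounds=4):
    """Return (round, CongWin in segments, rate in bps) for each RTT of slow start."""
    congwin = 1  # CongWin starts at 1 MSS
    history = []
    for rnd in range(rounds):
        rate_bps = congwin * mss_bytes * 8 / rtt_s
        history.append((rnd, congwin, rate_bps))
        # Every segment sent this round is ACKed, and each ACK adds 1 MSS,
        # so the window doubles from one RTT to the next.
        acks_received = congwin
        congwin += acks_received
    return history


for rnd, congwin, rate_bps in slow_start_rounds():
    print(f"RTT {rnd}: CongWin = {congwin} MSS, rate = {rate_bps / 1000:.0f} kbps")
```

The first round reproduces the 20 kbps initial rate, and the window sequence 1, 2, 4, 8 matches the segment counts in the figure.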


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:

  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


Reason for treating 3 dup ACKs differently than timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol TCP Tahoe treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


                                                                                                                                                            Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)
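The four rules can be traced round by round to see where Tahoe and Reno diverge. The sketch below is a per-RTT model, not a real TCP implementation. It assumes windows are counted in whole MSS, that CongWin equal to Threshold already counts as congestion avoidance, and that slow start does not overshoot Threshold; the function `window_trace` and its event names are hypothetical.

```python
def window_trace(events, variant="reno", congwin=1, threshold=8):
    """Return CongWin (in MSS) after each per-RTT event.

    events: iterable of "ack" (a loss-free RTT), "3dup", or "timeout".
    variant: "reno" or "tahoe".
    """
    trace = []
    for event in events:
        if event == "ack":
            if congwin < threshold:
                # Slow start: double, but do not overshoot Threshold (assumption).
                congwin = min(2 * congwin, threshold)
            else:
                # Congestion avoidance: linear growth, 1 MSS per RTT.
                congwin += 1
        elif event == "3dup" and variant == "reno":
            # Reno: halve the window and continue in congestion avoidance.
            threshold = max(congwin // 2, 1)
            congwin = threshold
        elif event in ("3dup", "timeout"):
            # Timeout, or any loss event in Tahoe: restart from 1 MSS.
            threshold = max(congwin // 2, 1)
            congwin = 1
        else:
            raise ValueError(f"unknown event: {event}")
        trace.append(congwin)
    return trace


events = ["ack"] * 4 + ["3dup", "ack"]
print("Reno: ", window_trace(events, "reno"))
print("Tahoe:", window_trace(events, "tahoe"))
```

With the same event sequence, Reno resumes from half the window after the triple duplicate ACK, while Tahoe falls back to 1 MSS and slow-starts again.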


                                                                                                                                                            The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS. If (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2, CongWin = Threshold. Set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2, CongWin = 1 MSS. Set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
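The event table maps naturally onto a small state machine. The following is a minimal sketch, assuming windows are tracked in bytes and that the duplicate-ACK counter resets on every new ACK (the table does not say when it resets). The class `RenoSender` and its method names are invented for this illustration and do not correspond to any real TCP stack.

```python
class RenoSender:
    """Toy model of the sender-side congestion-control table."""

    def __init__(self, mss=1000, threshold=64_000):
        self.mss = mss
        self.congwin = mss          # start with 1 MSS
        self.threshold = threshold
        self.state = "SS"           # "SS" (slow start) or "CA" (congestion avoidance)
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0           # assumption: counter resets on a new ACK
        if self.state == "SS":
            self.congwin += self.mss
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # About 1 MSS per RTT, spread over the ACKs of one window.
            self.congwin += self.mss * (self.mss / self.congwin)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            # Multiplicative decrease, never below 1 MSS.
            self.threshold = max(self.congwin / 2, self.mss)
            self.congwin = self.threshold
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve Threshold and re-enter slow start at 1 MSS."""
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.congwin)
```

In this run the sender leaves slow start on the fourth ACK, when CongWin first exceeds Threshold.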


                                                                                                                                                            TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
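The 0.75 factor is simply the mean of a window that climbs linearly from W/2 to W. A short numerical check, assuming one window step per RTT and an even W (the helper name `average_sawtooth_throughput` is hypothetical):

```python
def average_sawtooth_throughput(w_segments, rtt_s):
    """Mean rate (segments per second) over one AIMD cycle from W/2 up to W."""
    # One window value per RTT: W/2, W/2 + 1, ..., W.
    windows = range(w_segments // 2, w_segments + 1)
    return sum(w / rtt_s for w in windows) / len(windows)


w, rtt = 100, 0.1
print(average_sawtooth_throughput(w, rtt), 0.75 * w / rtt)
```

Both printed values agree up to floating-point rounding.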


                                                                                                                                                            TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

$L = 2 \cdot 10^{-10}$ Wow
New versions of TCP for high-speed needed
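Both numbers in the example follow from the formulas already given: W = throughput × RTT / MSS, and the loss-rate relation solved for L. A small sketch (the function names are made up for illustration):

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


print(required_window(10e9, 0.1, 1500))      # about 83,333 segments
print(required_loss_rate(10e9, 0.1, 1500))   # about 2e-10
```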


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, up to R) versus Connection 1 throughput (horizontal axis, up to R), with the equal-bandwidth-share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
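The convergence argument can be reproduced numerically: additive increase leaves the gap between the two rates unchanged, while every halving cuts the gap in half. The sketch below assumes both sessions have the same RTT and see every loss at the same time; `aimd_two_flows` and its parameters are hypothetical names.

```python
def aimd_two_flows(x1, x2, capacity, rounds=500, step=1.0):
    """Run synchronized AIMD for two flows and return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows decrease their window by a factor of 2.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both flows increase by the same amount.
            x1, x2 = x1 + step, x2 + step
    return x1, x2


x1, x2 = aimd_two_flows(80.0, 10.0, capacity=100.0)
print(f"flow 1 = {x1:.3f}, flow 2 = {x2:.3f}, gap = {abs(x1 - x2):.6f}")
```

Starting from a very unequal split, the gap shrinks toward zero, which is the movement toward the equal-bandwidth-share line in the figure.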


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  Nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2
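The arithmetic behind the example, assuming the link is divided equally among all connections (the function name `app_share` is made up for illustration):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by the app opening the new connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # share with a single connection
print(app_share(9, 11))   # share with 11 parallel connections
```

With 11 connections the app holds 11 of the 20 connections on the link, which is slightly more than half of R.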


TCP Latency Modeling

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   (K is the number of windows that cover the object)


                                                                                                                                                            Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                                                                                                            Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
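Both cases fit in one small function. This is a sketch under the stated model (one link, no loss), with one extra assumption that the slides leave implicit here: K is taken as the number of W-segment windows needed to cover the object, rounded up. The function name `fixed_window_latency` is hypothetical.

```python
import math


def fixed_window_latency(O, S, R, rtt, W):
    """Latency (seconds) to fetch an object of O bits with a fixed window of W segments.

    S is the segment size in bits, R the link rate in bps, rtt the round-trip time in seconds.
    """
    K = math.ceil(O / (W * S))          # assumption: windows covering the object
    stall = S / R + rtt - W * S / R     # idle time after each window
    if stall <= 0:
        # Case 1: ACKs return before the window is exhausted, so the server never idles.
        return 2 * rtt + O / R
    # Case 2: the server idles after each of the first K-1 windows.
    return 2 * rtt + O / R + (K - 1) * stall


print(fixed_window_latency(O=80_000, S=8_000, R=1e6, rtt=0.1, W=4))
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, rtt=0.1, W=20))
```

With W = 4 the server stalls after each of the first two windows (case 2); with W = 20 the window is large enough that only 2RTT + O/R remains (case 1).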


                                                                                                                                                            TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


                                                                                                                                                            TCP Latency Modeling Slow Start (2)

[Figure: timing diagram with time at client and time at server. The client initiates the TCP connection and requests the object; the server sends a first window (S/R), second window (2S/R), third window (4S/R), and fourth window (8S/R); transmission completes and the object is delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
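The example values can be reproduced by computing K and Q directly from their definitions and then applying the latency formula. In the sketch below, the link rate, segment size and RTT are assumed values chosen so that Q = 2; only O/S = 15 comes from the example. The function name `slow_start_latency` is hypothetical.

```python
import math


def slow_start_latency(O, S, R, rtt):
    """Return (latency, K, Q, P) for an object of O bits under pure slow start."""
    segments = math.ceil(O / S)

    # K: number of windows of size 1, 2, 4, ... needed to cover the object.
    K, covered = 0, 0
    while covered < segments:
        covered += 2 ** K
        K += 1

    # Q: number of windows after which the server would idle for an infinite object,
    # i.e. windows k with S/R + RTT - 2^(k-1) * S/R > 0.
    Q = 0
    while S / R + rtt - (2 ** Q) * S / R > 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# Assumed numbers: S/R = 0.1 s and RTT = 0.2 s, object of 15 segments.
latency, K, Q, P = slow_start_latency(O=15_000, S=1_000, R=10_000, rtt=0.2)
print(f"K = {K}, Q = {Q}, P = {P}, latency = {latency:.2f} s")
```

For these inputs the function yields K = 4, Q = 2 and P = 2, matching the example.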


                                                                                                                                                            TCP Latency Modeling (3)

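$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$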

                                                                                                                                                            window kth the transmit totime2 1 =minus

                                                                                                                                                            RSk

                                                                                                                                                            RTT

                                                                                                                                                            initiate TCPconnection

                                                                                                                                                            requestobject

                                                                                                                                                            first window= SR

                                                                                                                                                            second window= 2SR

                                                                                                                                                            third window= 4SR

                                                                                                                                                            fourth window= 8SR

                                                                                                                                                            completetransmissionobject

                                                                                                                                                            delivered

                                                                                                                                                            time atclient

                                                                                                                                                            time atserver
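The delay formula above can be checked numerically by adding up the idle time after each window. The following is a minimal sketch, not part of the original slides: the function name `slow_start_delay` and its parameter names are assumptions, K (the number of windows covering the object) is passed in directly, and the idealized model of the slides is assumed (no loss, window doubling every round).

```python
def slow_start_delay(O, S, R, RTT, K):
    """Latency in seconds to fetch an O-bit object under idealized slow start.

    O: object size (bits), S: segment size (bits), R: link rate (bits/s),
    RTT: round-trip time (s), K: number of windows that cover the object.
    All names are illustrative assumptions, not taken from the slides.
    """
    # 2 RTT for connection setup and request, plus the raw transmission time.
    delay = 2 * RTT + O / R
    # The server can stall only after windows 1..K-1 (nothing follows window K).
    for k in range(1, K):
        window_time = 2 ** (k - 1) * S / R   # time to transmit the kth window
        idle = S / R + RTT - window_time     # wait until the first ACK returns
        delay += max(idle, 0.0)              # the [.]^+ operator: no negative idle
    return delay


# Example: S/R = 0.1 s, RTT = 0.2 s, object of 7 segments (K = 3 windows).
example = slow_start_delay(O=7000, S=1000, R=10000, RTT=0.2, K=3)
```

In this example the idle times are 0.2 s after the first window and 0.1 s after the second, so the total is 0.4 + 0.7 + 0.3 = 1.4 s, which agrees with the closed form for P = 2.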


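TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left( \frac{O}{S} + 1 \right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.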
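The derivation of K can be cross-checked by comparing the defining search against the closed form. This is a small sketch under stated assumptions: both function names are hypothetical, O and S are positive and in the same units, and the closed form relies on floating-point `math.log2`, which could round badly for extremely large O/S.

```python
import math


def windows_needed(O, S):
    """Smallest k such that S * (2**0 + ... + 2**(k-1)) >= O, by direct search."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S   # window k+1 carries 2**k segments
        k += 1
    return k


def windows_needed_closed_form(O, S):
    """Closed form K = ceil(log2(O/S + 1)) from the last line of the derivation."""
    return math.ceil(math.log2(O / S + 1))


# An object of exactly 7 segments fits in windows of 1 + 2 + 4 segments.
k_exact = windows_needed(7000, 1000)
# One extra bit forces a fourth window.
k_over = windows_needed(7001, 1000)
```

The direct search is the safer choice when exact integer arithmetic matters; the closed form is convenient for quick estimates.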


HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
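The three response-time expressions can be compared side by side with a few lines of code. This is a rough sketch rather than the model used to produce the course's plots: the function name and dictionary keys are assumptions, and the "sum of idle times" term is deliberately left out, so each value is a lower bound on the modeled response time.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds for the three HTTP variants, ignoring idle times.

    O: size of each object (bits), R: link rate (bits/s), RTT: round-trip time (s),
    M: number of images, X: number of parallel connections (M/X assumed integer).
    Slow-start idle times are NOT included; this is an assumption of the sketch.
    """
    transmit = (M + 1) * O / R   # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * RTT,
    }


# Example: 40,000-bit objects, 1 Mbps link, RTT = 100 ms, M = 10, X = 5.
times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
```

With these inputs the transmission term is 0.44 s for all three variants, so the differences come entirely from the RTT terms (2.2 s, 0.3 s, and 0.6 s respectively).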


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.
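A quick calculation at the lowest link rate illustrates why transmission time dominates here. It assumes 1 Kbyte = 1000 bytes (so O = 40,000 bits) and 28 Kbps = 28,000 bits/s, and it ignores slow-start idle times; none of these conventions is stated on the slide.

$$
\begin{aligned}
\text{transmission: } (M+1)\frac{O}{R} &= 11 \times \frac{40{,}000}{28{,}000} \approx 15.7\ \text{s} \\
\text{non-persistent: } (M+1)\,2\,RTT &= 11 \times 0.2 = 2.2\ \text{s} \\
\text{persistent: } 3\,RTT &= 0.3\ \text{s} \\
\text{parallel non-persistent: } (M/X + 1)\,2\,RTT &= 3 \times 0.2 = 0.6\ \text{s}
\end{aligned}
$$

Under these assumptions the 15.7 s transmission term is common to all three variants, and persistent connections save only about 0.3 s relative to parallel connections.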


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 70) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.
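The same kind of estimate at the highest link rate shows the shift. It assumes O = 40,000 bits and 10 Mbps = 10^7 bits/s, and again leaves out slow-start idle times, so the real gaps would be somewhat larger.

$$
\begin{aligned}
\text{transmission: } (M+1)\frac{O}{R} &= 11 \times \frac{40{,}000}{10^{7}} = 0.044\ \text{s} \\
\text{non-persistent: } (M+1)\,2\,RTT &= 11 \times 2 = 22\ \text{s} \\
\text{persistent: } 3\,RTT &= 3\ \text{s} \\
\text{parallel non-persistent: } (M/X + 1)\,2\,RTT &= 3 \times 2 = 6\ \text{s}
\end{aligned}
$$

Here the transmission term is negligible and the RTT terms account for nearly all of the response time, which is where persistent connections pay off.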


Chapter 3: Summary

Principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

Instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                            • Transport services and protocols
                                                                                                                                                            • Transport vs network layer
                                                                                                                                                            • Transport-layer protocols
                                                                                                                                                            • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                                            • How demultiplexing works
                                                                                                                                                            • Connectionless demultiplexing
                                                                                                                                                            • Connectionless demux (cont)
                                                                                                                                                            • Connection-oriented demux
                                                                                                                                                            • Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
                                                                                                                                                            • Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
                                                                                                                                                            • UDP checksum
                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                            • Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
                                                                                                                                                            • Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                                                                                            • Pipelined protocols
                                                                                                                                                            • Pipelined protocols
• Pipelining: increased utilization
                                                                                                                                                            • Go-Back-N
                                                                                                                                                            • GBN Sender
                                                                                                                                                            • GBN sender extended FSM
                                                                                                                                                            • GBN receiver extended FSM
                                                                                                                                                            • More on receiver
• GBN in action
                                                                                                                                                            • Selective Repeat
• Selective repeat: sender, receiver windows
                                                                                                                                                            • Selective repeat
                                                                                                                                                            • Selective repeat in action
                                                                                                                                                            • Selective repeat dilemma
                                                                                                                                                            • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                            • More TCP Details
                                                                                                                                                            • Even More TCP Details
                                                                                                                                                            • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                            • Example RTT estimation
                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                            • TCP reliable data transfer
                                                                                                                                                            • TCP sender events
• TCP sender (simplified)
                                                                                                                                                            • TCP retransmission scenarios
                                                                                                                                                            • TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                            • More on Sender Policies
                                                                                                                                                            • Fast Retransmit
                                                                                                                                                            • Fast retransmit algorithm
                                                                                                                                                            • TCP GBN or Selective Repeat
                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                            • TCP Flow Control
                                                                                                                                                            • TCP Flow Control
                                                                                                                                                            • TCP segment structure
• TCP Flow control: how it works
                                                                                                                                                            • Technical Issue
                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                            • TCP Connection Management
                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                            • A few special cases
                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                            • Principles of Congestion Control
                                                                                                                                                             • Causes/costs of congestion: scenario 1
                                                                                                                                                             • Causes/costs of congestion: scenario 2
                                                                                                                                                             • Causes/costs of congestion: scenario 3
                                                                                                                                                             • Causes/costs of congestion: scenario 3
                                                                                                                                                            • Approaches towards congestion control
                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                            • TCP Congestion Control
                                                                                                                                                            • TCP AIMD
                                                                                                                                                            • TCP Slow Start
                                                                                                                                                            • TCP Slow Start (more)
                                                                                                                                                            • Summary TCP Congestion Control
                                                                                                                                                            • The Big Picture
                                                                                                                                                            • TCP sender congestion control
                                                                                                                                                            • TCP throughput
                                                                                                                                                            • TCP Futures
                                                                                                                                                            • TCP Fairness
                                                                                                                                                            • Why is TCP fair
                                                                                                                                                            • Fairness (more)
                                                                                                                                                            • TCP Latency Modeling
                                                                                                                                                            • Fixed Congestion Window (W)
                                                                                                                                                            • Fixed congestion window (1)
                                                                                                                                                            • Fixed congestion window (2)
                                                                                                                                                            • TCP Latency Modeling Slow Start (1)
                                                                                                                                                            • TCP Latency Modeling Slow Start (2)
                                                                                                                                                            • TCP Latency Modeling (3)
                                                                                                                                                            • TCP Latency Modeling (4)
                                                                                                                                                            • HTTP Modeling
                                                                                                                                                            • Chapter 3 Summary


TCP Flow Control

Sender should not overwhelm the receiver's capacity to receive data.
If necessary, the sender should slow down its transmission rate to accommodate the receiver's rate.
Different from congestion control, whose purpose is to handle congestion in the network. (But both congestion control and flow control work by slowing down data transmission.)


TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer

speed-matching service: matching the send rate to the receiving app's drain rate

app process may be slow at reading from buffer


TCP segment structure

Segment layout, each row 32 bits wide:

source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | receive window
checksum | urg data pointer
options (variable length)
application data (variable length)

Field notes:

sequence number, acknowledgement number: counting by bytes of data (not segments)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection establishment (setup, teardown commands)
receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
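The layout above can be sketched as a small decoder. This is only an illustration, not part of the slides: the class name `TcpHeaderSketch`, its field names and the `sampleSyn()` test segment are made up for the example, and it assumes the standard offsets of the fixed 20-byte header (ports at bytes 0-3, flags in byte 13, receive window at bytes 14-15).

```java
import java.nio.ByteBuffer;

/** Illustrative only: decodes the fixed 20-byte part of a TCP header. */
final class TcpHeaderSketch {
    final int sourcePort, destPort, headerLenBytes, receiveWindow, checksum, urgDataPointer;
    final long sequenceNumber, ackNumber;   // unsigned 32-bit values held in longs
    final boolean urg, ack, psh, rst, syn, fin;

    TcpHeaderSketch(byte[] segment) {
        // ByteBuffer reads big-endian (network byte order) by default.
        ByteBuffer buf = ByteBuffer.wrap(segment);
        sourcePort     = buf.getShort(0) & 0xFFFF;
        destPort       = buf.getShort(2) & 0xFFFF;
        sequenceNumber = buf.getInt(4) & 0xFFFFFFFFL;
        ackNumber      = buf.getInt(8) & 0xFFFFFFFFL;
        headerLenBytes = ((buf.get(12) >> 4) & 0x0F) * 4;  // head len counts 32-bit words
        int flags      = buf.get(13);
        urg = (flags & 0x20) != 0;
        ack = (flags & 0x10) != 0;
        psh = (flags & 0x08) != 0;
        rst = (flags & 0x04) != 0;
        syn = (flags & 0x02) != 0;
        fin = (flags & 0x01) != 0;
        receiveWindow  = buf.getShort(14) & 0xFFFF;        // # bytes rcvr willing to accept
        checksum       = buf.getShort(16) & 0xFFFF;
        urgDataPointer = buf.getShort(18) & 0xFFFF;
    }

    /** Hypothetical SYN header used only to exercise the decoder (checksum left at zero). */
    static byte[] sampleSyn() {
        return ByteBuffer.allocate(20)
                .putShort((short) 49152)  // source port
                .putShort((short) 80)     // dest port
                .putInt(1000)             // sequence number
                .putInt(0)                // acknowledgement number (not valid, ACK flag is off)
                .put((byte) 0x50)         // head len = 5 words, no options
                .put((byte) 0x02)         // only SYN set
                .putShort((short) 8192)   // receive window
                .putShort((short) 0)      // checksum, not computed here
                .putShort((short) 0)      // urg data pointer
                .array();
    }

    public static void main(String[] args) {
        TcpHeaderSketch h = new TcpHeaderSketch(sampleSyn());
        System.out.println("ports " + h.sourcePort + " -> " + h.destPort
                + ", seq " + h.sequenceNumber + ", SYN " + h.syn
                + ", window " + h.receiveWindow);
    }
}
```

The masks (`& 0xFFFF`, `& 0xFFFFFFFFL`) are there because Java has no unsigned short or int; without them large ports and sequence numbers would print as negative values.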


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
guarantees receive buffer doesn't overflow
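A minimal sketch of the two rules above. The receiver-side formula is the one on the slide; the sender-side check uses `lastByteSent` and `lastByteAcked` as names for "unACKed data", which are assumptions for this example, as is the class name `FlowControlSketch`.

```java
/** Illustrative only: the RcvWindow bookkeeping on each side of the connection. */
final class FlowControlSketch {

    /** Receiver side: spare room = RcvBuffer - [LastByteRcvd - LastByteRead]. */
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Sender side: unACKed data (including what we want to send now) must fit in RcvWindow. */
    static boolean senderMaySend(long lastByteSent, long lastByteAcked,
                                 long bytesToSend, long rcvWindow) {
        long unAcked = lastByteSent - lastByteAcked;
        return unAcked + bytesToSend <= rcvWindow;
    }

    public static void main(String[] args) {
        // 4096-byte buffer, 3000 bytes received so far, app has read the first 1000.
        long window = rcvWindow(4096, 3000, 1000);
        System.out.println("advertised RcvWindow = " + window);
        System.out.println("may send 1000 more bytes: " + senderMaySend(5000, 4000, 1000, window));
        System.out.println("may send 1200 more bytes: " + senderMaySend(5000, 4000, 1200, window));
    }
}
```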


Technical Issue

Suppose RcvWindow=0 and that the receiver has already ACK'd ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow>0.
Receiver never sends RcvWindow>0, since it has no new ACKs to send to the sender.
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow=0, just to keep receiving ACKs from the receiver (B). At some point the receiver's buffer will drain and RcvWindow>0 will be transmitted back to the sender.
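The probing rule can be sketched as a tiny simulation. Everything here is an assumption made for illustration: the class and method names, the idea of fixed probe intervals, and the simplification that the probe's own data byte is not counted against the buffer.

```java
/** Illustrative only: one-byte probes sent while the advertised RcvWindow is 0. */
final class ZeroWindowProbeSketch {

    /**
     * bytesReadPerInterval[i] is how much the receiving app reads before probe i+1 arrives.
     * Returns the number of probes sent until an ACK carries RcvWindow > 0,
     * or -1 if the window never opens within the simulated intervals.
     */
    static int probesUntilWindowOpens(int rcvBuffer, int bytesBuffered, int[] bytesReadPerInterval) {
        int buffered = bytesBuffered;
        for (int i = 0; i < bytesReadPerInterval.length; i++) {
            buffered -= Math.min(bytesReadPerInterval[i], buffered); // app drains the buffer
            int rcvWindow = rcvBuffer - buffered;                    // value carried in the ACK
            if (rcvWindow > 0) {
                return i + 1;                                        // sender learns it may resume
            }
            // Window still 0: the ACK repeats RcvWindow=0 and the sender probes again.
        }
        return -1;
    }

    public static void main(String[] args) {
        // Full 4096-byte buffer; the app reads nothing for two intervals, then 512 bytes.
        int probes = probesUntilWindowOpens(4096, 4096, new int[] {0, 0, 512});
        System.out.println("probes sent before window opened: " + probes);
    }
}
```

Without the probes there would be no segment for the receiver to ACK, so the updated window would never reach the sender.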


Note on UDP

UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
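A minimal model of that behaviour, assuming a byte-limited receive buffer. The class `UdpReceiveBufferSketch` and its methods are hypothetical and only mimic what the slide describes; they are not a real socket API.

```java
import java.util.ArrayDeque;

/** Illustrative only: a bounded receive buffer that drops datagrams when full. */
final class UdpReceiveBufferSketch {
    private final ArrayDeque<byte[]> queue = new ArrayDeque<>();
    private final int capacityBytes;
    private int usedBytes = 0;
    private int dropped = 0;

    UdpReceiveBufferSketch(int capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Called when a datagram arrives; returns false if it was lost. */
    boolean deliver(byte[] datagram) {
        if (usedBytes + datagram.length > capacityBytes) {
            dropped++;              // no flow control: the sender is never told to slow down
            return false;
        }
        queue.addLast(datagram);
        usedBytes += datagram.length;
        return true;
    }

    /** Called by the application; returns null if nothing is buffered. */
    byte[] read() {
        byte[] datagram = queue.pollFirst();
        if (datagram != null) {
            usedBytes -= datagram.length;
        }
        return datagram;
    }

    int dropped() {
        return dropped;
    }

    public static void main(String[] args) {
        UdpReceiveBufferSketch buffer = new UdpReceiveBufferSketch(1000);
        System.out.println("first 600-byte datagram kept: " + buffer.deliver(new byte[600]));
        System.out.println("second 600-byte datagram kept: " + buffer.deliver(new byte[600]));
        System.out.println("datagrams lost: " + buffer.dropped());
    }
}
```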



TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
seq. #s
buffers, flow control info (e.g. RcvWindow)

client: connection initiator
Socket clientSocket = new Socket(hostname, portNumber);

server: contacted by client
Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
specifies client_isn, the initial seq #
no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
ACKs received SYN
allocates buffers
replies with client_isn+1 in ACK field to signal synchronization
specifies server_isn
no application data


TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and ack=server_isn+1
allocates buffers
can include application data

SYN=0 signals that the connection is established
server_isn+1 signals that the client is synchronized

Message sequence:

client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
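The numbering in the three segments can be sketched as follows. The `Segment` class and the method names are made up for this example; the only rules encoded are the ones listed in Steps 1 to 3, plus the assumption that sequence numbers wrap around at 32 bits.

```java
/** Illustrative only: seq/ack values exchanged during the three way handshake. */
final class HandshakeSketch {

    static final class Segment {
        final boolean syn;
        final boolean ackValid;
        final long seq;
        final long ack;

        Segment(boolean syn, boolean ackValid, long seq, long ack) {
            this.syn = syn;
            this.ackValid = ackValid;
            this.seq = seq;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " seq=" + seq + (ackValid ? " ack=" + ack : "");
        }
    }

    /** Sequence numbers are 32 bits wide, so +1 wraps around. */
    private static long plusOne(long n) {
        return (n + 1) & 0xFFFFFFFFL;
    }

    /** Step 1: client sends SYN carrying client_isn. */
    static Segment syn(long clientIsn) {
        return new Segment(true, false, clientIsn, 0);
    }

    /** Step 2: server replies with SYNACK: its own isn, and client_isn+1 in the ACK field. */
    static Segment synAck(Segment syn, long serverIsn) {
        return new Segment(true, true, serverIsn, plusOne(syn.seq));
    }

    /** Step 3: client replies with SYN=0, seq=client_isn+1, ack=server_isn+1. */
    static Segment ack(Segment synAck) {
        return new Segment(false, true, synAck.ack, plusOne(synAck.seq));
    }

    public static void main(String[] args) {
        Segment s1 = syn(100);          // client_isn = 100 (arbitrary)
        Segment s2 = synAck(s1, 300);   // server_isn = 300 (arbitrary)
        Segment s3 = ack(s2);
        System.out.println("client -> server: " + s1);
        System.out.println("server -> client: " + s2);
        System.out.println("client -> server: " + s3);
    }
}
```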


TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

Figure (message sequence): client closes and sends FIN; server replies with ACK, then closes and sends its own FIN; client replies with ACK and stays in timed wait before it is closed.


TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

Figure (message sequence): FIN and ACK from each side, with client and server marked closing; the client passes through timed wait, and both end up closed.


TCP Connection Management (cont)

Figure: example TCP server lifecycle

Figure: example TCP client lifecycle
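A minimal sketch of the client side of such a lifecycle, following the setup and teardown steps listed earlier. The state names (SYN_SENT, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) are the standard TCP names and are an assumption here, since the slide text does not spell them out; the class, the event names and `run` are hypothetical.

```java
/** Illustrative only: client-side connection lifecycle as a small state machine. */
final class ClientLifecycleSketch {
    enum State { CLOSED, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT }
    enum Event { APP_OPEN, RCV_SYNACK, APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRES }

    static State next(State state, Event event) {
        switch (state) {
            case CLOSED:        // app opens a connection: send SYN
                if (event == Event.APP_OPEN) return State.SYN_SENT;
                break;
            case SYN_SENT:      // SYNACK arrives: send ACK
                if (event == Event.RCV_SYNACK) return State.ESTABLISHED;
                break;
            case ESTABLISHED:   // app closes the socket: send FIN
                if (event == Event.APP_CLOSE) return State.FIN_WAIT_1;
                break;
            case FIN_WAIT_1:    // server ACKs our FIN
                if (event == Event.RCV_ACK) return State.FIN_WAIT_2;
                break;
            case FIN_WAIT_2:    // server's FIN arrives: send ACK, start timed wait
                if (event == Event.RCV_FIN) return State.TIME_WAIT;
                break;
            case TIME_WAIT:     // a repeated FIN is ACKed again; then the wait expires
                if (event == Event.RCV_FIN) return State.TIME_WAIT;
                if (event == Event.TIMED_WAIT_EXPIRES) return State.CLOSED;
                break;
        }
        throw new IllegalStateException(event + " not handled in state " + state);
    }

    /** Applies the events in order, starting from CLOSED. */
    static State run(Event... events) {
        State state = State.CLOSED;
        for (Event event : events) {
            state = next(state, event);
        }
        return state;
    }

    public static void main(String[] args) {
        State end = run(Event.APP_OPEN, Event.RCV_SYNACK, Event.APP_CLOSE,
                        Event.RCV_ACK, Event.RCV_FIN, Event.TIMED_WAIT_EXPIRES);
        System.out.println("final state: " + end);
    }
}
```

The sketch leaves out the special cases (simultaneous close, ACK and FIN in one segment), which would need extra transitions.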


A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


                                                                                                                                                              3 Transport Layer 93Comp 361 Spring 2005

Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load (original data plus retransmissions).

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted.


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
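
The RM-cell handling described on these two slides can be sketched roughly as follows. This is a minimal Python illustration, not the real ATM cell layout: the `RMCell` class, the function names, and the use of plain numbers for rates are all hypothetical and introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical stand-in for a resource-management cell (not the real ATM layout)."""
    er: float          # explicit rate carried in the cell
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit


def switch_process(cell: RMCell, supportable_rate: float, congested: bool) -> RMCell:
    """A congested switch may lower ER to what it can support; ER is never raised."""
    if congested:
        cell.er = min(cell.er, supportable_rate)
    return cell


def destination_return(cell: RMCell, last_data_cell_efci: int) -> RMCell:
    """Destination sets CI=1 if the data cell preceding the RM cell had EFCI=1."""
    if last_data_cell_efci == 1:
        cell.ci = True
    return cell


# An RM cell crosses three switches; each tuple is (supportable rate, congested?).
cell = RMCell(er=100.0)
for rate, congested in [(80.0, True), (120.0, False), (60.0, True)]:
    cell = switch_process(cell, rate, congested)
cell = destination_return(cell, last_data_cell_efci=1)
print(cell.er, cell.ci)
```

Because each congested switch can only lower ER, the value that returns to the sender is the minimum supportable rate along the path, which is the point made on the slide.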


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
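
As a rough numeric illustration of the throughput formula above, the following Python sketch computes the rate for one window. The function name and the sample values (w = 10, MSS = 1460 bytes, RTT = 100 ms) are assumptions chosen for illustration, not figures from the slides.

```python
def window_throughput(w: int, mss_bytes: int, rtt_sec: float) -> float:
    """Bytes per second when w segments of mss_bytes each are sent per RTT."""
    return w * mss_bytes / rtt_sec


# Hypothetical values: 10 segments of 1460 bytes every 100 ms.
rate = window_throughput(w=10, mss_bytes=1460, rtt_sec=0.1)
print(f"{rate:.0f} bytes/sec")
```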


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
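
The window check and the loss-event rule above might look like this in code. It is only a minimal sketch: the function names are hypothetical, and, as on the slide, the receive buffer is assumed large enough to be ignored.

```python
def can_send(last_byte_sent: int, last_byte_acked: int, cong_win: int, nbytes: int) -> bool:
    """Allow a send only if unacknowledged data stays within CongWin afterwards."""
    in_flight_after = (last_byte_sent + nbytes) - last_byte_acked
    return in_flight_after <= cong_win


def is_loss_event(timed_out: bool, dup_ack_count: int) -> bool:
    """A loss event is a timeout or three duplicate ACKs."""
    return timed_out or dup_ack_count >= 3


# 2000 bytes already in flight, CongWin = 4000 bytes (illustrative numbers).
print(can_send(3000, 1000, 4000, 1000))   # 3000 bytes in flight afterwards
print(can_send(3000, 1000, 4000, 2500))   # 4500 bytes in flight afterwards
```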


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
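
The sawtooth shape of AIMD can be reproduced with a few lines of Python. This is a sketch under a strong simplifying assumption that is not in the slides: a loss event is assumed to occur whenever the window reaches a fixed size (`loss_at_mss`). The function and parameter names are hypothetical.

```python
from typing import List


def aimd_trace(start_mss: float, loss_at_mss: float, rtts: int) -> List[float]:
    """CongWin (in MSS) per RTT: +1 MSS per RTT, halved when it reaches loss_at_mss."""
    cong_win = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:   # assumed: a loss event occurs at this window size
            cong_win = cong_win / 2   # multiplicative decrease
        else:
            cong_win += 1             # additive increase
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rtts=12))
```

The trace climbs by one MSS per RTT and drops to half at each assumed loss, which is the pattern the figure shows for a long-lived connection.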


TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
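
The 20 kbps figure in the example follows from sending one MSS per RTT. A worked version of the arithmetic, using only the values given above:

```latex
\[
\text{initial rate} = \frac{\text{MSS}}{\text{RTT}}
= \frac{500\ \text{bytes} \times 8\ \text{bits/byte}}{0.2\ \text{s}}
= 20{,}000\ \text{bits/s} = 20\ \text{kbps}
\]
```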


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; over successive RTTs Host A sends one segment, then two segments, then four segments]
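
To see why a per-ACK increment doubles the window each RTT, here is a small Python sketch. It assumes one ACK per segment with no loss or delayed ACKs, and counts CongWin in units of MSS; the function name is hypothetical.

```python
def slow_start_windows(rtts: int) -> list:
    """CongWin (in MSS) at the start of each RTT when every ACK adds 1 MSS."""
    cong_win = 1
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks = cong_win          # assumed: one ACK per segment, none lost or delayed
        for _ in range(acks):
            cong_win += 1        # +1 MSS per ACK received
    return windows


print(slow_start_windows(5))  # prints [1, 2, 4, 8, 16]
```

A window of n segments produces n ACKs, each adding one MSS, so the window grows from n to 2n within one RTT.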


So far:
  Slow-Start ramps up exponentially
  followed by AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
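
The difference between Tahoe and Reno on a loss event can be summarized in one small function. This is a sketch of the rules as stated on these slides, with windows counted in MSS; the function name, the string labels, and the tuple return shape are hypothetical choices for illustration.

```python
def on_loss(cong_win: float, event: str, flavor: str = "reno"):
    """Return (new_cong_win, new_threshold, next_state), windows in MSS.

    event is "3dupack" or "timeout"; flavor is "reno" or "tahoe".
    """
    threshold = cong_win / 2
    if event == "3dupack" and flavor == "reno":
        # Reno: skip slow start (fast recovery) and continue in AIMD.
        return threshold, threshold, "congestion avoidance"
    # Timeout (either flavor), or any loss event in Tahoe: restart from 1 MSS.
    return 1, threshold, "slow start"


print(on_loss(16, "3dupack", "reno"))
print(on_loss(16, "3dupack", "tahoe"))
```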


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                              The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
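
The event/state table above maps fairly directly onto a small state machine. The following Python sketch is one possible rendering, not a real TCP implementation: the class name, the MSS value, and the default initial Threshold are assumptions (the slides only say the threshold starts "very large"), and the duplicate-ACK and triple-duplicate-ACK rows are folded into one method.

```python
MSS = 1460  # bytes; hypothetical value chosen for illustration


class RenoSender:
    """Sketch of the table's rules; CongWin and Threshold are kept in bytes."""

    def __init__(self, threshold: float = 65535.0):
        self.cong_win = float(MSS)
        self.threshold = threshold   # assumed starting value
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS                          # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)  # about +1 MSS per RTT

    def on_dup_ack(self) -> None:
        """Count duplicate ACKs; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, float(MSS))  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self) -> None:
        """Timeout: halve Threshold, restart from 1 MSS in slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = float(MSS)
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4 * MSS)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win / MSS)
```

With a Threshold of 4 MSS, the fourth new ACK pushes CongWin to 5 MSS, which exceeds the Threshold and moves the sender into congestion avoidance.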


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2·RTT)
  Average throughput: 0.75 W/RTT
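
The 0.75 factor comes from the sawtooth: with slow start ignored, the window climbs linearly from W/2 to W between losses, so the average throughput is the midpoint of the two extremes. A short derivation under that assumption:

```latex
\[
\text{average throughput}
= \frac{1}{2}\left(\frac{W}{2\,\text{RTT}} + \frac{W}{\text{RTT}}\right)
= \frac{3}{4}\,\frac{W}{\text{RTT}}
= 0.75\,\frac{W}{\text{RTT}}
\]
```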


TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate:

$\text{throughput} = \frac{1.22 \cdot \text{MSS}}{\text{RTT} \sqrt{L}}$

$L = 2 \cdot 10^{-10}$. Wow.
New versions of TCP for high-speed needed
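
The two numbers in this example can be checked with a few lines of Python. The sketch assumes the loss-rate relation on the slide is the commonly cited approximation throughput = 1.22 · MSS / (RTT · sqrt(L)); the variable names are hypothetical.

```python
mss_bits = 1500 * 8        # segment size in bits
rtt = 0.100                # seconds
target = 10e9              # desired throughput in bits per second

# Window needed to keep 10 Gbps in flight: W = throughput * RTT / MSS.
w = target * rtt / mss_bits

# Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to get the tolerable loss rate L.
loss = (1.22 * mss_bits / (rtt * target)) ** 2

print(f"W = {w:.0f} segments, L = {loss:.1e}")
```

Both results agree with the slide's rounded values: roughly 83,333 segments in flight and a loss rate of about 2 · 10^-10.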


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
- Additive increase gives a slope of 1 as throughput increases
- Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis running from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
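The convergence argument can be checked numerically. The sketch below is an illustration only, under simplifying assumptions that are not in the slides: both flows see losses at the same instant, each increases by one unit per round, and a loss occurs whenever the combined rate exceeds R. The function name `aimd_two_flows` and the starting values are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds=200):
    """Simulate two synchronized AIMD flows sharing one bottleneck link.

    x1, x2   -- starting throughputs of connections 1 and 2
    capacity -- bottleneck rate R (same units as x1 and x2)
    Returns the list of (x1, x2) points visited.
    """
    trace = [(x1, x2)]
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve their rate (multiplicative decrease).
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both flows add one unit (additive increase, slope 1).
            x1, x2 = x1 + 1, x2 + 1
        trace.append((x1, x2))
    return trace


trace = aimd_two_flows(x1=70.0, x2=10.0, capacity=100.0)
start_gap = abs(trace[0][0] - trace[0][1])
end_gap = abs(trace[-1][0] - trace[-1][1])
print(f"gap at start: {start_gap:.2f}, gap at end: {end_gap:.4f}")
```

Additive increase leaves the gap between the two rates unchanged, while each multiplicative decrease halves it, so the operating point drifts toward the equal-share line.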


Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11R/20)
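The arithmetic in the parallel-connection example can be reproduced with a short sketch. It assumes, as the fairness goal does, that every TCP connection on the link ends up with an equal share; the helper name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing, opened):
    """Fraction of the link rate R obtained by an app that opens `opened`
    connections on a link already carrying `existing` connections,
    assuming every connection gets an equal share."""
    return Fraction(opened, existing + opened)


print(app_share(9, 1))   # share of R with 1 new connection
print(app_share(9, 11))  # share of R with 11 new connections
```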


TCP Latency Modeling

Notation, assumptions
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- No retransmissions (no loss, no corruption)

Window size
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   (K is the number of windows that cover the object)
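The two cases translate directly into a small function. This is a sketch rather than anything from the slides: the name `fixed_window_latency` is hypothetical, and K is assumed to be the ceiling of O/(W·S), which the slides do not state explicitly for the fixed-window case.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an object with a fixed window of W segments.

    O -- object size in bits, S -- segment size (MSS) in bits,
    R -- link rate in bits/s, RTT -- round-trip time in seconds.
    """
    base = 2 * RTT + O / R  # connection setup and request, then transmission
    if W * S / R >= RTT + S / R:
        # Case 1: the first ACK returns before the window is used up,
        # so the server never stalls.
        return base
    # Case 2: the server stalls after every window except the last one.
    K = math.ceil(O / (W * S))  # assumed: number of windows covering the object
    stall = S / R + RTT - W * S / R
    return base + (K - 1) * stall


# Illustrative values only: 1000-bit segments, 1 Mbps link, 100 ms RTT.
print(fixed_window_latency(O=16_000, S=1_000, R=1_000_000, RTT=0.1, W=4))
print(fixed_window_latency(O=16_000, S=1_000, R=1_000_000, RTT=0.1, W=200))
```

With W = 4 the window drains long before the first ACK returns, so the second case applies; with W = 200 the server keeps sending without interruption.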


Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Server waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. Initiate TCP connection (one RTT), request object, then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times.


TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: slow-start timing diagram (time at client, time at server) with windows of S/R, 2S/R, 4S/R and 8S/R]
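The derivation can be cross-checked by summing the idle times window by window and comparing the result with the closed form. The sketch below assumes slow start with no threshold and no loss, as in the slides; the function names and the deliberately exaggerated numbers (S/R = 1 s, RTT = 2 s) are hypothetical, chosen so that O/S = 15, K = 4 and P = 2 as in the slide example.

```python
def slow_start_latency(O, S, R, RTT):
    """Latency (seconds) under slow start with no threshold and no loss.

    Returns (latency, K, P): K is the number of windows covering the
    object, P is the number of times the server actually idles.
    """
    # K: smallest k with S * (2^0 + ... + 2^(k-1)) = S * (2^k - 1) >= O.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1

    idle_total = 0.0
    P = 0
    # The server can only idle after windows 1 .. K-1, not after the last.
    for k in range(1, K):
        idle = S / R + RTT - (2 ** (k - 1)) * S / R
        if idle > 0:
            idle_total += idle
            P += 1

    return 2 * RTT + O / R + idle_total, K, P


def closed_form(O, S, R, RTT, P):
    """Closed-form latency expression for a given number of idle periods P."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


latency, K, P = slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2)
print(latency, K, P)
print(closed_form(15_000, 1_000, 1_000, 2, P))
```

Because the idle time shrinks as k grows, the positive idle periods are always the first P windows, which is why the direct sum and the closed form agree.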


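TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.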
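The slides leave the calculation of Q as "similar". One way it might go, following the same pattern as for K and using the expression S/R + RTT − 2^{k−1} S/R for the idle time after the kth window, is sketched below. This is a reconstruction rather than something taken from the slides, and it counts a window whose idle time is exactly zero, which adds nothing to the latency.

$$
\begin{aligned}
Q &= \max\left\{k : \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \ge 0\right\} \\
&= \max\left\{k : 2^{k-1} \le 1 + \frac{RTT}{S/R}\right\} \\
&= \max\left\{k : k \le \log_2\left(1 + \frac{RTT}{S/R}\right) + 1\right\} \\
&= \left\lfloor \log_2\left(1 + \frac{RTT}{S/R}\right) \right\rfloor + 1
\end{aligned}
$$

For instance, Q = 2 corresponds to S/R ≤ RTT < 3S/R.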


HTTP Modeling

Assume Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X integer
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
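The three response-time expressions can be compared side by side with a short sketch. The function name `http_response_times` is hypothetical, and two simplifications are assumed that are not in the slides: the slow-start idle term is a single optional parameter left at zero (in the model it differs per scheme), and 5 Kbytes is taken as 40,000 bits.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times (seconds) for the three HTTP schemes.

    O -- size of each object in bits, R -- link rate in bits/s,
    RTT -- round-trip time in seconds, M -- number of images,
    X -- number of parallel connections (M/X assumed to be an integer),
    idle -- sum of slow-start idle times; 0 here as a simplification.
    """
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT + idle,
    }


times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for scheme, seconds in times.items():
    print(f"{scheme}: {seconds:.2f} s")
```

Even without the idle-time term, the RTT multipliers, (M+1)·2 versus 3 versus (M/X + 1)·2, show where the three schemes diverge once transmission time is small.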


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


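HTTP Response time (in seconds)

[Figure: chart of HTTP response time (axis from 0 to 70 seconds) at access rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent connections. RTT = 1 sec, O = 5 Kbytes, M = 10, and X = 5.]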

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"



TCP Flow Control

flow control: sender won't overflow receiver's buffer by transmitting too much, too fast

receive side of TCP connection has a receive buffer
app process may be slow at reading from buffer

speed-matching service: matching the send rate to the receiving app's drain rate


TCP segment structure

[Figure: TCP segment layout, 32 bits wide, rows from top to bottom]
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flags U A P R S F, Receive window
  checksum, Urg data pnter
  Options (variable length)
  application data (variable length)

Annotations:
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
  sequence and acknowledgement numbers: counting by bytes of data (not segments)
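The field layout above can be made concrete with a short parsing sketch. This is an illustration only, not part of the original slides: the function names parse_tcp_segment and build_example_segment, the example port and sequence values, and the zero checksum are all assumptions chosen for the demonstration.

```python
import struct

# Fixed 20-byte TCP header in network byte order:
# source port, dest port, sequence number, acknowledgement number,
# head len + unused bits + flags, receive window, checksum, urgent data pointer.
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")

# Flag bits in the low six bits of the 16-bit head len/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_segment(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options, and application data."""
    if len(segment) < TCP_FIXED_HEADER.size:
        raise ValueError("segment shorter than the 20-byte fixed header")
    (src_port, dst_port, seq, ack, offset_flags,
     rcv_window, checksum, urg_ptr) = TCP_FIXED_HEADER.unpack_from(segment)
    head_len = (offset_flags >> 12) * 4  # head len field counts 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if offset_flags & bit}
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,                # byte-stream number of the first data byte
        "ack": ack,                # next byte expected from the other side
        "head_len": head_len,
        "flags": flags,
        "rcv_window": rcv_window,  # bytes the receiver is willing to accept
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "options": segment[TCP_FIXED_HEADER.size:head_len],
        "data": segment[head_len:],
    }


def build_example_segment() -> bytes:
    """Hypothetical segment: ACK+PSH set, no options, checksum left at 0 (not computed)."""
    header = TCP_FIXED_HEADER.pack(12345, 80, 1000, 2000,
                                   (5 << 12) | 0x18, 4096, 0, 0)
    return header + b"hello"


if __name__ == "__main__":
    print(parse_tcp_segment(build_example_segment()))
```

The sketch does not verify the checksum; it only shows where each field in the figure sits in the byte layout.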


TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
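The bookkeeping above can be sketched in a few lines of Python. This is a minimal illustration under the slide's assumption that out-of-order segments are discarded. The Receiver class and the sendable helper are hypothetical names, and the sender-side variables last_byte_sent and last_byte_acked are assumptions that do not appear on the slide.

```python
class Receiver:
    """Tracks the receive buffer of one TCP connection (in-order bytes only)."""

    def __init__(self, rcv_buffer: int):
        self.rcv_buffer = rcv_buffer  # total buffer size in bytes
        self.last_byte_rcvd = 0       # last byte placed into the buffer
        self.last_byte_read = 0       # last byte the app has read out

    def rcv_window(self) -> int:
        # RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
        return self.rcv_buffer - (self.last_byte_rcvd - self.last_byte_read)

    def deliver(self, nbytes: int) -> int:
        """Accept nbytes from the network; return the window to advertise."""
        if nbytes > self.rcv_window():
            raise OverflowError("sender violated the advertised window")
        self.last_byte_rcvd += nbytes
        return self.rcv_window()

    def app_read(self, nbytes: int) -> int:
        """The application drains up to nbytes from the buffer."""
        n = min(nbytes, self.last_byte_rcvd - self.last_byte_read)
        self.last_byte_read += n
        return n


def sendable(last_byte_sent: int, last_byte_acked: int, rcv_window: int) -> int:
    """Bytes the sender may still transmit while keeping unACKed data <= RcvWindow."""
    return max(0, rcv_window - (last_byte_sent - last_byte_acked))


if __name__ == "__main__":
    r = Receiver(rcv_buffer=4096)
    print(r.deliver(3000))   # window shrinks as data arrives
    r.app_read(1000)
    print(r.rcv_window())    # window grows again as the app reads
```

As long as the sender consults sendable before each transmission, deliver never raises, which is the no-overflow guarantee stated above.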


                                                                                                                                                                Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACKed ALL packets in its buffer
Sender does not transmit new packets until it hears RcvWindow > 0
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender
DEADLOCK

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
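
A toy simulation of the one-byte probes, in Python, may make the fix concrete. It is a sketch under simplifying assumptions: time advances in discrete ticks, one probe is sent per tick, and `reads_per_tick` (a hypothetical input) says how many bytes the receiving application reads in each tick.

```python
def first_open_window(rcv_buffer, reads_per_tick):
    """Return (tick, RcvWindow) for the first ACK that advertises RcvWindow > 0.

    The receive buffer starts full, so the advertised window starts at 0.
    Returns None if the window never opens during the simulated ticks.
    """
    buffered = rcv_buffer
    for tick, nread in enumerate(reads_per_tick, start=1):
        # The receiving application drains some bytes from the buffer.
        buffered = max(0, buffered - nread)
        # The sender's one-byte probe arrives; it is stored only if there is room.
        if buffered < rcv_buffer:
            buffered += 1
        # Every probe elicits an ACK that carries the current window.
        window = rcv_buffer - buffered
        if window > 0:
            return tick, window
    return None


print(first_open_window(10, [0, 0, 4]))  # the app reads 4 bytes in the third tick
print(first_open_window(10, [0, 0]))     # the app never reads: window stays closed
```

Without the probes there would be no segment for the receiver to ACK, and so no way for the updated window to reach the sender.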


                                                                                                                                                                Note on UDP

                                                                                                                                                                UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
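
The drop-on-full behaviour can be pictured with a small Python model. This is a sketch, not how any operating system implements sockets: the class name is hypothetical, and capacity is counted in datagrams for simplicity, whereas real socket buffers are sized in bytes.

```python
from collections import deque


class UdpReceiveBuffer:
    """Toy model of a receiving socket's buffer with no flow control."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumption: measured in datagrams
        self.queue = deque()
        self.dropped = 0

    def deliver(self, datagram):
        """Append an arriving datagram, or drop it silently if the buffer is full."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1  # the sender is never told
            return False
        self.queue.append(datagram)
        return True


buf = UdpReceiveBuffer(capacity=2)
results = [buf.deliver(d) for d in (b"a", b"b", b"c")]
print(results, buf.dropped)
```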


                                                                                                                                                                TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


                                                                                                                                                                TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

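[Figure: three-way handshake timeline between client and server]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)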
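
The field values in the three segments can be written out as a short Python sketch. The dictionary layout and the function name are assumptions made for illustration; real segments carry many more header fields. Sequence numbers are 32 bits wide, so the "+1" wraps around modulo 2^32.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit values


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dicts of SYN, seq and ack values."""
    # Step 1: the client's SYN carries its initial sequence number.
    syn = {"SYN": 1, "seq": client_isn}
    # Step 2: the server's SYNACK carries server_isn and acknowledges client_isn + 1.
    synack = {"SYN": 1, "seq": server_isn, "ack": (syn["seq"] + 1) % SEQ_SPACE}
    # Step 3: the client's ACK has SYN=0 and acknowledges server_isn + 1.
    ack = {
        "SYN": 0,
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (synack["seq"] + 1) % SEQ_SPACE,
    }
    return [syn, synack, ack]


# Hypothetical initial sequence numbers, chosen only for readability.
for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```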


                                                                                                                                                                TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline for closing a connection. Labels: close, FIN, ACK, close, FIN, ACK, timed wait, closed]


                                                                                                                                                                TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if the ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline for closing a connection. Labels: closing, FIN, ACK, closing, FIN, ACK, timed wait, closed (client), closed (server)]
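
The four steps, and the reason for the timed wait, can be traced with a small Python sketch. The function and its `final_ack_lost` switch are hypothetical names introduced for this example. The sketch assumes that a server which gets no ACK for its FIN resends the FIN after a timeout.

```python
def close_trace(final_ack_lost=False):
    """Return the segments exchanged while closing, as (sender, segment) pairs.

    When final_ack_lost is True, the client's ACK of the server's FIN is lost
    once, so the server resends FIN and the client, still in timed wait,
    ACKs it again.
    """
    trace = [
        ("client", "FIN"),  # Step 1: client closes, sends FIN
        ("server", "ACK"),  # Step 2: server ACKs the FIN ...
        ("server", "FIN"),  #         ... then closes and sends its own FIN
        ("client", "ACK"),  # Step 3: client ACKs, enters timed wait
    ]
    if final_ack_lost:
        trace.append(("server", "FIN"))  # server timed out waiting for the ACK
        trace.append(("client", "ACK"))  # timed wait lets the client answer again
    return trace  # Step 4: server receives ACK; both ends are closed


print(close_trace())
print(close_trace(final_ack_lost=True))
```

If the client closed immediately instead of waiting, the resent FIN in the second trace would find no one left to acknowledge it.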


                                                                                                                                                                TCP Connection Management (cont)

Example: TCP server lifecycle

Example: TCP client lifecycle


                                                                                                                                                                A few special cases

                                                                                                                                                                Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                                                                It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                                                                                                                Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


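Here $\lambda_{in}$ is the rate of original data sent and $\lambda'_{in}$ is the offered load (original data plus retransmissions).

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels labelled (a), (b), (c)]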


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when packet dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                                                                Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
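
How the ER field and the NI/CI bits evolve along a path can be sketched in Python. This is an illustrative model only: the class and field names are hypothetical, rates are plain numbers with no particular unit, and each switch is assumed to know a single supportable rate.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float         # explicit rate, initially the rate the sender asks for
    ni: bool = False  # no-increase bit (mild congestion)
    ci: bool = False  # congestion-indication bit (severe congestion)


@dataclass
class Switch:
    supportable_rate: float
    mild_congestion: bool = False
    severe_congestion: bool = False


def traverse(cell, switches, preceding_data_cell_efci=False):
    """Pass an RM cell through the switches, then apply the destination's rule."""
    for sw in switches:
        # A switch may only lower ER, so ER ends up as the minimum along the path.
        cell.er = min(cell.er, sw.supportable_rate)
        cell.ni = cell.ni or sw.mild_congestion
        cell.ci = cell.ci or sw.severe_congestion
    # Destination: if the data cell preceding the RM cell had EFCI=1, set CI=1.
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


path = [Switch(80.0), Switch(50.0, mild_congestion=True), Switch(90.0)]
print(traverse(RMCell(er=100.0), path))
```

The returned cell tells the sender both how fast it may send (ER) and whether it should hold or cut its rate (NI, CI).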


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \times \text{MSS}}{\text{RTT}}$ Bytes/sec
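
A quick worked instance of the throughput formula, in Python. The numbers are hypothetical and chosen only to keep the arithmetic easy: a window of 10 segments, an MSS of 1460 bytes and an RTT of half a second.

```python
def tcp_throughput(w, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w * mss_bytes / rtt_seconds


# 10 segments * 1460 bytes = 14600 bytes per RTT; with RTT = 0.5 s that is
# twice as many bytes per second.
print(tcp_throughput(w=10, mss_bytes=1460, rtt_seconds=0.5))
```

For a fixed MSS and RTT, the only quantity the sender controls is w, which is why congestion control works by adjusting the window.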


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
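
The sender-side window check and the "3 duplicate ACKs" loss signal can be sketched in Python. The function names are hypothetical. Timeouts, the other kind of loss event, are not modelled here.

```python
def within_cong_win(last_byte_sent, last_byte_acked, cong_win):
    """True if the unACKed bytes satisfy LastByteSent - LastByteAcked <= CongWin."""
    return last_byte_sent - last_byte_acked <= cong_win


def third_duplicate_ack(acks):
    """Return the index at which the third duplicate ACK arrives, or None.

    A duplicate is an ACK that repeats the acknowledgment number of the ACK
    immediately before it.
    """
    previous = None
    duplicates = 0
    for i, ack in enumerate(acks):
        if ack == previous:
            duplicates += 1
            if duplicates == 3:
                return i  # loss event: the sender should reduce CongWin
        else:
            previous = ack
            duplicates = 0
    return None


print(within_cong_win(last_byte_sent=9000, last_byte_acked=5000, cong_win=4000))
print(third_duplicate_ack([100, 200, 200, 200, 200]))
print(third_duplicate_ack([100, 200, 200, 300]))
```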


TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
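The AIMD behaviour described above can be traced with a short Python sketch. It is only an illustration under simplifying assumptions: the window is counted in whole MSS units, a loss is applied at the end of the RTT in which it occurs, and the function name `aimd_trace` and the loss schedule are hypothetical.

```python
def aimd_trace(rtts, loss_rtts, start=8):
    """Return the congestion window (in MSS) at the start of each RTT.

    rtts      -- number of round-trip times to simulate
    loss_rtts -- set of RTT indices in which a loss event occurs
    start     -- initial window in MSS (assumed value)
    """
    cong_win = start
    trace = []
    for t in range(rtts):
        trace.append(cong_win)
        if t in loss_rtts:
            cong_win = max(1, cong_win // 2)  # multiplicative decrease
        else:
            cong_win += 1                     # additive increase: +1 MSS per RTT
    return trace


# One loss in RTT 3 produces a single tooth of the sawtooth.
trace = aimd_trace(rtts=8, loss_rtts={3})
```

With these inputs the window climbs by one MSS per RTT, is halved after the loss, and then resumes the linear climb.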


TCP Slow Start

• When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec; initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
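The 20 kbps figure in the example follows from sending one MSS per RTT. A small Python check, with a hypothetical helper name `initial_rate_bps`:

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Rate in bits per second when exactly one MSS is sent per RTT."""
    return mss_bytes * 8 / rtt_seconds


# Slide example: MSS = 500 bytes, RTT = 200 msec.
rate = initial_rate_bps(mss_bytes=500, rtt_seconds=0.2)  # about 20 000 bits/s
```

500 bytes is 4000 bits, and 4000 bits every 0.2 seconds is 20 000 bits per second, i.e. 20 kbps.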


TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments, in successive RTTs]
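Why does adding one MSS per ACK double the window every RTT? A minimal Python sketch, assuming every segment sent in a round is acknowledged individually and nothing is lost (the function name `slow_start_rounds` is hypothetical):

```python
def slow_start_rounds(rounds):
    """Return the window size (in MSS) used in each of the first `rounds` RTTs."""
    cong_win = 1          # connection begins with CongWin = 1 MSS
    sizes = []
    for _ in range(rounds):
        sizes.append(cong_win)
        acks = cong_win   # assumption: one ACK comes back per segment sent
        for _ in range(acks):
            cong_win += 1  # increment CongWin by 1 MSS for every ACK received
    return sizes


sizes = slow_start_rounds(4)
```

Each round returns as many ACKs as segments were sent, so the per-ACK increment adds the current window to itself, which gives the one, two, four, eight pattern.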


So far
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                The Big Picture


TCP sender congestion control

Each entry lists: Event, State, TCP Sender Action, Commentary.

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
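The event table above maps naturally onto a small state machine. The Python class below is a toy sketch of those rules, not an actual TCP implementation: the class name `RenoSender`, the default values, and the choice to reset the duplicate-ACK counter on a new ACK or timeout are assumptions made for illustration. Window sizes are in bytes.

```python
class RenoSender:
    """Toy model of the TCP Reno congestion-control rules in the table."""

    def __init__(self, mss=1000, threshold=64_000):
        self.mss = mss
        self.cong_win = mss         # slow start begins at 1 MSS
        self.threshold = threshold  # stands in for "initially very large"
        self.state = "SS"           # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0           # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold  # fast recovery: skip slow start
            self.state = "CA"

    def on_timeout(self):
        """Timeout loss event: fall back to slow start."""
        self.threshold = max(self.cong_win / 2, self.mss)
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0
```

A short usage example: with `RenoSender(mss=1000, threshold=4000)`, four new ACKs grow the window from 1000 to 5000 bytes and move the sender into congestion avoidance; three duplicate ACKs then halve the window, and a timeout resets it to one MSS.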


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
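The 0.75 factor is the midpoint of a window that climbs linearly from W/2 back to W between losses. A quick numerical check in Python, assuming an idealized sawtooth sampled at evenly spaced points (the function name and the sample count are hypothetical):

```python
def average_sawtooth_throughput(w, rtt, steps=1000):
    """Average throughput over one sawtooth tooth, window rising from w/2 to w."""
    # Evenly spaced window sizes from w/2 up to w, inclusive.
    samples = [w / 2 + (w / 2) * i / steps for i in range(steps + 1)]
    average_window = sum(samples) / len(samples)
    return average_window / rtt


# Hypothetical values: W = 100 segments, RTT = 0.1 s.
avg = average_sawtooth_throughput(w=100, rtt=0.1)  # close to 0.75 * 100 / 0.1
```

The average of a linear ramp is the mean of its endpoints, (W/2 + W)/2 = 0.75 W, which matches the slide's result.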


TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• ⇒ $L = 2 \cdot 10^{-10}$ Wow
• New versions of TCP for high-speed needed
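The two numbers in this example can be reproduced from W = throughput × RTT / MSS and from the loss-rate relation throughput = 1.22 × MSS / (RTT × sqrt(L)), solved for L. The Python sketch below uses hypothetical function names and takes MSS in bytes, converting to bits internally.

```python
def required_window_segments(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Loss rate L obtained by solving throughput = 1.22*MSS / (RTT*sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


# Slide example: 1500-byte segments, 100 ms RTT, 10 Gbps target.
w = required_window_segments(10e9, 0.1, 1500)  # roughly 83,333 segments
loss = required_loss_rate(10e9, 0.1, 1500)     # roughly 2e-10
```

A loss rate this small means only about one segment in several billion may be lost, which is why the slide calls for new TCP versions for high-speed links.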


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
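The convergence toward the equal-share line can be illustrated with a small simulation. This is a sketch under strong assumptions: both sessions see every loss at the same time, losses occur exactly when the combined rate exceeds the capacity, and rates are unitless numbers. The function name `two_flow_aimd` and the starting rates are hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Simulate two synchronized AIMD flows and return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow by 1, so the gap stays the same.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: 10 versus 70 on a link of capacity 100.
final1, final2 = two_flow_aimd(10, 70, capacity=100, rounds=300)
gap = abs(final1 - final2)
```

Additive increase never changes the difference between the two rates, while each multiplicative decrease halves it, so the gap shrinks toward zero over repeated loss events.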


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  do not want rate throttled by congestion control
• Instead use UDP:
  pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
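The parallel-connection example is simple arithmetic if each TCP connection is assumed to get an equal share of the link. A short Python sketch with a hypothetical helper `app_share`:

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app opening new connections.

    Assumes every connection on the link gets an equal share.
    """
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


one_connection = app_share(9, 1)      # 1 of 10 connections
eleven_connections = app_share(9, 11) # 11 of 20 connections
```

With 9 connections already on the link, one new connection gets 1/10 of R, while eleven new connections get 11/20 of R, slightly more than half.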


TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
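The two cases can be combined into one Python function. This is a sketch under stated assumptions: K is taken to be the number of windows that cover the object, computed here as the ceiling of O/(W·S), and the function name and the numeric values are hypothetical. O and S are in bits, R in bits per second, RTT in seconds.

```python
import math


def fixed_window_latency(O, S, R, W, rtt):
    """Latency for an object of O bits with a fixed window of W segments."""
    if W * S / R >= rtt + S / R:
        # Case 1: ACKs return before the window is exhausted; no stalls.
        return 2 * rtt + O / R
    # Case 2: the server stalls after each window except the last one.
    K = math.ceil(O / (W * S))           # windows that cover the object (assumed)
    stall = S / R + rtt - W * S / R      # idle time per window
    return 2 * rtt + O / R + (K - 1) * stall


# Hypothetical link: S = 1000 bits, R = 10 kbps, RTT = 0.5 s, O = 8000 bits.
no_stall = fixed_window_latency(O=8000, S=1000, R=10_000, W=10, rtt=0.5)
with_stall = fixed_window_latency(O=8000, S=1000, R=10_000, W=2, rtt=0.5)
```

With W = 10 the window takes 1.0 s to send, longer than RTT + S/R = 0.6 s, so the latency is just 2RTT + O/R. With W = 2 the server idles 0.4 s after each of the first three windows.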


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. The client initiates the TCP connection (one RTT) and requests the object; the server sends the first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and the object is delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times


                                                                                                                                                                TCP Latency Modeling (3)

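$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}
$$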

[Figure: slow-start timing diagram, time at client vs. time at server. One RTT to initiate TCP connection, then request object. First window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; complete transmission; object delivered.]
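The delay expression on this slide can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names and the parameter values (536-byte segments, a 100 kbps link, RTT of 100 msec) are assumptions chosen so that O/S = 15 as in the earlier example. It evaluates the delay both as a sum of idle times and with the closed form, which should agree whenever the first P idle times are all non-negative.

```python
def idle_time(k, S, R, RTT):
    """Server stall after the k-th window: [S/R + RTT - 2^(k-1) * S/R]^+ in seconds."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def delay_by_sum(O, S, R, RTT, P):
    """O/R + 2 RTT + sum of the first P idle times."""
    return O / R + 2 * RTT + sum(idle_time(k, S, R, RTT) for k in range(1, P + 1))


def delay_closed_form(O, S, R, RTT, P):
    """O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical parameters (not from the slides), chosen so that O/S = 15.
S = 536 * 8        # segment size in bits
R = 100_000        # link rate in bits per second
RTT = 0.1          # round-trip time in seconds
O = 15 * S         # object size in bits
P = 2              # number of stalls, as in the slide example

if __name__ == "__main__":
    print(delay_by_sum(O, S, R, RTT, P))
    print(delay_closed_form(O, S, R, RTT, P))
```

With these assumed values the server stalls after the first and second windows only; the third idle time is already zero, which is consistent with Q = 2 in the example.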


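TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.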
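As a quick check of the derivation of K, the sketch below compares the closed form with a direct search for the smallest k satisfying 2^k - 1 >= O/S. The function names are illustrative and do not come from the slides.

```python
import math


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_search(O, S):
    """Smallest k with (2^k - 1) * S >= O, found by direct search."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


if __name__ == "__main__":
    # O/S = 15 segments, as in the slide example, should give K = 4 windows.
    print(windows_closed_form(15, 1), windows_by_search(15, 1))
```

The search version uses only integer arithmetic when O and S are integers, so it may be the safer choice for very large objects where floating-point rounding in the logarithm could matter.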


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
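The three response-time formulas can be compared side by side. This is a rough sketch under stated assumptions: the function names are hypothetical, the sum of idle times is passed in as a parameter and defaults to zero (a simplification, since the slides do not give it in closed form here), and the sample values assume a 1 Mbps link and 1 Kbyte = 1000 bytes.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M+1) 2RTT + sum of idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + 3RTT + sum of idle times."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M/X + 1) 2RTT + sum of idle times; requires M/X integer."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


if __name__ == "__main__":
    # Assumed values: O = 5 Kbytes = 40,000 bits, M = 10, X = 5,
    # RTT = 100 msec, R = 1 Mbps; idle times ignored.
    O, M, X, RTT, R = 40_000, 10, 5, 0.1, 1_000_000
    print(non_persistent(M, O, R, RTT))
    print(persistent(M, O, R, RTT))
    print(parallel_non_persistent(M, X, O, R, RTT))
```

Because the transmission term (M+1)O/R is common to all three, the variants differ only in how many round trips they pay for, which is why the gap between them grows with RTT.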


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                                                                • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • Transport services and protocols
                                                                                                                                                                • Transport vs network layer
                                                                                                                                                                • Transport-layer protocols
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                                                • How demultiplexing works
                                                                                                                                                                • Connectionless demultiplexing
                                                                                                                                                                • Connectionless demux (cont)
                                                                                                                                                                • Connection-oriented demux
                                                                                                                                                                • Connection-oriented demux (cont)
                                                                                                                                                                • Connection-oriented demux Threaded Web Server
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                • UDP more
                                                                                                                                                                • UDP checksum
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • Principles of Reliable data transfer
                                                                                                                                                                • Reliable data transfer getting started
                                                                                                                                                                • Reliable data transfer getting started
                                                                                                                                                                • Incremental Improvements
                                                                                                                                                                • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                • Rdt20 channel with bit errors
                                                                                                                                                                • rdt20 FSM specification
                                                                                                                                                                • rdt20 operation with no errors
                                                                                                                                                                • rdt20 error scenario
                                                                                                                                                                • rdt20 has a fatal flaw
                                                                                                                                                                • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                • rdt21 discussion
                                                                                                                                                                • rdt22 a NAK-free protocol
                                                                                                                                                                • rdt22 sender receiver fragments
                                                                                                                                                                • rdt30 channels with errors and loss
                                                                                                                                                                • rdt30 sender
                                                                                                                                                                • rdt30 in action
                                                                                                                                                                • rdt30 in action
                                                                                                                                                                • Performance of rdt30
                                                                                                                                                                • rdt30 stop-and-wait operation
                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                • Pipelining increased utilization
                                                                                                                                                                • Go-Back-N
                                                                                                                                                                • GBN Sender
                                                                                                                                                                • GBN sender extended FSM
                                                                                                                                                                • GBN receiver extended FSM
                                                                                                                                                                • More on receiver
                                                                                                                                                                • GBN inaction
                                                                                                                                                                • Selective Repeat
                                                                                                                                                                • Selective repeat sender receiver windows
                                                                                                                                                                • Selective repeat
                                                                                                                                                                • Selective repeat in action
                                                                                                                                                                • Selective repeat dilemma
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                • More TCP Details
                                                                                                                                                                • Even More TCP Details
                                                                                                                                                                • TCP segment structure
                                                                                                                                                                 • TCP seq. #'s and ACKs
                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                • Example RTT estimation
                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • TCP reliable data transfer
                                                                                                                                                                • TCP sender events
                                                                                                                                                                 • TCP sender (simplified)
                                                                                                                                                                • TCP retransmission scenarios
                                                                                                                                                                • TCP retransmission scenarios (more)
                                                                                                                                                                 • TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                • More on Sender Policies
                                                                                                                                                                • Fast Retransmit
                                                                                                                                                                • Fast retransmit algorithm
                                                                                                                                                                 • TCP: GBN or Selective Repeat?
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                • TCP segment structure
                                                                                                                                                                 • TCP Flow control: how it works
                                                                                                                                                                • Technical Issue
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • TCP Connection Management
                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                • A few special cases
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • Principles of Congestion Control
                                                                                                                                                                 • Causes/costs of congestion: scenario 1
                                                                                                                                                                 • Causes/costs of congestion: scenario 2
                                                                                                                                                                 • Causes/costs of congestion: scenario 3
                                                                                                                                                                 • Causes/costs of congestion: scenario 3
                                                                                                                                                                 • Approaches towards congestion control
                                                                                                                                                                 • Case study: ATM ABR congestion control
                                                                                                                                                                 • Case study: ATM ABR congestion control
                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                • TCP Congestion Control
                                                                                                                                                                • TCP AIMD
                                                                                                                                                                • TCP Slow Start
                                                                                                                                                                • TCP Slow Start (more)
                                                                                                                                                                 • Summary: TCP Congestion Control
                                                                                                                                                                • The Big Picture
                                                                                                                                                                • TCP sender congestion control
                                                                                                                                                                • TCP throughput
                                                                                                                                                                • TCP Futures
                                                                                                                                                                • TCP Fairness
                                                                                                                                                                 • Why is TCP fair?
                                                                                                                                                                • Fairness (more)
                                                                                                                                                                • TCP Latency Modeling
                                                                                                                                                                • Fixed Congestion Window (W)
                                                                                                                                                                • Fixed congestion window (1)
                                                                                                                                                                • Fixed congestion window (2)
                                                                                                                                                                • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                • TCP Latency Modeling (3)
                                                                                                                                                                • TCP Latency Modeling (4)
                                                                                                                                                                • HTTP Modeling
                                                                                                                                                                • Chapter 3 Summary

TCP segment structure

Segment layout, each row 32 bits wide:

  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pointer
  Options (variable length)
  application data (variable length)

sequence number, acknowledgement number: counting by bytes of data (not segments)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
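
The window arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Receiver` class, the `sendable_bytes` helper, and the sender-side counters `last_byte_sent` and `last_byte_acked` are hypothetical names chosen for this sketch, and the byte counts are made up.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    rcv_buffer: int           # RcvBuffer: total receive buffer size, in bytes
    last_byte_rcvd: int = 0   # LastByteRcvd: highest byte placed in the buffer
    last_byte_read: int = 0   # LastByteRead: highest byte read by the application

    def rcv_window(self) -> int:
        # RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
        return self.rcv_buffer - (self.last_byte_rcvd - self.last_byte_read)


def sendable_bytes(rcv_window: int, last_byte_sent: int, last_byte_acked: int) -> int:
    # Hypothetical sender-side check: unACKed data must not exceed RcvWindow.
    unacked = last_byte_sent - last_byte_acked
    return max(0, rcv_window - unacked)


rx = Receiver(rcv_buffer=4096)
rx.last_byte_rcvd = 3000      # 3000 bytes have arrived so far
rx.last_byte_read = 1000      # the application has consumed 1000 of them
window = rx.rcv_window()      # 4096 - (3000 - 1000)
```

With 2000 bytes still sitting in the buffer, the advertised window is the remaining 2096 bytes; a sender with 1000 unACKed bytes in flight may send at most 1096 more.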

Technical Issue

Suppose RcvWindow = 0 and that the receiver has already ACK'd ALL packets in its buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to the sender.
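
A toy simulation may make the deadlock and its fix concrete. It follows the simplified model on this slide (the receiver only reports its window inside ACKs, and only ACKs when something arrives); the function name, the tick-based timing, and the buffer size are assumptions made for this sketch.

```python
from typing import Optional


def first_tick_window_opens(probe: bool, drain_at_tick: int = 3,
                            max_ticks: int = 6) -> Optional[int]:
    """Return the tick at which the sender hears RcvWindow > 0, or None."""
    rcv_buffer = 4            # receive buffer size, in bytes (made-up value)
    buffered = rcv_buffer     # buffer starts full, so RcvWindow == 0
    for tick in range(1, max_ticks + 1):
        if tick == drain_at_tick:
            buffered = 0      # the application finally reads the buffered data
        if probe:
            # The sender transmits a one-byte segment; the receiver must
            # answer with an ACK that carries its current window.
            advertised = rcv_buffer - buffered
            if advertised > 0:
                return tick
        # Without probes the receiver has nothing to ACK, so it stays silent
        # and the sender never learns that the window has opened.
    return None


with_probe = first_tick_window_opens(probe=True)
without_probe = first_tick_window_opens(probe=False)
```

With probing, the sender learns of the open window on the same tick the buffer drains; without it, the function returns None, which is the deadlock described above.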

Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, then packets are lost.
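
The drop-on-full behaviour can be mimicked with a bounded queue. This is a rough sketch, not how any operating system implements socket buffers: the `UdpSocketBuffer` class is hypothetical, and capacity is counted in packets here for simplicity.

```python
from collections import deque


class UdpSocketBuffer:
    """Toy model of a receiving socket's buffer with no flow control."""

    def __init__(self, capacity: int):
        self.capacity = capacity   # assumption: capacity measured in packets
        self.queue = deque()
        self.dropped = 0

    def deliver(self, packet) -> bool:
        # An arriving packet is appended if there is room; otherwise it is lost.
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(packet)
        return True

    def read(self):
        # The application reads packets in arrival order.
        return self.queue.popleft() if self.queue else None


buf = UdpSocketBuffer(capacity=2)
results = [buf.deliver(p) for p in ("a", "b", "c")]
```

The third packet arrives while the buffer is full and is silently discarded; nothing tells the sender to slow down.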

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, port number)
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept()

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data

TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  Allocates buffers
  Can include application data
  SYN=0 signals that connection established
  server_isn+1 signals that # is synchronized

Message exchange:
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
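
The three segments of the handshake can be written out as plain data to check the sequence and acknowledgement arithmetic. This is a minimal sketch: the `Segment` class and `three_way_handshake` function are hypothetical names, the initial sequence numbers are arbitrary, and no real sockets are involved. Sequence numbers are taken modulo 2^32 because the field is 32 bits wide.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_MOD = 2 ** 32  # sequence and acknowledgement fields are 32 bits wide


@dataclass
class Segment:
    syn: int
    seq: int
    ack: Optional[int] = None   # None means the ACK field is not valid


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    # Step 1: client sends SYN carrying its initial sequence number.
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server replies with SYNACK, acknowledging client_isn + 1
    # and announcing its own initial sequence number.
    synack = Segment(syn=1, seq=server_isn, ack=(syn.seq + 1) % SEQ_MOD)
    # Step 3: client replies with SYN=0, acknowledging server_isn + 1.
    final_ack = Segment(syn=0, seq=(client_isn + 1) % SEQ_MOD,
                        ack=(synack.seq + 1) % SEQ_MOD)
    return [syn, synack, final_ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
```

For these example values the exchange is SYN (seq=100), SYNACK (seq=300, ack=101), ACK (seq=101, ack=301).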

TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close()

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client close: FIN to server; server: ACK; server close: FIN to client; client: ACK, then timed wait, then closed]

TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed-wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline with both sides closing: FIN, ACK, FIN, ACK; client timed wait, then both closed]


                                                                                                                                                                  TCP Connection Management (cont)

Example TCP server lifecycle

Example TCP client lifecycle


                                                                                                                                                                  A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.


                                                                                                                                                                  Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load (original data plus retransmissions).

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3 (cont.)

Another "cost" of congestion:
when packet dropped, any upstream transmission capacity used for that packet was wasted!


                                                                                                                                                                  Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
    small exception: see next page

ABR: available bit rate:
"elastic service"
if sender's path "underloaded":
    sender should use available bandwidth
if sender's path congested:
    sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
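The RM-cell handling described above can be sketched in a few lines of Python. This is only an illustrative model: the names `RMCell`, `Switch`, and `send_rm_cell`, the per-switch `supportable_rate`, and the rate values are assumptions made for the example, not part of the ATM specification.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate field (arbitrary rate units)
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit (severe congestion)


@dataclass
class Switch:
    supportable_rate: float  # hypothetical rate this switch can sustain
    mild: bool = False       # assumed flag: switch sees mild congestion
    severe: bool = False     # assumed flag: switch sees severe congestion

    def process(self, cell: RMCell) -> None:
        # A switch may lower ER but never raises it, so the sender
        # ends up with the minimum supportable rate on the path.
        cell.er = min(cell.er, self.supportable_rate)
        cell.ni = cell.ni or self.mild
        cell.ci = cell.ci or self.severe


def send_rm_cell(requested_rate, path, last_data_cell_efci=False):
    """Pass one RM cell through every switch, then let the destination return it."""
    cell = RMCell(er=requested_rate)
    for switch in path:
        switch.process(cell)
    # Destination: if the data cell preceding the RM cell had EFCI=1,
    # set CI=1 before returning the RM cell to the source.
    if last_data_cell_efci:
        cell.ci = True
    return cell


returned = send_rm_cell(100.0, [Switch(80.0), Switch(50.0, mild=True), Switch(120.0)])
print(returned)
```

In this sketch the sender would read `returned.er` as its new rate ceiling and use the NI and CI bits to decide whether to hold or cut its rate.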


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
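As a quick numeric check of the window-based throughput relation, here is a minimal sketch. The function name `window_throughput` and the sample values (one segment of 500 bytes, RTT of 200 msec) are chosen for illustration.

```python
def window_throughput(w_segments: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


rate_bytes = window_throughput(1, 500, 0.2)   # bytes per second
rate_kbps = rate_bytes * 8 / 1000             # same rate in kilobits per second
print(rate_bytes, rate_kbps)
```

With a window of a single 500-byte segment and a 200 msec RTT this works out to about 20 kbps, which is why a window of 1 MSS is a very slow starting point.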

                                                                                                                                                                  3 Transport Layer 104Comp 361 Spring 2005

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after loss event

three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 MSS and then grows exponentially
    conservative after timeout events

                                                                                                                                                                  3 Transport Layer 105Comp 361 Spring 2005

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
    also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
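The sawtooth produced by AIMD can be reproduced with a short simulation. This is a sketch under a simplifying assumption that is not in the slides: a loss event is assumed to occur exactly when the window reaches a fixed size `loss_at_mss`. The function name and parameters are hypothetical.

```python
def aimd_trace(start_mss: int, loss_at_mss: int, rounds: int) -> list:
    """Return CongWin (in units of MSS) at the start of each RTT.

    Assumption for illustration: a loss event happens whenever the
    window has grown to loss_at_mss segments.
    """
    cong_win = start_mss
    trace = []
    for _ in range(rounds):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            cong_win = max(1, cong_win // 2)   # multiplicative decrease
        else:
            cong_win += 1                      # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(8, 16, 20))
```

The trace climbs by one MSS per RTT from 8 to 16, halves back to 8, and repeats, which is the sawtooth pattern of a long-lived connection.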

                                                                                                                                                                  3 Transport Layer 106Comp 361 Spring 2005

                                                                                                                                                                  TCP Slow Start

When connection begins, CongWin = 1 MSS
    Example: MSS = 500 bytes & RTT = 200 msec
    initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                                                                                                                  3 Transport Layer 107Comp 361 Spring 2005

                                                                                                                                                                  TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. In successive RTTs, Host A sends one segment, then two segments, then four segments.]
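Why incrementing CongWin on every ACK doubles it every RTT can be seen in a small sketch. It assumes no loss and exactly one ACK per segment (no delayed ACKs); the function name `slow_start_rounds` is hypothetical.

```python
def slow_start_rounds(rounds: int) -> list:
    """Return CongWin (in MSS) at the start of each RTT during slow start.

    Assumptions for illustration: no losses, and every segment sent in
    an RTT is acknowledged individually before the next RTT begins.
    """
    cong_win = 1                      # start with 1 MSS
    sizes = []
    for _ in range(rounds):
        sizes.append(cong_win)
        acks_this_rtt = cong_win      # one ACK per segment in flight
        for _ in range(acks_this_rtt):
            cong_win += 1             # +1 MSS for every ACK received
    return sizes


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```

A window of n segments yields n ACKs, each adding one MSS, so the window becomes 2n after one RTT.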

                                                                                                                                                                  3 Transport Layer 108Comp 361 Spring 2005

So far:
    Slow-Start ramps up exponentially
    Followed by AIMD sawtooth pattern

Reality (TCP Reno):
    Introduce new variable threshold
    threshold initially very large
    Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
    Two different types of loss events:
        • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
        • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

                                                                                                                                                                  3 Transport Layer 109Comp 361 Spring 2005

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

                                                                                                                                                                  3 Transport Layer 110Comp 361 Spring 2005

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                  3 Transport Layer 111Comp 361 Spring 2005

                                                                                                                                                                  The Big Picture

                                                                                                                                                                  3 Transport Layer 112Comp 361 Spring 2005

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|-------|-------|-------------------|------------|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
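The event/action rules above translate fairly directly into a small state machine. The following is a simplified sketch, not a real TCP implementation: the class name `RenoSender`, the initial threshold of 64 MSS, and working in units of one MSS are assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1.0  # work in units of one MSS (assumption for readability)


@dataclass
class RenoSender:
    cong_win: float = 1 * MSS
    threshold: float = 64 * MSS   # "initially very large"; 64 is an arbitrary choice
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Adds up to roughly 1 MSS per RTT across a full window of ACKs.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self) -> None:
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, MSS)  # never below 1 MSS
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self) -> None:
        """Timeout: halve the threshold and restart from slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = 1 * MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4 * MSS)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```

Starting from 1 MSS with a threshold of 4 MSS, four new ACKs take the sender through slow start and into congestion avoidance once CongWin exceeds the threshold.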

                                                                                                                                                                  3 Transport Layer 113Comp 361 Spring 2005

                                                                                                                                                                  TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
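The 0.75 factor follows from averaging a window that ramps linearly from W/2 up to W between losses. A minimal numerical check, assuming an idealized sawtooth sampled at evenly spaced points (the function name and the `steps` parameter are illustrative):

```python
def average_throughput(w_segments: float, rtt_s: float, steps: int = 1000) -> float:
    """Average of W(t)/RTT while the window ramps linearly from W/2 to W."""
    total = 0.0
    for i in range(steps + 1):
        # Window size at the i-th sample point of one sawtooth cycle.
        window = w_segments / 2 + (w_segments / 2) * i / steps
        total += window / rtt_s
    return total / (steps + 1)


# Example: W = 100 segments, RTT = 0.1 s
avg = average_throughput(100, 0.1)
print(avg, 0.75 * 100 / 0.1)
```

Both printed values are in segments per second; multiplying by the segment size gives bits per second.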


                                                                                                                                                                  TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

Reaching 10 Gbps requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
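The two numbers in this example can be reproduced directly. A short sketch, assuming the throughput relation 1.22·MSS/(RTT·√L) solved for L and an MSS equal to the full 1500-byte segment (function names are illustrative):

```python
def required_window(throughput_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Segments that must be in flight to sustain the target rate."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(throughput_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

The loss rate comes out slightly above 2·10^-10, which the slide rounds.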


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
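The convergence toward the equal-share line can be illustrated with a toy simulation. This is a sketch under simplifying assumptions that are not in the slides: both sessions see the same RTT, both detect loss in the same round whenever their combined rate exceeds the capacity, and rates change in discrete rounds.

```python
def aimd(x1: float, x2: float, capacity: float, rounds: int) -> tuple[float, float]:
    """Two additive-increase, multiplicative-decrease flows on one link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves their difference.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both grow by one unit, so their difference is unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = aimd(10.0, 80.0, capacity=100.0, rounds=2000)
print(a, b)
```

Additive increase moves the operating point parallel to the equal-share line and every loss halves the gap between the flows, so the gap shrinks geometrically regardless of the starting point.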


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
do not want rate throttled by congestion control
Instead use UDP: pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of 20 connections)
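The arithmetic behind this example, assuming every TCP connection on the link ends up with an equal share of R (the function name is illustrative):

```python
from fractions import Fraction


def app_share(existing: int, new_connections: int) -> Fraction:
    """Fraction of link rate R won by an app that opens new_connections."""
    return Fraction(new_connections, existing + new_connections)


print(app_share(9, 1))   # one connection among ten
print(app_share(9, 11))  # eleven connections among twenty
```

The second case gives 11/20 of R, which the slide rounds to R/2.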


TCP Latency Modeling

Notation, assumptions
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size
First assume fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
where K is the number of windows that cover the object
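The two cases can be folded into one function. This is a sketch, not a definitive model: it assumes K = ⌈O/(WS)⌉ for the fixed-window case (the slide does not spell K out here), and the function and parameter names are illustrative.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency in seconds for object of O bits with a fixed window of W segments."""
    K = math.ceil(O / (W * S))           # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R      # idle time after each window, if positive
    base = 2 * RTT + O / R               # connection setup + request + transmission
    if stall <= 0:
        # Case 1: the first ACK is back before the window is exhausted.
        return base
    # Case 2: the server stalls after each of the first K-1 windows.
    return base + (K - 1) * stall


# S = 1000 bits, R = 1 Mbps, RTT = 100 ms, O = 16000 bits
print(fixed_window_latency(16000, 1000, 1_000_000, 0.1, W=4))    # case 2
print(fixed_window_latency(16000, 1000, 1_000_000, 0.1, W=200))  # case 1
```

With W = 4 the server stalls three times for 97 ms each; with W = 200 it never stalls and the latency is just 2RTT + O/R.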


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2
Server idles P = 2 times

[Figure: timing diagram with "time at client" and "time at server" axes: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]


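TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timing diagram with "time at client" and "time at server" axes: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]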


TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar
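The slow-start latency model can be checked numerically. The sketch below computes K with an integer loop (to avoid floating-point edge cases in the logarithm), counts Q by testing which idle times are positive, and compares the closed form against a direct sum of idle times. Function names and the sample values of S, R, and RTT are assumptions chosen so that O/S = 15 and Q = 2, matching the slide example.

```python
def windows_to_cover(O: int, S: int) -> int:
    """Smallest K with (2**K - 1) * S >= O."""
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    return K


def latency_by_sum(O: int, S: int, R: float, RTT: float) -> float:
    """2RTT + O/R plus the idle time after each of the first K-1 windows."""
    K = windows_to_cover(O, S)
    idle = sum(max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K))
    return 2 * RTT + O / R + idle


def latency_closed_form(O: int, S: int, R: float, RTT: float) -> float:
    """2RTT + O/R + P[RTT + S/R] - (2**P - 1) S/R with P = min(Q, K-1)."""
    K = windows_to_cover(O, S)
    Q = 0
    # Count the windows k = Q + 1 after which the server would still idle.
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# O/S = 15 segments, S/R = 0.1 s, RTT = 0.2 s  ->  K = 4, Q = 2, P = 2
print(windows_to_cover(15000, 1000))
print(latency_by_sum(15000, 1000, 10_000, 0.2))
print(latency_closed_form(15000, 1000, 10_000, 0.2))
```

Both latency functions should agree, since the closed form is just the summed idle times with the geometric series evaluated.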


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
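To make the three response-time formulas concrete, here is a minimal Python sketch that evaluates them. It deliberately leaves out the "sum of idle times" term, so the values are lower bounds rather than the full model. The function name and parameter names are illustrative assumptions, and R is taken to be the link transmission rate in bits per second.

```python
def http_response_times(o_bits, r_bps, rtt_s, m, x):
    """Evaluate the three HTTP response-time formulas, ignoring idle times.

    o_bits: object size O in bits (base page and each image)
    r_bps:  link rate R in bits per second (assumed meaning of R)
    rtt_s:  round-trip time in seconds
    m:      number of images M
    x:      number of parallel connections X
    Returns (non_persistent, persistent, parallel) in seconds.
    """
    if m % x != 0:
        raise ValueError("the model assumes M/X is an integer")

    # Transmission time for the base page plus M images: (M+1)O/R
    transmit = (m + 1) * o_bits / r_bps

    non_persistent = transmit + (m + 1) * 2 * rtt_s
    persistent = transmit + 3 * rtt_s
    parallel = transmit + (m // x + 1) * 2 * rtt_s
    return non_persistent, persistent, parallel


# Hypothetical inputs: O = 1000 bits, R = 1000 bps, RTT = 1 s, M = 10, X = 5
non_persistent, persistent, parallel = http_response_times(1000, 1000, 1.0, 10, 5)
print(non_persistent, persistent, parallel)
```

With these inputs the transmission term is 11 s in all three cases, so the difference between the schemes comes entirely from the RTT term: 22 RTTs for non-persistent, 3 for persistent, and 6 for parallel non-persistent.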

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.

Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 160305)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3: Summary

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]

Rcvr advertises spare room by including value of RcvWindow in segments

Sender limits unACKed data to RcvWindow
  guarantees receive buffer doesn't overflow
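The window arithmetic on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any real TCP stack is written: the class and function names are hypothetical, and the sender-side quantities `last_byte_sent` and `last_byte_acked` do not appear on the slide and are introduced here only to express "unACKed data".

```python
from dataclasses import dataclass


@dataclass
class ReceiverState:
    """Receiver-side counters, named after the slide's variables."""
    rcv_buffer: int          # RcvBuffer: total buffer size in bytes
    last_byte_rcvd: int = 0  # LastByteRcvd: last byte placed in the buffer
    last_byte_read: int = 0  # LastByteRead: last byte read by the application

    def rcv_window(self) -> int:
        # RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
        return self.rcv_buffer - (self.last_byte_rcvd - self.last_byte_read)


def sender_allowance(last_byte_sent: int, last_byte_acked: int, rcv_window: int) -> int:
    """Bytes the sender may still send while keeping unACKed data <= RcvWindow.

    last_byte_sent and last_byte_acked are assumed sender-side counters.
    """
    unacked = last_byte_sent - last_byte_acked
    return max(0, rcv_window - unacked)


# Hypothetical numbers: 4096-byte buffer, 3000 bytes received, 1000 read so far
receiver = ReceiverState(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
window = receiver.rcv_window()
print(window)                                # spare room advertised to the sender
print(sender_allowance(5000, 4000, window))  # what the sender may still send
```

Here 2000 bytes sit unread in the buffer, so the advertised window is 2096 bytes; with 1000 bytes already in flight, the sender may send 1096 more.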

Technical Issue

Suppose RcvWindow = 0 and that receiver has already ACK'd ALL packets in buffer.
Sender does not transmit new packets until it hears RcvWindow > 0.
Receiver never sends RcvWindow > 0, since it has no new ACKs to send to Sender.
DEADLOCK

Solution: TCP specs require sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver (B). At some point the receiver's buffer will empty and RcvWindow > 0 will be transmitted back to sender.


                                                                                                                                                                    Note on UDP

                                                                                                                                                                    UDP has no flow control

UDP appends packets to receiving socket's buffer. If buffer is full then packets are lost.


TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq #
  No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  Replies with client_isn+1 in ACK field to signal synchronization
  Specifies server_isn
  No application data


TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1 in ACK field
  Allocates buffers
  Can include application data
  SYN=0 signals that connection is established
  server_isn+1 signals synchronization

[Figure: three-way handshake message sequence]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
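The sequence and ACK numbers in the three handshake segments can be traced with a small sketch. The Segment type and method names below are hypothetical; real TCP uses 32-bit sequence numbers that wrap around, which this sketch ignores by using long.

```java
// Sketch of the three-way handshake numbering; types and names are illustrative.
final class HandshakeSketch {
    // Only the header fields the handshake slides mention.
    static final class Segment {
        final boolean syn;
        final long seq;
        final long ack; // -1 means "no ACK field set" in this sketch

        Segment(boolean syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }
    }

    // Step 1: client sends SYN=1, seq=client_isn.
    static Segment syn(long clientIsn) {
        return new Segment(true, clientIsn, -1);
    }

    // Step 2: server replies SYN=1, seq=server_isn, ack=client_isn+1.
    static Segment synAck(Segment request, long serverIsn) {
        return new Segment(true, serverIsn, request.seq + 1);
    }

    // Step 3: client replies SYN=0, seq=client_isn+1, ack=server_isn+1.
    static Segment finalAck(Segment reply) {
        return new Segment(false, reply.ack, reply.seq + 1);
    }

    public static void main(String[] args) {
        Segment s1 = syn(100);
        Segment s2 = synAck(s1, 300);
        Segment s3 = finalAck(s2);
        System.out.println("SYNACK: seq=" + s2.seq + " ack=" + s2.ack);
        System.out.println("ACK:    seq=" + s3.seq + " ack=" + s3.ack);
    }
}
```

The initial sequence numbers 100 and 300 are arbitrary example values; in practice each side chooses its own ISN.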


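TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: connection close message sequence]
  client -> server: FIN (client: close)
  server -> client: ACK
  server -> client: FIN (server: close)
  client -> server: ACK (client: timed wait, then closed)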


TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
  Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)
  Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs

[Figure: connection close message sequence (FIN, ACK, FIN, ACK); client: closing, timed wait, closed; server: closing, closed]
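The client side of the four closing steps can be sketched as a small state machine. The state names below (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT) follow the usual TCP state naming rather than the slide labels, and the event names are made up for this sketch. Simultaneous FINs and the server side are deliberately left out.

```java
// Sketch of the client-side close sequence; event names are illustrative.
final class CloseSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }

    enum Event { APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRES }

    // Returns the next state; events not expected in a state are ignored here.
    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED: // Step 1: application closes, client sends FIN
                if (e == Event.APP_CLOSE) return State.FIN_WAIT_1;
                break;
            case FIN_WAIT_1:  // Step 2: server ACKs the client's FIN
                if (e == Event.RCV_ACK) return State.FIN_WAIT_2;
                break;
            case FIN_WAIT_2:  // Step 3: server's FIN arrives, client ACKs it
                if (e == Event.RCV_FIN) return State.TIME_WAIT;
                break;
            case TIME_WAIT:   // re-ACK any repeated FIN, close when the wait ends
                if (e == Event.TIMED_WAIT_EXPIRES) return State.CLOSED;
                break;
            default:
                break;
        }
        return s;
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] events = { Event.APP_CLOSE, Event.RCV_ACK, Event.RCV_FIN,
                           Event.RCV_FIN, Event.TIMED_WAIT_EXPIRES };
        for (Event e : events) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```

A repeated FIN during timed wait leaves the client in TIME_WAIT, matching the note that it keeps responding with ACKs in case its earlier ACK was lost.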


                                                                                                                                                                    TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]


                                                                                                                                                                    A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.


                                                                                                                                                                    Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda'_{in}$ denotes the offered load: original data plus retransmissions.

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when packet dropped, any "upstream" transmission capacity used for that packet was wasted!


                                                                                                                                                                    Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
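Because each congested switch may only lower the ER value, the value returned to the sender is the minimum supportable rate on the path. A minimal sketch follows, with rates as plain integers in unspecified units; the class and method names are hypothetical, and real ER field encoding is not modelled.

```java
// Sketch of how the ER field ends up holding the path minimum; names are illustrative.
final class ExplicitRateSketch {
    // initialEr: rate written by the sender into the RM cell.
    // supportableRates: rate each switch on the path can support, in path order.
    static int erAfterPath(int initialEr, int[] supportableRates) {
        int er = initialEr;
        for (int rate : supportableRates) {
            er = Math.min(er, rate); // a switch may lower ER, never raise it
        }
        return er;
    }

    public static void main(String[] args) {
        int er = erAfterPath(1000, new int[] {800, 300, 600});
        System.out.println("ER returned to sender = " + er);
    }
}
```

If the sender asks for less than every switch can support, the ER value comes back unchanged.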


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
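As a worked instance of the throughput formula, the sketch below plugs in example numbers. The values (10 segments, 1460-byte MSS, 100 ms RTT) are assumptions chosen for illustration, and the class and method names are hypothetical.

```java
// Sketch: throughput = w * MSS / RTT; example values are assumptions.
final class ThroughputSketch {
    // w: window size in segments; mssBytes: bytes per segment; rttSeconds: round-trip time.
    static double throughputBytesPerSec(int w, int mssBytes, double rttSeconds) {
        return w * (double) mssBytes / rttSeconds;
    }

    public static void main(String[] args) {
        double t = throughputBytesPerSec(10, 1460, 0.1);
        System.out.println("throughput (bytes/sec) = " + t);
    }
}
```

With these numbers the window carries 14600 bytes per RTT, or roughly 146000 bytes/sec. Doubling w doubles the throughput, and doubling the RTT halves it.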


To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events


TCP AIMD

• multiplicative decrease: cut CongWin in half after a loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
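A minimal sketch of the AIMD rule, with the window measured in MSS. It assumes, purely for illustration, that a loss event occurs whenever the window reaches a fixed size (`loss_at_mss`); real losses depend on the network, and the function name is hypothetical.

```python
def aimd_trace(start_mss: int, loss_at_mss: int, rtts: int) -> list:
    """CongWin (in MSS) at the start of each RTT under pure AIMD.

    Assumption: a loss event happens in any RTT where the window
    has reached loss_at_mss.
    """
    trace = []
    cong_win = start_mss
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            cong_win = max(1, cong_win // 2)   # multiplicative decrease
        else:
            cong_win += 1                      # additive increase: +1 MSS per RTT
    return trace

print(aimd_trace(4, 8, 10))   # [4, 5, 6, 7, 8, 4, 5, 6, 7, 8]
```

The repeating climb and halving is the sawtooth pattern of a long-lived connection.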


TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes & RTT = 200 msec
• initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event
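The initial-rate figure follows from sending one window per RTT. A small sketch (the function name is hypothetical) that reproduces the 20 kbps example:

```python
def window_rate_bps(window_segments: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Sending rate in bits/sec when window_segments segments of
    mss_bytes each are sent per RTT."""
    return window_segments * mss_bytes * 8 / rtt_seconds

# One 500-byte segment per 200 msec RTT:
initial_rate = window_rate_bps(1, 500, 0.2)   # about 20,000 bits/sec = 20 kbps
```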


TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; in successive RTTs Host A sends one segment, then two segments, then four segments]
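Why does adding 1 MSS per ACK double the window every RTT? A sketch under simplifying assumptions: no loss, one ACK per segment (no delayed ACKs), and all ACKs for a window arriving within one RTT. The function name is hypothetical.

```python
def slow_start_windows(rtts: int) -> list:
    """CongWin (in MSS) at the start of each RTT during slow start."""
    cong_win = 1                  # connection begins with CongWin = 1 MSS
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks = cong_win           # one ACK comes back per segment sent this RTT
        for _ in range(acks):
            cong_win += 1         # +1 MSS for every ACK received
    return windows

print(slow_start_windows(4))   # [1, 2, 4, 8]
```

Each RTT, a window of n segments produces n ACKs, so the window grows from n to 2n.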


So Far
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
• Two different types of loss events
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
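The difference between Tahoe and Reno can be written as one small function. This is a sketch of the rules as stated in these notes, not of any real TCP stack; the function name, the string labels, and measuring the window in units of MSS are assumptions made for illustration.

```python
MSS = 1  # assumption: windows are measured in units of MSS

def react_to_loss(variant: str, event: str, cong_win: float):
    """Return (new_cong_win, new_threshold, new_state) after a loss event.

    variant: "tahoe" or "reno"; event: "timeout" or "3dupacks".
    """
    threshold = cong_win / 2
    if variant == "reno" and event == "3dupacks":
        # Fast recovery: skip slow start, continue in congestion avoidance.
        return max(threshold, MSS), threshold, "congestion avoidance"
    # Timeout (either variant), or any loss event in Tahoe.
    return MSS, threshold, "slow start"

print(react_to_loss("reno", "3dupacks", 16))    # (8.0, 8.0, 'congestion avoidance')
print(react_to_loss("tahoe", "3dupacks", 16))   # (1, 8.0, 'slow start')
```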


Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                    The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
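The event/state/action rows above map naturally onto a small state machine. The class below is a toy sketch, not a real TCP implementation: the class and method names are hypothetical, windows are in bytes, and resetting the duplicate-ACK counter on a new ACK or timeout is an assumption the rows do not spell out.

```python
class RenoSender:
    """Toy model of the sender actions listed above (windows in bytes)."""

    def __init__(self, mss: int, threshold: float):
        self.mss = mss
        self.cong_win = float(mss)   # slow start begins at 1 MSS
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0            # assumption: count restarts on new data being acked
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, float(self.mss))  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = float(self.mss)
        self.state = "SS"
        self.dup_acks = 0
```

A short usage example: with `mss=1000` and `threshold=4000`, four new ACKs take CongWin from 1000 to 5000 bytes and move the sender into CA; three duplicate ACKs then halve the window, and a timeout returns it to 1 MSS in SS.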


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
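One way to see where the 0.75 comes from, assuming the window climbs linearly from W/2 to W between consecutive loss events (the idealized sawtooth): the time-average of a linear ramp is the midpoint of its endpoints.

```latex
\[
\overline{\text{throughput}}
  = \frac{1}{\mathrm{RTT}} \cdot \frac{1}{2}\left(\frac{W}{2} + W\right)
  = \frac{3}{4}\,\frac{W}{\mathrm{RTT}}
\]
```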


TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This gives L = 2 × 10^-10. Wow
• New versions of TCP for high-speed needed
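Both numbers in the example can be checked directly. The sketch below uses the throughput formula 1.22·MSS/(RTT·√L) solved for L; the function names are hypothetical.

```python
def required_window_segments(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """In-flight segments needed to sustain throughput_bps (W = throughput * RTT / MSS)."""
    return throughput_bps * rtt_s / (mss_bytes * 8)

def required_loss_rate(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Loss rate L from throughput = 1.22 * MSS / (RTT * sqrt(L)), solved for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2

W = required_window_segments(10e9, 0.1, 1500)   # about 83,333 segments
L = required_loss_rate(10e9, 0.1, 1500)         # about 2.1e-10
```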


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes marked R, with the equal-bandwidth-share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
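The convergence argument can be tried numerically. This sketch assumes two flows with equal RTTs that both detect loss in the same round whenever their combined rate exceeds the link capacity; these synchronization assumptions and the function name are for illustration only.

```python
def two_flow_aimd(x1: float, x2: float, capacity: float, rounds: int):
    """Rates of two AIMD flows sharing one bottleneck after the given number of rounds."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow equally, so the gap is unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2

a, b = two_flow_aimd(1.0, 50.0, capacity=100.0, rounds=500)
```

Starting far apart (1 versus 50), the two rates end up nearly equal, because every loss event halves their difference while additive increase never widens it.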


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
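The parallel-connection example is per-connection fairness arithmetic: if every connection gets an equal share, an app's share is its fraction of all connections on the link. A small sketch (the function name is hypothetical):

```python
from fractions import Fraction

def app_share(existing_connections: int, app_connections: int) -> Fraction:
    """Fraction of link rate R obtained by an app that opens app_connections
    connections alongside existing_connections others, assuming equal per-connection shares."""
    return Fraction(app_connections, existing_connections + app_connections)

print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```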


TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2 RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
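The two cases can be combined into one function. This is a sketch under the slide's assumptions (one link, no loss); it additionally assumes K is the number of windows of W segments needed to cover the object, K = ceil(O/(W·S)), which this slide does not define. The function name is hypothetical.

```python
import math

def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for an object of O bits with a fixed window of W segments of S bits."""
    send_window = W * S / R        # time to transmit one full window
    first_ack = RTT + S / R        # time until the ACK for the window's first segment returns
    base = 2 * RTT + O / R
    if send_window >= first_ack:
        return base                # case 1: the server never stalls
    K = math.ceil(O / (W * S))     # assumption: number of windows covering the object
    return base + (K - 1) * (first_ack - send_window)   # case 2: stall after each window but the last

# Units chosen so S/R = 1: O = 8 segments, RTT = 2 segment times.
print(fixed_window_latency(8, 1, 1, 2, W=4))   # 12.0 (case 1)
print(fixed_window_latency(8, 1, 1, 2, W=2))   # 15.0 (case 2: 3 stalls of 1 each)
```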


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2 RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


                                                                                                                                                                    TCP Latency Modeling Slow Start (2)

                                                                                                                                                                    RTT

                                                                                                                                                                    initiate TCPconnection

                                                                                                                                                                    requestobject

                                                                                                                                                                    first window= SR

                                                                                                                                                                    second window= 2SR

                                                                                                                                                                    third window= 4SR

                                                                                                                                                                    fourth window= 8SR

                                                                                                                                                                    completetransmissionobject

                                                                                                                                                                    delivered

                                                                                                                                                                    time atclient

                                                                                                                                                                    time atserver

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.
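The example values can be reproduced with a short sketch. It counts windows and idle periods directly rather than using a closed form. The function names and the timing values (S/R = 1 time unit, RTT = 2 time units) are assumptions chosen for illustration so that Q = 2; they are not given on the slide.

```python
def windows_needed(segments: int) -> int:
    """Smallest K with 1 + 2 + ... + 2**(K-1) >= segments (window doubles each round)."""
    k, covered = 0, 0
    while covered < segments:
        covered += 2 ** k  # window number k+1 carries 2**k segments
        k += 1
    return k


def idle_count_infinite(s_over_r: float, rtt: float) -> int:
    """Q: number of windows followed by an idle period for an endless object.

    The server stalls after window k while S/R + RTT > 2**(k-1) * S/R.
    """
    if s_over_r <= 0:
        raise ValueError("S/R must be positive")
    q = 0
    while s_over_r + rtt > (2 ** q) * s_over_r:
        q += 1
    return q


# Hypothetical timing: S/R = 1, RTT = 2 (any RTT with S/R < RTT <= 3 S/R gives Q = 2).
K = windows_needed(15)            # O/S = 15 segments
Q = idle_count_infinite(1.0, 2.0)
P = min(K - 1, Q)
print(K, Q, P)  # prints: 4 2 2
```

The last window is never followed by a stall because the transfer is finished, which is why P is capped at K-1.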


                                                                                                                                                                    TCP Latency Modeling (3)

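$S/R + RTT$ = time from when server starts to send a segment until server receives acknowledgement

$2^{k-1} S/R$ = time to transmit the kth window

$\left[ S/R + RTT - 2^{k-1} S/R \right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right] \\
&= \frac{O}{R} + 2\,RTT + P\left[ RTT + \frac{S}{R} \right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$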
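As a sanity check, the delay can be computed two ways: by summing the idle time after each of the first K-1 windows, and by the closed form O/R + 2 RTT + P (RTT + S/R) - (2^P - 1) S/R. The sketch below is illustrative only; the function names and the numbers (S = 1, R = 1, O = 15, RTT = 2, so K = 4 and P = 2) are assumptions, not values from the slides.

```python
def latency_by_sum(O: float, S: float, R: float, rtt: float, K: int) -> float:
    """2 RTT + O/R + idle time after each of the first K-1 windows."""
    stall = S / R + rtt  # time from starting a window until its first ACK returns
    idle = sum(max(0.0, stall - 2 ** (k - 1) * S / R) for k in range(1, K))
    return 2 * rtt + O / R + idle


def latency_closed_form(O: float, S: float, R: float, rtt: float, P: int) -> float:
    """O/R + 2 RTT + P (RTT + S/R) - (2**P - 1) S/R."""
    return O / R + 2 * rtt + P * (rtt + S / R) - (2 ** P - 1) * S / R


# Hypothetical values: 15 one-unit segments, S/R = 1, RTT = 2, hence K = 4, P = 2.
by_sum = latency_by_sum(O=15, S=1, R=1, rtt=2, K=4)
closed = latency_closed_form(O=15, S=1, R=1, rtt=2, P=2)
print(by_sum, closed)  # prints: 22.0 22.0
```

The two agree because the clipped terms in the sum (windows after the P-th) contribute zero idle time.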

[Figure: client-server timing diagram for TCP slow start, with windows of S/R, 2S/R, 4S/R and 8S/R between connection initiation and object delivery. Axes: time at client, time at server.]


TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

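$$
\begin{aligned}
K &= \min\{ k : 2^{0} S + 2^{1} S + \cdots + 2^{k-1} S \ge O \} \\
&= \min\{ k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S \} \\
&= \min\{ k : 2^{k} - 1 \ge O/S \} \\
&= \min\{ k : k \ge \log_2 (O/S + 1) \} \\
&= \left\lceil \log_2 \left( \frac{O}{S} + 1 \right) \right\rceil
\end{aligned}
$$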

Calculation of Q, the number of idle periods for an infinite-size object, is similar.
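A worked instance may help. For the earlier example with O/S = 15, the formula K = ⌈log2(O/S + 1)⌉ gives 4 windows. The Q calculation sketched after it is a reconstruction under the same idle-time condition (the server idles after window k while S/R + RTT exceeds 2^(k-1) S/R); the slides do not spell it out, and the ratio RTT/(S/R) = 2 is an assumed value chosen so that Q = 2.

```latex
% K for O/S = 15
2^0 + 2^1 + 2^2 = 7 < 15, \qquad 2^0 + 2^1 + 2^2 + 2^3 = 15 \ge 15
\;\Rightarrow\; K = \lceil \log_2 (15 + 1) \rceil = 4

% Q: largest k whose window is still followed by an idle period
Q = \max\left\{ k : \frac{S}{R} + RTT > 2^{k-1} \frac{S}{R} \right\}
  = \max\left\{ k : 2^{k-1} < 1 + \frac{RTT}{S/R} \right\}

% Assumed ratio RTT / (S/R) = 2:
2^{0} = 1 < 3, \quad 2^{1} = 2 < 3, \quad 2^{2} = 4 \not< 3
\;\Rightarrow\; Q = 2, \qquad P = \min\{K - 1, Q\} = \min\{3, 2\} = 2
```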


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
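The three response-time formulas can be compared side by side with a small sketch. The function names are hypothetical, the slow-start idle time is passed in as a plain parameter (defaulting to zero) rather than modeled, and O = 40,000 bits assumes 1 Kbyte = 1000 bytes.

```python
def non_persistent(M: int, O: float, R: float, rtt: float, idle: float = 0.0) -> float:
    """(M+1) O/R + (M+1) 2 RTT + sum of idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * rtt + idle


def persistent(M: int, O: float, R: float, rtt: float, idle: float = 0.0) -> float:
    """(M+1) O/R + 3 RTT + sum of idle times."""
    return (M + 1) * O / R + 3 * rtt + idle


def parallel_non_persistent(M: int, X: int, O: float, R: float, rtt: float,
                            idle: float = 0.0) -> float:
    """(M+1) O/R + (M/X + 1) 2 RTT + sum of idle times; requires X to divide M."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * rtt + idle


# Illustrative inputs: M = 10 images, X = 5, O = 5 Kbytes (taken as 40,000 bits),
# R = 1 Mbps, RTT = 100 msec, idle times ignored.
M, X, O, R, RTT = 10, 5, 40_000, 1_000_000, 0.1
for name, t in [
    ("non-persistent", non_persistent(M, O, R, RTT)),
    ("persistent", persistent(M, O, R, RTT)),
    ("parallel non-persistent", parallel_non_persistent(M, X, O, R, RTT)),
]:
    print(f"{name}: {t:.2f} s")
```

With idle times left at zero, the three schemes differ only in the RTT term, since the transmission term (M+1)O/R is common to all of them.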


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


                                                                                                                                                                    HTTP Response time (in seconds)

[Figure: response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3 Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                                                                    • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • Transport services and protocols
                                                                                                                                                                    • Transport vs network layer
                                                                                                                                                                    • Transport-layer protocols
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • Multiplexingdemultiplexing
                                                                                                                                                                    • Multiplexingdemultiplexing
                                                                                                                                                                    • How demultiplexing works
                                                                                                                                                                    • Connectionless demultiplexing
                                                                                                                                                                    • Connectionless demux (cont)
                                                                                                                                                                    • Connection-oriented demux
                                                                                                                                                                    • Connection-oriented demux (cont)
                                                                                                                                                                    • Connection-oriented demux Threaded Web Server
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                    • UDP more
                                                                                                                                                                    • UDP checksum
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • Principles of Reliable data transfer
                                                                                                                                                                    • Reliable data transfer getting started
                                                                                                                                                                    • Reliable data transfer getting started
                                                                                                                                                                    • Incremental Improvements
                                                                                                                                                                    • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                    • Rdt20 channel with bit errors
                                                                                                                                                                    • rdt20 FSM specification
                                                                                                                                                                    • rdt20 operation with no errors
                                                                                                                                                                    • rdt20 error scenario
                                                                                                                                                                    • rdt20 has a fatal flaw
                                                                                                                                                                    • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                    • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                    • rdt21 discussion
                                                                                                                                                                    • rdt22 a NAK-free protocol
                                                                                                                                                                    • rdt22 sender receiver fragments
                                                                                                                                                                    • rdt30 channels with errors and loss
                                                                                                                                                                    • rdt30 sender
                                                                                                                                                                    • rdt30 in action
                                                                                                                                                                    • rdt30 in action
                                                                                                                                                                    • Performance of rdt30
                                                                                                                                                                    • rdt30 stop-and-wait operation
                                                                                                                                                                    • Pipelined protocols
                                                                                                                                                                    • Pipelined protocols
                                                                                                                                                                    • Pipelining increased utilization
                                                                                                                                                                    • Go-Back-N
                                                                                                                                                                    • GBN Sender
                                                                                                                                                                    • GBN sender extended FSM
                                                                                                                                                                    • GBN receiver extended FSM
                                                                                                                                                                    • More on receiver
                                                                                                                                                                    • GBN inaction
                                                                                                                                                                    • Selective Repeat
                                                                                                                                                                     • Selective repeat: sender, receiver windows
                                                                                                                                                                    • Selective repeat
                                                                                                                                                                    • Selective repeat in action
                                                                                                                                                                    • Selective repeat dilemma
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                     • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                    • More TCP Details
                                                                                                                                                                    • Even More TCP Details
                                                                                                                                                                    • TCP segment structure
                                                                                                                                                                     • TCP seq. #'s and ACKs
                                                                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                                                                    • Example RTT estimation
                                                                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • TCP reliable data transfer
                                                                                                                                                                    • TCP sender events
                                                                                                                                                                     • TCP sender (simplified)
                                                                                                                                                                    • TCP retransmission scenarios
                                                                                                                                                                    • TCP retransmission scenarios (more)
                                                                                                                                                                     • TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                    • More on Sender Policies
                                                                                                                                                                    • Fast Retransmit
                                                                                                                                                                    • Fast retransmit algorithm
                                                                                                                                                                     • TCP: GBN or Selective Repeat?
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • TCP Flow Control
                                                                                                                                                                    • TCP Flow Control
                                                                                                                                                                    • TCP segment structure
                                                                                                                                                                     • TCP Flow control: how it works
                                                                                                                                                                    • Technical Issue
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • TCP Connection Management
                                                                                                                                                                    • TCP Connection Management (cont)
                                                                                                                                                                    • TCP Connection Management (cont)
                                                                                                                                                                    • TCP Connection Management (cont)
                                                                                                                                                                    • TCP Connection Management (cont)
                                                                                                                                                                    • A few special cases
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • Principles of Congestion Control
                                                                                                                                                                     • Causes/costs of congestion: scenario 1
                                                                                                                                                                     • Causes/costs of congestion: scenario 2
                                                                                                                                                                     • Causes/costs of congestion: scenario 3
                                                                                                                                                                     • Causes/costs of congestion: scenario 3
                                                                                                                                                                    • Approaches towards congestion control
                                                                                                                                                                    • Case study ATM ABR congestion control
                                                                                                                                                                    • Case study ATM ABR congestion control
                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                    • TCP Congestion Control
                                                                                                                                                                    • TCP AIMD
                                                                                                                                                                    • TCP Slow Start
                                                                                                                                                                    • TCP Slow Start (more)
                                                                                                                                                                     • Summary: TCP Congestion Control
                                                                                                                                                                    • The Big Picture
                                                                                                                                                                    • TCP sender congestion control
                                                                                                                                                                    • TCP throughput
                                                                                                                                                                    • TCP Futures
                                                                                                                                                                    • TCP Fairness
                                                                                                                                                                     • Why is TCP fair?
                                                                                                                                                                    • Fairness (more)
                                                                                                                                                                    • TCP Latency Modeling
                                                                                                                                                                    • Fixed Congestion Window (W)
                                                                                                                                                                    • Fixed congestion window (1)
                                                                                                                                                                    • Fixed congestion window (2)
                                                                                                                                                                    • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                    • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                    • TCP Latency Modeling (3)
                                                                                                                                                                    • TCP Latency Modeling (4)
                                                                                                                                                                    • HTTP Modeling
                                                                                                                                                                    • Chapter 3 Summary

                                                                                                                                                                      Technical Issue

• Suppose RcvWindow = 0 and the receiver has already ACK'd ALL packets in its buffer.
• The sender does not transmit new packets until it hears RcvWindow > 0.
• The receiver never sends RcvWindow > 0, since it has no new ACKs to send to the sender.
• DEADLOCK!

Solution: TCP specs require the sender to continue sending packets with one data byte while RcvWindow = 0, just to keep receiving ACKs from the receiver. At some point the receiver's buffer will empty, and RcvWindow > 0 will be transmitted back to the sender.
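The deadlock and its fix can be illustrated with a toy, tick-based simulation. This is only a sketch under simplifying assumptions (no loss, no timers, an ACK for every segment, and the application draining the buffer at tick 3); the class and method names are hypothetical and not part of any TCP implementation.

```java
/** Toy model of the zero-window deadlock; all names here are illustrative. */
final class ZeroWindowProbeDemo {

    /** Receiver with a fixed-size buffer that the application drains later. */
    static final class Receiver {
        final int capacity;
        int buffered = 0;

        Receiver(int capacity) { this.capacity = capacity; }

        int rcvWindow() { return capacity - buffered; }

        /** Accepts as many bytes as fit; returns the window advertised in the ACK. */
        int receive(int bytes) {
            buffered += Math.min(bytes, rcvWindow());
            return rcvWindow();
        }

        /** Application read: frees buffer space but does not trigger an ACK. */
        void applicationReads(int bytes) { buffered -= Math.min(bytes, buffered); }
    }

    /** Runs the sender for the given number of ticks; returns bytes accepted by the receiver. */
    static int run(boolean probeWhenZero, int ticks) {
        Receiver rcv = new Receiver(4);
        int lastAdvertised = 4;   // window learned from the most recent ACK
        int delivered = 0;
        for (int t = 0; t < ticks; t++) {
            if (t == 3) rcv.applicationReads(4);   // buffer empties silently
            int toSend;
            if (lastAdvertised > 0) {
                toSend = lastAdvertised;           // normal flow-controlled send
            } else if (probeWhenZero) {
                toSend = 1;                        // one-byte probe keeps ACKs flowing
            } else {
                continue;                          // sender waits forever: deadlock
            }
            int before = rcv.buffered;
            lastAdvertised = rcv.receive(toSend);
            delivered += rcv.buffered - before;
        }
        return delivered;
    }
}
```

With these settings, `ZeroWindowProbeDemo.run(false, 6)` stalls at 4 delivered bytes, while `ZeroWindowProbeDemo.run(true, 6)` reaches 8, because the probe sent at tick 3 elicits an ACK that advertises the reopened window.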

                                                                                                                                                                      Note on UDP

UDP has no flow control.

UDP appends packets to the receiving socket's buffer. If the buffer is full, arriving packets are lost.
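A minimal sketch of this drop-when-full behaviour follows. It assumes a buffer with a fixed byte capacity; the class `UdpReceiveBufferDemo` and its methods are hypothetical and only model the idea, they are not an operating-system or `java.net` API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a UDP socket receive buffer with a fixed byte capacity (illustrative only). */
final class UdpReceiveBufferDemo {
    private final Deque<byte[]> queue = new ArrayDeque<>();
    private final int capacityBytes;
    private int usedBytes = 0;
    private int dropped = 0;

    UdpReceiveBufferDemo(int capacityBytes) { this.capacityBytes = capacityBytes; }

    /** Appends the datagram if it fits; otherwise drops it. The sender is never told. */
    boolean onDatagramArrival(byte[] datagram) {
        if (usedBytes + datagram.length > capacityBytes) {
            dropped++;
            return false;
        }
        queue.addLast(datagram);
        usedBytes += datagram.length;
        return true;
    }

    /** Application read: removes the oldest datagram, or returns null if the buffer is empty. */
    byte[] read() {
        byte[] datagram = queue.pollFirst();
        if (datagram != null) usedBytes -= datagram.length;
        return datagram;
    }

    int droppedCount() { return dropped; }
}
```

Unlike the TCP case, nothing here feeds back to the sender: a slow reader simply loses datagrams until it calls `read()` and frees space.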

                                                                                                                                                                      Chapter 3 outline

• 3.1 Transport-layer services
• 3.2 Multiplexing and demultiplexing
• 3.3 Connectionless transport: UDP
• 3.4 Principles of reliable data transfer
• 3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
• 3.6 Principles of congestion control
• 3.7 TCP congestion control

                                                                                                                                                                      TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
• initialize TCP variables:
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
• server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
• specifies client_isn, the initial seq. #
• no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
• ACKs received SYN
• allocates buffers
• replies with client_isn+1 in ACK field to signal synchronization
• specifies server_isn
• no application data
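The two socket calls above can be tried over the loopback interface. The following is a minimal sketch, not the course's own example: the class name `LoopbackConnectDemo` is hypothetical, port 0 asks the operating system for any free port, and both ends run in one thread, which relies on the OS completing the handshake for a connection that is still waiting in the listen queue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Opens and closes one TCP connection over loopback (illustrative sketch). */
final class LoopbackConnectDemo {

    /** Returns true if both ends report an established connection to each other. */
    static boolean connectOnce() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The welcome socket listens on an OS-chosen free port.
        try (ServerSocket welcomeSocket = new ServerSocket(0, 1, loopback)) {
            int port = welcomeSocket.getLocalPort();
            // The Socket constructor returns once the connection is established;
            // accept() then hands the server its per-connection socket.
            try (Socket clientSocket = new Socket(loopback, port);
                 Socket connectionSocket = welcomeSocket.accept()) {
                return clientSocket.isConnected()
                        && connectionSocket.isConnected()
                        && connectionSocket.getPort() == clientSocket.getLocalPort();
            }
            // Leaving the inner try block closes both sockets, which starts the FIN exchange.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The SYN, SYNACK, and ACK segments themselves are produced by the operating system's TCP stack; the application sees only the constructor and `accept()` returning.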

                                                                                                                                                                      TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and ack=server_isn+1
• allocates buffers
• can include application data
• SYN=0 signals that the connection is established
• server_isn+1 signals that the client is synchronized

[Figure: three-way handshake between client and server. Client: connection request (SYN=1, seq=client_isn). Server: connection granted (SYN=1, seq=server_isn, ack=client_isn+1). Client: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1).]
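The field values in the three handshake segments can be written out as a small sketch. The `Segment` class below is hypothetical and carries only the fields discussed in these slides, not a full TCP header; the one added fact is that TCP sequence numbers are 32 bits wide, so the +1 wraps around at 2^32.

```java
/** Computes the header fields of the three handshake segments (illustrative sketch). */
final class ThreeWayHandshakeDemo {

    /** Only the fields discussed in the slides; not a complete TCP header. */
    static final class Segment {
        final boolean syn;
        final boolean ackFlag;
        final long seq;
        final long ack;   // meaningful only when ackFlag is true

        Segment(boolean syn, boolean ackFlag, long seq, long ack) {
            this.syn = syn;
            this.ackFlag = ackFlag;
            this.seq = seq;
            this.ack = ack;
        }
    }

    /** Sequence numbers are 32-bit values, so incrementing wraps at 2^32. */
    static long plusOne(long n) { return (n + 1) & 0xFFFFFFFFL; }

    /** Returns the SYN, SYNACK, and final ACK segments, in that order. */
    static Segment[] handshake(long clientIsn, long serverIsn) {
        Segment syn    = new Segment(true,  false, clientIsn, 0);
        Segment synAck = new Segment(true,  true,  serverIsn, plusOne(clientIsn));
        Segment ack    = new Segment(false, true,  plusOne(clientIsn), plusOne(serverIsn));
        return new Segment[] { syn, synAck, ack };
    }
}
```

For example, `handshake(100, 5000)` yields a SYNACK with ack=101 and a final segment with SYN=0, seq=101, ack=5001.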

                                                                                                                                                                      TCP Connection Management (cont)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server.

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

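[Figure: connection close timeline between client and server. Client closes and sends FIN; server replies ACK, then closes and sends its own FIN; client replies ACK and enters timed wait before the connection is closed.]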

                                                                                                                                                                      TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.
• Enters "timed wait", during which it will respond with an ACK to any received FINs (which might arrive again if the ACK gets lost).
• Closes down after timed wait.

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

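[Figure: connection close timeline with states. Client sends FIN (closing); server replies ACK, then sends FIN (closing); client replies ACK and enters timed wait; both sides end in closed.]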

                                                                                                                                                                      TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]
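Since the lifecycle figures did not survive in this transcript, the client side can be sketched as a small state machine. This is an assumption-laden sketch: the state names follow the standard TCP state names (SYN_SENT, FIN_WAIT_1, and so on), which may differ from the labels in the original figure, and the class, enum, and event names are hypothetical. The server lifecycle and simultaneous close are not modelled.

```java
/** Client-side TCP lifecycle as a state machine (illustrative sketch). */
final class TcpClientLifecycleDemo {
    enum State { CLOSED, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT }
    enum Event { APP_OPEN, RCV_SYNACK, APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRED }

    /** Returns the next state, or throws if the event is not handled in this state. */
    static State next(State s, Event e) {
        switch (s) {
            case CLOSED:
                if (e == Event.APP_OPEN) return State.SYN_SENT;        // client sends SYN
                break;
            case SYN_SENT:
                if (e == Event.RCV_SYNACK) return State.ESTABLISHED;   // client sends ACK
                break;
            case ESTABLISHED:
                if (e == Event.APP_CLOSE) return State.FIN_WAIT_1;     // client sends FIN
                break;
            case FIN_WAIT_1:
                if (e == Event.RCV_ACK) return State.FIN_WAIT_2;       // server ACKed the FIN
                break;
            case FIN_WAIT_2:
                if (e == Event.RCV_FIN) return State.TIME_WAIT;        // client sends ACK
                break;
            case TIME_WAIT:
                if (e == Event.RCV_FIN) return State.TIME_WAIT;        // re-ACK a repeated FIN
                if (e == Event.TIMED_WAIT_EXPIRED) return State.CLOSED;
                break;
        }
        throw new IllegalStateException(e + " not handled in " + s);
    }

    /** Feeds a sequence of events to a client that starts in CLOSED. */
    static State run(Event... events) {
        State s = State.CLOSED;
        for (Event e : events) s = next(s, e);
        return s;
    }
}
```

Feeding the events APP_OPEN, RCV_SYNACK, APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRED in order walks the client from CLOSED back to CLOSED, matching the handshake and the four closing steps described earlier.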

                                                                                                                                                                      A few special cases

• We have not discussed what happens if both client and server decide to close down the connection at the same time.

• It is possible that the first ACK (from the server) and the second FIN (also from the server) are sent in the same segment.

                                                                                                                                                                      Chapter 3 outline

• 3.1 Transport-layer services
• 3.2 Multiplexing and demultiplexing
• 3.3 Connectionless transport: UDP
• 3.4 Principles of reliable data transfer
• 3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
• 3.6 Principles of congestion control
• 3.7 TCP congestion control

                                                                                                                                                                      Principles of Congestion Control

Congestion
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!

Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retransmissions) for given "goodput"
(c): unneeded retransmissions, link carries multiple copies of packet

[Figure: panels (a), (b), (c)]

Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when packet dropped, any upstream transmission capacity used for that packet was wasted!

                                                                                                                                                                      Approaches towards congestion control

                                                                                                                                                                      Two broad approaches towards congestion control

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
  single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  explicit rate sender should send at

                                                                                                                                                                      Case study ATM ABR congestion control

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
  NI bit: no increase in rate (mild congestion)
  CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
  small exception: see next page

ABR: available bit rate
"elastic service"
if sender's path "underloaded":
  sender should use available bandwidth
if sender's path congested:
  sender throttled to minimum guaranteed rate

                                                                                                                                                                      Case study ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
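
The ER and EFCI rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the RMCell class, its field names, and the two helper functions are hypothetical, and each switch is modelled as simply lowering ER to the rate it can support.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical model of a resource-management cell (names are illustrative)."""
    er: float          # explicit rate carried in the cell
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit


def forward_through_switches(cell, supportable_rates):
    """Each switch may lower ER to the rate it can support; it never raises it."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def return_from_destination(cell, preceding_data_cell_efci):
    """Destination sets CI if the data cell preceding the RM cell had EFCI = 1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


# Assumed example path: three switches supporting 80, 45 and 60 rate units.
cell = RMCell(er=100.0)
cell = forward_through_switches(cell, [80.0, 45.0, 60.0])
cell = return_from_destination(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)
```

The cell comes back carrying the minimum supportable rate on the path, which is the rate the sender should then use.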

TCP Congestion Control
end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
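
As a quick check of the throughput relation, here is a minimal Python sketch. The function name and the sample numbers (10 segments, 1000-byte MSS, 0.5 s RTT) are assumptions chosen only for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes each are sent every RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Assumed sample values: 10 segments of 1000 bytes per 0.5 s round trip.
print(window_throughput(10, 1000, 0.5))
```

Doubling w doubles the throughput, which is why adjusting CongWin is an effective way to control the sending rate.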

To simplify presentation we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase, Multiplicative Decrease
slow start = CongWin set to 1 MSS and then grows exponentially
conservative after timeout events
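
The sender-side limit can be expressed as a small helper. This is a sketch, assuming byte counters named after the slide's variables; the function name bytes_allowed is hypothetical.

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """Bytes the sender may still transmit while keeping
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


# 2000 bytes are unacknowledged, so a 4000-byte window leaves room for 2000 more.
print(bytes_allowed(last_byte_sent=5000, last_byte_acked=3000, cong_win=4000))
```

When CongWin shrinks after a loss event, the helper returns 0 until enough outstanding bytes are acknowledged.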

TCP AIMD
multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: sawtooth plot of congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
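
The sawtooth can be reproduced with a short simulation. This is a sketch under a strong assumption: a loss event is taken to occur whenever the window reaches a fixed size, whereas real losses depend on the network. The function name and the sample values are illustrative only.

```python
def aimd_trace(start, mss, loss_at, rounds):
    """Per-RTT congestion window under AIMD (all sizes in the same unit).

    Assumption for illustration: a loss event occurs whenever the window
    has reached loss_at.
    """
    window = start
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window >= loss_at:
            window = window / 2      # multiplicative decrease: halve
        else:
            window = window + mss    # additive increase: 1 MSS per RTT
    return trace


# Illustrative numbers: window in Kbytes, unrealistically large 4-Kbyte MSS.
print(aimd_trace(start=8, mss=4, loss_at=16, rounds=8))
```

The trace climbs linearly and drops by half at each assumed loss, giving the sawtooth shape of a long-lived connection.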

                                                                                                                                                                      TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
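
The 20 kbps figure, and how quickly doubling escapes it, can be checked with a few lines of Python. The function names and the 1 Mbps target are assumptions made for this sketch; loss is ignored.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Rate in bits per second when one MSS is sent per RTT."""
    return mss_bytes * 8 / rtt_seconds


def rtts_to_reach(target_bps, mss_bytes, rtt_seconds):
    """Number of doublings needed for the rate to reach target_bps, ignoring loss."""
    rate = initial_rate_bps(mss_bytes, rtt_seconds)
    rounds = 0
    while rate < target_bps:
        rate *= 2
        rounds += 1
    return rounds


print(initial_rate_bps(500, 0.2))          # about 20,000 bits/sec, i.e. 20 kbps
print(rtts_to_reach(1_000_000, 500, 0.2))  # doublings needed to pass 1 Mbps
```

With the slide's numbers, only six doublings (a little over a second at 200 msec per RTT) are needed to exceed the assumed 1 Mbps target.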

                                                                                                                                                                      TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments, one burst per RTT]
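
Why a per-ACK increment doubles the window each RTT can be seen in a minimal sketch. It assumes one ACK per segment and no loss; the function name is hypothetical and the window is counted in MSS units.

```python
def slow_start_rounds(rounds, mss=1):
    """Window (in MSS units) at the start of each RTT during slow start.

    Each ACK adds one MSS. A window of w segments produces w ACKs per RTT,
    so the window doubles every RTT. Loss is ignored in this sketch.
    """
    cong_win = mss
    history = []
    for _ in range(rounds):
        history.append(cong_win)
        acks_this_rtt = cong_win // mss   # one ACK per segment sent
        for _ in range(acks_this_rtt):
            cong_win += mss               # increment CongWin for every ACK
    return history


print(slow_start_rounds(4))
```

The first three values match the one, two, and four segments sent by Host A in successive round trips.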

So far:
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno):
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery

                                                                                                                                                                      Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)
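
The four rules can be traced round by round for both variants. This is a simplified sketch, not a faithful TCP implementation: the window is counted in whole MSS units, losses are assumed to be triple duplicate ACKs occurring at chosen round numbers, slow-start growth is capped at the threshold, and the initial threshold of 8 is an assumed value picked so the phase change is visible.

```python
def window_trace(rounds, loss_rounds, variant="reno", initial_threshold=8):
    """Per-RTT CongWin in MSS units, with triple-dup-ACK losses at given rounds.

    Assumptions for illustration: losses occur exactly at the round indices in
    loss_rounds, and slow start is capped so it stops at the threshold.
    """
    cong_win, threshold = 1, initial_threshold
    trace = []
    for r in range(rounds):
        trace.append(cong_win)
        if r in loss_rounds:
            threshold = max(cong_win // 2, 1)
            # Reno continues from the threshold; Tahoe restarts at 1 MSS.
            cong_win = threshold if variant == "reno" else 1
        elif cong_win < threshold:
            cong_win = min(cong_win * 2, threshold)  # slow start: exponential
        else:
            cong_win += 1                            # congestion avoidance: linear
    return trace


print(window_trace(12, {7}, variant="reno"))
print(window_trace(12, {7}, variant="tahoe"))
```

Both traces are identical up to the loss at a window of 12; afterwards Reno resumes linear growth from 6, while Tahoe climbs back from 1 through slow start.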

                                                                                                                                                                      The Big Picture

TCP sender congestion control

Event | State | TCP Sender Action | Commentary
------|-------|-------------------|-----------
ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start
Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
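
The event table maps naturally onto a small per-ACK state machine. The class below is a sketch, not the author's code: the class and method names are hypothetical, sizes are in bytes, the initial threshold is an assumed value, and two details are assumptions of this sketch (the third duplicate ACK triggers the loss event, and a new ACK resets the duplicate count).

```python
class RenoSender:
    """Sketch of the sender event table; all sizes are in bytes."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.cong_win = mss          # slow start begins at 1 MSS
        self.threshold = threshold   # assumed initial value
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0            # assumption: a new ACK resets the count
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
sender.on_new_ack()
print(sender.state, sender.cong_win)
```

Working in bytes keeps the congestion-avoidance update in the same form as the table, MSS * (MSS/CongWin) per ACK, which adds up to about one MSS per RTT.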

                                                                                                                                                                      TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
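
The 0.75 factor follows from averaging the sawtooth. As a short derivation, assuming the window grows linearly from W/2 back to W between loss events, the average is the midpoint of the two extremes:

```latex
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,\mathrm{RTT}} + \frac{W}{\mathrm{RTT}}\right)
  = \frac{3}{4}\cdot\frac{W}{\mathrm{RTT}}
  = 0.75\,\frac{W}{\mathrm{RTT}}
```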

                                                                                                                                                                      TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$\text{throughput} = \frac{1.22 \cdot \text{MSS}}{\text{RTT} \sqrt{L}}$

which requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
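
Both numbers in the example can be reproduced by rearranging the two throughput relations. The function names below are hypothetical; the arithmetic uses only the values on the slide.

```python
def required_window_segments(throughput_bps, mss_bytes, rtt_seconds):
    """Segments that must be in flight to sustain throughput_bps."""
    return throughput_bps * rtt_seconds / (mss_bytes * 8)


def tolerable_loss_rate(throughput_bps, mss_bytes, rtt_seconds):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_seconds * throughput_bps)) ** 2


w = required_window_segments(10e9, 1500, 0.1)
loss = tolerable_loss_rate(10e9, 1500, 0.1)
print(int(w), f"{loss:.1e}")
```

This gives W of about 83,333 segments and a loss rate of roughly 2·10^-10, that is, about one loss event per five billion segments.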

TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis running up to R, with the "equal bandwidth share" line. Annotations along the trajectory: congestion avoidance: additive increase; loss: decrease window by factor of 2]
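The convergence argument can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that the slides do not state: both sessions have the same RTT, both see every loss at the same time, and a loss occurs exactly when the combined rate exceeds the link capacity. The function name and parameters are hypothetical.

```python
def aimd(x1, x2, capacity, rounds=500, increase=1.0):
    """Simulate two AIMD flows sharing one bottleneck link.

    x1, x2   -- starting throughputs of the two sessions
    capacity -- bottleneck capacity R (same units as x1, x2)
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both sessions cut their rate by a factor of 2,
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both add the same amount,
            # so the gap between them is unchanged.
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


a, b = aimd(80.0, 10.0, capacity=100.0)
print(f"after 500 rounds: {a:.2f} vs {b:.2f}")
```

Additive increase leaves the difference between the two rates unchanged while each multiplicative decrease halves it, so the rates drift toward the equal-share line regardless of the starting point.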


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
  do not want rate throttled by congestion control
Instead use UDP
  pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
  new app asks for 1 TCP, gets rate R/10
  new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
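The parallel-connection example is simple enough to compute directly. The sketch below assumes every TCP connection on the link receives an equal share of R, as the fairness goal suggests; `app_share` is an illustrative name.

```python
from fractions import Fraction


def app_share(app_connections, existing_connections):
    """Fraction of the link rate R obtained by an app that opens
    `app_connections` connections next to `existing_connections` others,
    assuming every connection gets an equal share."""
    total = app_connections + existing_connections
    return Fraction(app_connections, total)


print(app_share(1, 9))   # share of R when asking for 1 connection
print(app_share(11, 9))  # share of R when asking for 11 connections
```

With 9 existing connections, one new connection yields 1/10 of R, while 11 new connections yield 11/20 of R, slightly more than half.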


TCP Latency Modeling

Notation, assumptions
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
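The two cases can be folded into one function. This is a minimal sketch, not part of the original slides; it assumes K is the number of windows needed to cover the object, taken here as ceil(O / (W·S)), and uses illustrative parameter names (bits for O and S, bits per second for R, seconds for RTT).

```python
import math


def fixed_window_latency(O, S, R, rtt, W):
    """Latency for an object of O bits sent with a fixed window of W segments.

    S is the segment size in bits, R the link rate in bits/s, rtt in seconds.
    """
    K = math.ceil(O / (W * S))        # assumption: windows that cover the object
    stall = S / R + rtt - W * S / R   # idle time after each window (case 2)
    latency = 2 * rtt + O / R         # connection setup + request + transmission
    if stall > 0:
        # Case 2: the server runs out of window before the first ACK returns,
        # and idles after each of the first K-1 windows.
        latency += (K - 1) * stall
    return latency


# Case 2: small window, 16 segments of 1000 bits over a 1 Mbps link.
print(fixed_window_latency(O=16_000, S=1_000, R=1e6, rtt=0.1, W=4))
# Case 1: window large enough that the server never stalls.
print(fixed_window_latency(O=16_000, S=1_000, R=1e6, rtt=0.1, W=200))
```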


Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case:
WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object
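The closed-form delay, Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R with P = min{Q, K-1}, translates directly into code. The sketch below takes K and Q as inputs rather than deriving them; the parameter names and the numeric values (a 20 kbps link with 1000-bit segments, chosen so that Q = 2) are assumptions for illustration.

```python
def slow_start_latency(O, S, R, rtt, K, Q):
    """Closed-form latency under slow start with no threshold and no loss.

    O, S in bits; R in bits/s; rtt in seconds.
    K: number of windows that cover the object.
    Q: number of idle periods if the object were of infinite size.
    """
    P = min(Q, K - 1)  # number of times the server actually idles
    return 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R


# 15 segments of 1000 bits, S/R = 50 ms, RTT = 100 ms, K = 4, Q = 2.
print(slow_start_latency(O=15_000, S=1_000, R=20_000, rtt=0.1, K=4, Q=2))
```

For these values the idle time after the first window is 100 ms and after the second is 50 ms, which the formula adds to 2RTT + O/R = 0.95 s.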


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times


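TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timing diagram with "time at client" and "time at server" axes. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered]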


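TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar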
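Both K and Q can be computed by walking through the doubling windows. The sketch below is one possible reading, not the slides' own method: K is found by accumulating window sizes S, 2S, 4S, ... until they cover O, and Q is taken to be the number of windows whose idle time S/R + RTT - 2^(k-1)·S/R is strictly positive. All function names are hypothetical.

```python
import math


def windows_to_cover(O, S):
    """K: smallest k with S + 2S + ... + 2^(k-1) S >= O."""
    k, sent = 0, 0
    while sent < O:
        sent += 2 ** k * S  # the (k+1)th window carries 2^k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)); floating-point, so shown only for comparison."""
    return math.ceil(math.log2(O / S + 1))


def idles_for_infinite_object(S, R, rtt):
    """Q: number of windows followed by a strictly positive idle time
    (assumed reading of 'number of idles for an infinite-size object')."""
    q = 0
    while S / R + rtt - 2 ** q * S / R > 0:  # idle time after window q+1
        q += 1
    return q


# Example values: O/S = 15 segments, S/R = 50 ms, RTT = 100 ms.
K = windows_to_cover(15_000, 1_000)
Q = idles_for_infinite_object(1_000, 20_000, 0.1)
print(K, Q, min(K - 1, Q))
```

With O/S = 15 this gives K = 4, and with RTT equal to twice S/R it gives Q = 2, matching the example of K = 4, Q = 2, P = 2.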


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
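The three response-time formulas can be compared numerically. This sketch deliberately leaves out the "sum of idle times" term, which differs between the schemes and depends on slow start, so its results are lower bounds and not the values plotted in the chart that follows. It uses RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5 from that chart and assumes 1 Kbyte = 1000 bytes; the function name is illustrative.

```python
def http_response_times(O, R, rtt, M, X):
    """Response times in seconds, ignoring slow-start idle times.

    O: object size in bits, R: link rate in bits/s, rtt in seconds,
    M: number of images, X: number of parallel connections (M/X integer).
    """
    transmit = (M + 1) * O / R  # time to push base page plus M images
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * rtt,
    }


O_bits = 5 * 1000 * 8  # 5 Kbytes, assuming 1 Kbyte = 1000 bytes
for rate in (28e3, 100e3, 1e6, 10e6):
    times = http_response_times(O_bits, rate, rtt=0.1, M=10, X=5)
    print(f"{rate / 1e3:>7.0f} Kbps", {k: round(v, 3) for k, v in times.items()})
```

At 28 Kbps the transmission term (about 15.7 s) dwarfs the RTT terms, so the three schemes differ only slightly; at higher rates the RTT terms account for most of the total.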


[Figure: HTTP response time (in seconds, axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent connections. RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


[Figure: HTTP response time (in seconds, axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent connections. RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5.]

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give important improvement, particularly in high delay-bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
    multiplexing, demultiplexing
    reliable data transfer
    flow control
    congestion control

instantiation and implementation in the Internet:
    UDP
    TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"



                                                                                                                                                                        Note on UDP

                                                                                                                                                                        UDP has no flow control

UDP appends packets to the receiving socket's buffer. If the buffer is full, packets are lost.
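The drop behaviour described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a real socket implementation: the class name and methods are hypothetical, and capacity is counted in datagrams, whereas a real receive buffer is sized in bytes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a UDP receive buffer: a bounded queue that drops arrivals
// when full. Names are hypothetical and chosen for illustration only.
class UdpReceiveBufferModel {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final int capacity; // maximum number of queued datagrams
    private int dropped = 0;

    UdpReceiveBufferModel(int capacity) {
        this.capacity = capacity;
    }

    /** Called when a datagram arrives from the network. */
    boolean onArrival(String datagram) {
        if (buffer.size() >= capacity) {
            dropped++;            // buffer full: the datagram is lost
            return false;         // no feedback reaches the sender
        }
        buffer.addLast(datagram); // otherwise append to the buffer
        return true;
    }

    /** Called when the application reads from the socket; null if empty. */
    String read() {
        return buffer.pollFirst();
    }

    int dropped() { return dropped; }

    int queued() { return buffer.size(); }

    public static void main(String[] args) {
        UdpReceiveBufferModel rcv = new UdpReceiveBufferModel(3);
        for (int i = 0; i < 5; i++) {
            rcv.onArrival("pkt" + i); // the sender is never slowed down
        }
        System.out.println("queued=" + rcv.queued() + " dropped=" + rcv.dropped());
    }
}
```

Because nothing tells the sender to slow down, the only way to avoid loss in this model is for the application to call `read()` at least as fast as datagrams arrive.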


                                                                                                                                                                        Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control


                                                                                                                                                                        TCP Connection Management

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
    specifies client_isn, the initial seq #
    No application data

Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    Allocates buffers
    Replies with client_isn+1 in ACK field to signal synchronization
    Specifies server_isn
    No application data

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)

client: connection initiator
    Socket clientSocket = new Socket(hostname, portNumber);

server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with a segment carrying SYN=0 and server_isn+1 in the ACK field
    allocates buffers
    can include application data
    SYN=0 signals that the connection is established
    server_isn+1 signals that the client is synchronized

Segment exchange between client and server:
    client -> server: connection request (SYN=1, seq=client_isn)
    server -> client: connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
    client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
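The numbering in the three handshake segments can be traced with a small sketch. The class HandshakeSketch, its Segment type and the example initial sequence numbers are hypothetical names and values chosen for illustration. The sketch only models the SYN flag and the seq/ack fields, and it ignores the wrap-around of real 32-bit sequence numbers.

```java
/** Hypothetical model of the handshake segments; a sketch, not a real TCP stack. */
final class HandshakeSketch {
    /** Only the fields the slides mention: SYN flag, seq number, ack number. */
    static final class Segment {
        final boolean syn;
        final long seq;
        final long ack;   // -1 stands for "no ACK field" in this sketch

        Segment(boolean syn, long seq, long ack) {
            this.syn = syn;
            this.seq = seq;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " seq=" + seq + (ack >= 0 ? " ack=" + ack : "");
        }
    }

    /** Step 1: client sends SYN=1 with seq=client_isn. */
    static Segment connectionRequest(long clientIsn) {
        return new Segment(true, clientIsn, -1);
    }

    /** Step 2: server sends SYN=1 with seq=server_isn and ack=client_isn+1. */
    static Segment connectionGranted(Segment request, long serverIsn) {
        return new Segment(true, serverIsn, request.seq + 1);
    }

    /** Step 3: client sends SYN=0 with seq=client_isn+1 and ack=server_isn+1. */
    static Segment finalAck(Segment granted) {
        return new Segment(false, granted.ack, granted.seq + 1);
    }

    public static void main(String[] args) {
        Segment request = connectionRequest(1000);            // example ISNs, chosen arbitrarily
        Segment granted = connectionGranted(request, 5000);
        System.out.println(request);
        System.out.println(granted);
        System.out.println(finalAck(granted));
    }
}
```

Each reply derives its ack value from the seq value of the segment it answers, which is how both sides confirm that they agree on the starting sequence numbers.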

TCP Connection Management (cont.)

Closing a connection

client closes socket:
    clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK; closes connection, sends FIN

Figure: client/server timeline (client close and FIN, server ACK, server close and FIN, client ACK; client in timed wait, then closed)

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
    enters "timed wait", during which it responds with an ACK to received FINs (which might arrive if the ACK gets lost)
    closes down after timed wait
Step 4: server receives ACK; connection closed

Note: with a small modification, this can handle simultaneous FINs

Figure: client/server timeline (FIN, ACK, FIN, ACK; both sides closing, client in timed wait, both sides closed)
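The client side of the four closing steps can be read as a small state machine. The sketch below is an illustration under assumptions: the class, enum and event names are hypothetical, the state names follow the usual TCP state diagram rather than these slides, and the simultaneous-FIN case mentioned in the note is deliberately left out.

```java
/** Hypothetical sketch of the client side of the close sequence (no simultaneous close). */
final class ClientCloseSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED }
    enum Event { APP_CLOSE, RECV_ACK, RECV_FIN, TIMER_EXPIRED }

    /** Returns the next state; an event that does not apply leaves the state unchanged. */
    static State next(State s, Event e) {
        switch (s) {
            case ESTABLISHED:   // Step 1: application closes the socket, client sends FIN
                return e == Event.APP_CLOSE ? State.FIN_WAIT_1 : s;
            case FIN_WAIT_1:    // Step 2: server ACKs the client's FIN
                return e == Event.RECV_ACK ? State.FIN_WAIT_2 : s;
            case FIN_WAIT_2:    // Step 3: server's FIN arrives, client replies with ACK
                return e == Event.RECV_FIN ? State.TIME_WAIT : s;
            case TIME_WAIT:     // a repeated FIN is ACKed again; only the timer ends the wait
                return e == Event.TIMER_EXPIRED ? State.CLOSED : s;
            default:
                return s;
        }
    }

    public static void main(String[] args) {
        State s = State.ESTABLISHED;
        Event[] trace = { Event.APP_CLOSE, Event.RECV_ACK, Event.RECV_FIN, Event.TIMER_EXPIRED };
        for (Event e : trace) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```

Staying in TIME_WAIT when another FIN arrives mirrors the slide's point that the client must still be able to re-send its ACK if the first one was lost.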

TCP Connection Management (cont.)

Figure: example TCP server lifecycle
Figure: example TCP client lifecycle

A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time
It is possible for the first ACK (from server) and the second FIN (also from server) to be sent in the same segment


Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for the network to handle"
    different from flow control
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always λ_in = λ_out (goodput)
(a) "magic" transmission: only send when there is space in the buffer
(b) "perfect" retransmission, only when loss: λ'_in > λ_out
(c) retransmission of delayed (not lost) packet makes λ'_in larger (than in the perfect case) for the same λ_out

Here λ_in is the rate of original data, λ'_in is the offered load (original data plus retransmissions), and λ_out is the goodput.

"costs" of congestion:
    (b) and (c): more work (retransmissions) for a given "goodput"
    (c): unneeded retransmissions, so the link carries multiple copies of a packet

Figure: three plots, panels (a), (b) and (c)

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as λ_in and λ'_in increase?

Causes/costs of congestion: scenario 3 (cont.)

Another "cost" of congestion:
    when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
    "elastic service"
    if sender's path is "underloaded": sender should use available bandwidth
    if sender's path is congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

Case study: ATM ABR congestion control (cont.)

two-byte ER (explicit rate) field in RM cell
    congested switch may lower ER value in cell
    sender's send rate is thus the minimum supportable rate on the path
EFCI bit in data cells: set to 1 by a congested switch
    signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
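The effect of the ER field can be illustrated with a short sketch. It assumes that each switch only ever lowers the field to the rate it can currently support. The class and method names, and the use of plain integers for rates, are hypothetical simplifications and not the ATM cell format.

```java
/** Hypothetical sketch: the ER value carried by an RM cell after crossing a path of switches. */
final class AbrExplicitRateSketch {
    /**
     * @param initialEr              rate requested by the sender (arbitrary units)
     * @param switchSupportableRates rate each switch on the path can support, in path order
     * @return the ER value returned to the sender: the minimum along the path
     */
    static int erAfterPath(int initialEr, int[] switchSupportableRates) {
        int er = initialEr;
        for (int rate : switchSupportableRates) {
            er = Math.min(er, rate);   // a switch may lower ER, never raise it
        }
        return er;
    }

    public static void main(String[] args) {
        System.out.println(erAfterPath(1000, new int[] {800, 300, 600}));
    }
}
```

Because every switch applies the same "lower only" rule, the value that comes back is set by the most constrained switch, wherever it sits on the path.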


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

Figure: window of CongWin segments

w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT bytes/sec
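As a quick check of the throughput formula, the sketch below evaluates it for given values. The class and method names are hypothetical. Taking the RTT in milliseconds is a choice made here to keep the arithmetic simple.

```java
/** Hypothetical helper that evaluates throughput = w * MSS / RTT. */
final class WindowThroughputSketch {
    /** Throughput in bytes per second for w segments of mssBytes each, sent once per RTT. */
    static double bytesPerSecond(int w, int mssBytes, int rttMillis) {
        return w * mssBytes * 1000.0 / rttMillis;
    }

    /** The same rate expressed in bits per second. */
    static double bitsPerSecond(int w, int mssBytes, int rttMillis) {
        return 8 * bytesPerSecond(w, mssBytes, rttMillis);
    }

    public static void main(String[] args) {
        System.out.println(bytesPerSecond(10, 500, 200) + " bytes/sec");
        System.out.println(bitsPerSecond(1, 500, 200) + " bits/sec");
    }
}
```

With w = 10, MSS = 500 bytes and RTT = 200 msec the formula gives 25000 bytes/sec. With w = 1 it gives 2500 bytes/sec, which is 20 kbps.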

To simplify the presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using
    LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 MSS and then grows exponentially
    conservative after timeout events
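The sending limit and the loss-event definition above can be written as two small checks. This is a sketch with hypothetical class and method names. It follows the slide's simplifying assumption that RcvBuffer never overflows, so the receiver's window plays no role.

```java
/** Hypothetical sketch of the sender-side checks described above. */
final class CongWinLimitSketch {
    /** Bytes the sender may still transmit while keeping LastByteSent - LastByteAcked <= CongWin. */
    static long usableWindow(long lastByteSent, long lastByteAcked, long congWin) {
        long inFlight = lastByteSent - lastByteAcked;   // sent but not yet acknowledged
        return Math.max(0L, congWin - inFlight);
    }

    /** A loss event is a timeout or three duplicate ACKs. */
    static boolean isLossEvent(boolean timeout, int duplicateAcks) {
        return timeout || duplicateAcks >= 3;
    }

    public static void main(String[] args) {
        System.out.println(usableWindow(3000, 1000, 4000));
        System.out.println(isLossEvent(false, 3));
    }
}
```

For example, with 2000 bytes unacknowledged and CongWin = 4000 bytes, the sender may transmit 2000 more bytes before it has to wait for ACKs.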

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection
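The two AIMD rules can be sketched as a pair of update functions. The names are hypothetical. The floor of 1 MSS on the decrease is an assumption of this sketch and is not stated on the slide.

```java
/** Hypothetical sketch of the AIMD update rules; window sizes are in bytes. */
final class AimdSketch {
    /** Additive increase: grow CongWin by 1 MSS after an RTT without loss events. */
    static int onLossFreeRtt(int congWin, int mss) {
        return congWin + mss;
    }

    /** Multiplicative decrease: halve CongWin after a loss event (never below 1 MSS, by assumption). */
    static int onLossEvent(int congWin, int mss) {
        return Math.max(mss, congWin / 2);
    }

    public static void main(String[] args) {
        int mss = 1000;        // example values, chosen arbitrarily
        int congWin = 8000;
        for (int rtt = 1; rtt <= 8; rtt++) {
            congWin = onLossFreeRtt(congWin, mss);
        }
        System.out.println("after 8 loss-free RTTs: " + congWin);
        System.out.println("after a loss event: " + onLossEvent(congWin, mss));
    }
}
```

Repeating this cycle of slow linear growth followed by an abrupt halving produces the sawtooth shape of the congestion window for a long-lived connection.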

                                                                                                                                                                        3 Transport Layer 106Comp 361 Spring 2005

                                                                                                                                                                        TCP Slow Start

When connection begins, CongWin = 1 MSS

Example: MSS = 500 bytes and RTT = 200 msec, so initial rate = MSS/RTT = 20 kbps

Available bandwidth may be >> MSS/RTT

desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event
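The 20 kbps figure in the example follows from sending one MSS per round trip. A minimal sketch of that arithmetic, assuming a hypothetical helper name `initial_rate_bps` (not part of the slides):

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Sending rate when CongWin = 1 MSS: one segment per round trip."""
    return mss_bytes * 8 / rtt_seconds


# Values from the slide: MSS = 500 bytes, RTT = 200 msec.
rate = initial_rate_bps(mss_bytes=500, rtt_seconds=0.2)
print(f"{rate / 1000:.0f} kbps")
```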


                                                                                                                                                                        TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:

double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment to Host B, then two segments, then four segments, one burst per RTT]
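Why a per-ACK increment doubles the window each RTT can be checked with a small simulation. This is a sketch only: it assumes no loss, no threshold, and that every segment sent in one RTT is ACKed individually before the next RTT starts. The function name `slow_start_windows` is illustrative.

```python
def slow_start_windows(rtts, initial_cwnd=1):
    """Return CongWin (in segments) at the start of each RTT."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        acks = cwnd              # one ACK comes back per segment sent this RTT
        for _ in range(acks):
            cwnd += 1            # CongWin grows by 1 MSS for every ACK
    return history


print(slow_start_windows(4))     # [1, 2, 4, 8]
```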


So Far
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno)
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


Reason for treating 3 dup ACKs differently from a timeout: 3 dup ACKs indicate that the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol TCP Tahoe treats both types of loss events the same and always goes to slow start with CongWin = 1 MSS after a loss event

TCP Reno's skipping of slow start after a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
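The four rules above can be traced at RTT granularity to compare Reno with Tahoe. The sketch below is illustrative only: it works in whole MSS units, caps slow-start growth at Threshold, and takes loss events from a hypothetical `loss_events` mapping (RTT index to `"3dup"` or `"timeout"`). None of these names come from the slides.

```python
def trace(rtts, threshold, loss_events, reno=True):
    """Per-RTT CongWin trace in units of MSS."""
    cwnd = 1
    out = []
    for t in range(rtts):
        out.append(cwnd)
        event = loss_events.get(t)
        if event is not None:
            threshold = max(cwnd // 2, 1)      # Threshold = CongWin/2
            if event == "3dup" and reno:
                cwnd = threshold               # Reno: continue from Threshold
            else:
                cwnd = 1                       # timeout, or any loss in Tahoe
        elif cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)    # slow start (assumed capped)
        else:
            cwnd += 1                          # congestion avoidance
    return out


print(trace(12, 8, {7: "3dup"}, reno=True))    # [1, 2, 4, 8, 9, 10, 11, 12, 6, 7, 8, 9]
print(trace(12, 8, {7: "3dup"}, reno=False))   # [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 6]
```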


                                                                                                                                                                        The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2, CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2, CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
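The event table maps naturally onto a small state machine. The following is a sketch under stated assumptions, not an actual TCP implementation: the class name `RenoSender`, its method names, and the default values (500-byte MSS, 64,000-byte initial Threshold) are all illustrative choices, and retransmission and timers are left out.

```python
from dataclasses import dataclass

SLOW_START = "Slow Start"
CONGESTION_AVOIDANCE = "Congestion Avoidance"


@dataclass
class RenoSender:
    mss: int = 500                # bytes; illustrative value
    cong_win: float = 500.0       # starts at 1 MSS
    threshold: float = 64_000.0   # "initially very large"; value is an assumption
    state: str = SLOW_START
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == SLOW_START:
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = CONGESTION_AVOIDANCE
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Count the duplicate; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            # CongWin will not drop below 1 MSS.
            self.cong_win = max(self.threshold, self.mss)
            self.state = CONGESTION_AVOIDANCE
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = SLOW_START
        self.dup_acks = 0


sender = RenoSender(mss=500, cong_win=500.0, threshold=2000.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.cong_win, sender.state)    # 2500.0 Congestion Avoidance
```

Keeping one method per table row makes it easy to check each action against the table.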


                                                                                                                                                                        TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to (W/2)/RTT
Average throughput: 0.75 W/RTT
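The 0.75 factor is the mean of a window that climbs linearly from W/2 to W between losses. A quick numerical check, assuming an even W and one window sample per RTT (the helper name `average_window` is illustrative):

```python
def average_window(w_max):
    """Mean window over one sawtooth cycle: W/2, W/2 + 1, ..., W (segments)."""
    samples = list(range(w_max // 2, w_max + 1))
    return sum(samples) / len(samples)


W = 100
print(average_window(W) / W)    # 0.75
```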


                                                                                                                                                                        TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

→ L = 2·10^-10 Wow!
New versions of TCP for high-speed needed
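Both numbers in this example can be reproduced from the slide's values. The sketch below inverts the loss-rate throughput formula for L; the function names are illustrative.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight: throughput * RTT / segment size."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(int(w), f"{loss:.1e}")    # 83333 2.1e-10
```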


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
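The convergence argument can be checked numerically. This is a deliberately simplified sketch: it assumes both sessions see a loss in the same round whenever their combined rate exceeds the link capacity, which real connections need not do. The function name and the starting rates are illustrative.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Synchronized AIMD for two flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # loss: both halve, so the gap halves too
        else:
            x1, x2 = x1 + 1, x2 + 1    # additive increase: the gap is unchanged
    return x1, x2


x1, x2 = aimd_two_flows(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(abs(x1 - x2) < 0.01)    # True: the rates have converged
```

Additive increase moves both rates along a slope-1 line, and each halving moves the point toward the equal-share line, so the difference between the two rates shrinks with every loss.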


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
do not want rate throttled by congestion control
Instead use UDP:
pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
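The two shares in the example follow from assuming every TCP connection on the link gets an equal share. A short sketch with exact fractions (the function name `app_share` is illustrative):

```python
from fractions import Fraction


def app_share(existing, new):
    """Fraction of link rate R obtained by an app opening `new` connections,
    assuming each TCP connection gets an equal share of the bottleneck."""
    return Fraction(new, existing + new)


print(app_share(9, 1))     # 1/10
print(app_share(9, 11))    # 11/20
```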


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
(K is the number of windows that cover the object)
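The two cases can be folded into one function. This sketch assumes K = ceil(O / (W·S)) for "the number of windows that cover the object"; the slides do not spell that formula out, and the function name and sample values are illustrative.

```python
from math import ceil


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for a fixed window of W segments (O, S in bits; R in bits/s)."""
    K = ceil(O / (W * S))                # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R      # idle time per window when the ACK is late
    latency = 2 * RTT + O / R
    if stall > 0:                        # case 2: server waits after each window
        latency += (K - 1) * stall
    return latency


# S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))    # 15.0
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))    # 12.0
```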


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server: $P = \min\{Q, K-1\}$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram, time at client versus time at server: initiate TCP connection (one RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
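The example values K = 4 and P = 2 can be reproduced by summing the delay components directly. The sketch below assumes the k-th window holds 2^(k-1) segments, and that the server idles after window k whenever S/R + RTT exceeds that window's transmission time. The function name and the sample rates (S/R = 1 s, RTT = 2 s) are illustrative choices that give Q = 2.

```python
from math import ceil


def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, P) for the no-loss, no-threshold slow-start model."""
    segments = ceil(O / S)
    K = 1
    while 2 ** K - 1 < segments:      # the first K windows carry 2^K - 1 segments
        K += 1
    # Idle time after window k, for k = 1 .. K-1 (negative means no idling).
    stalls = [S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, K)]
    P = sum(1 for stall in stalls if stall > 0)
    latency = 2 * RTT + O / R + sum(stall for stall in stalls if stall > 0)
    return latency, K, P


# O/S = 15 segments, as in the slide example.
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2))    # (22.0, 4, 2)
```

With these values the closed form gives the same result: 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R = 4 + 15 + 6 - 3 = 22.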


                                                                                                                                                                        TCP Latency Modeling (3)

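$S/R + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1} S/R$ = time to transmit the kth window

$\left[ S/R + RTT - 2^{k-1} S/R \right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right] \\
&= \frac{O}{R} + 2RTT + P\left[ RTT + \frac{S}{R} \right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timing diagram with time at client and time at server. The client initiates the TCP connection and requests the object; the server sends the first window (S/R), second window (2S/R), third window (4S/R), and fourth window (8S/R), then completes transmission and the object is delivered.]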
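As a quick numerical check of the delay expression, the following Python sketch evaluates it two ways: by summing the per-window idle times, and by the closed form. The function names and the sample numbers are illustrative assumptions, not values from the slides. The closed form is only expected to match the sum when the first P idle times are all positive.

```python
def idle_time(k, S, R, RTT):
    """Server idle time after the k-th window (k = 1, 2, ...), clipped at zero."""
    return max(S / R + RTT - 2 ** (k - 1) * S / R, 0.0)


def latency_by_sum(O, S, R, RTT, P):
    """delay = O/R + 2*RTT + sum of the first P idle times."""
    return O / R + 2 * RTT + sum(idle_time(k, S, R, RTT) for k in range(1, P + 1))


def latency_closed_form(O, S, R, RTT, P):
    """delay = O/R + 2*RTT + P*(RTT + S/R) - (2^P - 1)*S/R."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Assumed sample values: S = 1024 bits, R = 8192 bps (so S/R = 0.125 s),
# RTT = 0.5 s, O = 15 segments = 15360 bits.
# The object needs K = 4 windows (1 + 2 + 4 + 8 segments), and the server
# idles after windows 1, 2 and 3, so P = 3.
print(latency_by_sum(15360, 1024, 8192, 0.5, 3))       # 3.875
print(latency_closed_form(15360, 1024, 8192, 0.5, 3))  # 3.875
```

The sample values are powers of two so that the arithmetic is exact in floating point; with other values the two results may differ by rounding error.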


TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

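$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2(O/S + 1) \right\rceil
\end{aligned}
$$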

Calculation of Q, the number of idles for an infinite-size object, is similar.
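The calculation of K can be sanity-checked with a short Python sketch that compares the logarithmic expression with a direct count of windows of 1, 2, 4, ... segments. The function names and test values are hypothetical, chosen only for illustration.

```python
import math


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)) for object size O and segment size S (same units)."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_counting(O, S):
    """Smallest k such that (2^0 + 2^1 + ... + 2^(k-1)) * S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S  # the (k+1)-th window carries 2^k segments
        k += 1
    return k


# Assumed example: S = 1024 bits.
print(windows_closed_form(15360, 1024), windows_by_counting(15360, 1024))  # 4 4
print(windows_closed_form(15361, 1024), windows_by_counting(15361, 1024))  # 5 5
```

The counting version uses only integer arithmetic, so it may be the safer choice when O/S + 1 lands on or very near a power of two and rounding inside the logarithm could matter.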


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
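To make the three response-time formulas concrete, here is a small Python sketch that evaluates their transmission and RTT terms. It deliberately leaves out the "sum of idle times" term, which depends on slow start and on a segment size S not stated on this slide. The function name is hypothetical, and the conversion 5 Kbytes = 40000 bits is an assumption.

```python
def http_response_times(O, R, RTT, M, X):
    """Transmission + RTT terms of the three models (idle times omitted).

    O: object size in bits, R: link rate in bits/sec, RTT: seconds,
    M: number of images, X: number of parallel connections (M/X an integer).
    """
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }


# Assumed inputs: O = 5 Kbytes taken as 40000 bits, R = 1 Mbps,
# RTT = 100 msec, M = 10, X = 5.
for name, t in http_response_times(40000, 1e6, 0.1, 10, 5).items():
    print(f"{name}: {t:.2f} s")
# non-persistent: 2.64 s
# persistent: 0.74 s
# parallel non-persistent: 1.04 s
```

Because the idle-time term is dropped, these numbers are lower bounds on what the full model would give.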


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (y-axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (y-axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.

Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                        • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • Transport services and protocols
                                                                                                                                                                        • Transport vs network layer
                                                                                                                                                                        • Transport-layer protocols
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • Multiplexingdemultiplexing
                                                                                                                                                                        • Multiplexingdemultiplexing
                                                                                                                                                                        • How demultiplexing works
                                                                                                                                                                        • Connectionless demultiplexing
                                                                                                                                                                        • Connectionless demux (cont)
                                                                                                                                                                        • Connection-oriented demux
                                                                                                                                                                        • Connection-oriented demux (cont)
                                                                                                                                                                        • Connection-oriented demux Threaded Web Server
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                        • UDP more
                                                                                                                                                                        • UDP checksum
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • Principles of Reliable data transfer
                                                                                                                                                                        • Reliable data transfer getting started
                                                                                                                                                                        • Reliable data transfer getting started
                                                                                                                                                                        • Incremental Improvements
                                                                                                                                                                        • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                        • Rdt20 channel with bit errors
                                                                                                                                                                        • rdt20 FSM specification
                                                                                                                                                                        • rdt20 operation with no errors
                                                                                                                                                                        • rdt20 error scenario
                                                                                                                                                                        • rdt20 has a fatal flaw
                                                                                                                                                                        • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                        • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                        • rdt21 discussion
                                                                                                                                                                        • rdt22 a NAK-free protocol
                                                                                                                                                                        • rdt22 sender receiver fragments
                                                                                                                                                                        • rdt30 channels with errors and loss
                                                                                                                                                                        • rdt30 sender
                                                                                                                                                                        • rdt30 in action
                                                                                                                                                                        • rdt30 in action
                                                                                                                                                                        • Performance of rdt30
                                                                                                                                                                        • rdt30 stop-and-wait operation
                                                                                                                                                                        • Pipelined protocols
                                                                                                                                                                        • Pipelined protocols
                                                                                                                                                                        • Pipelining increased utilization
                                                                                                                                                                        • Go-Back-N
                                                                                                                                                                        • GBN Sender
                                                                                                                                                                        • GBN sender extended FSM
                                                                                                                                                                        • GBN receiver extended FSM
                                                                                                                                                                        • More on receiver
                                                                                                                                                                        • GBN inaction
                                                                                                                                                                        • Selective Repeat
                                                                                                                                                                        • Selective repeat sender receiver windows
                                                                                                                                                                        • Selective repeat
                                                                                                                                                                        • Selective repeat in action
                                                                                                                                                                        • Selective repeat dilemma
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                        • More TCP Details
                                                                                                                                                                        • Even More TCP Details
                                                                                                                                                                        • TCP segment structure
                                                                                                                                                                        • TCP seq rsquos and ACKs
                                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                                        • Example RTT estimation
                                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • TCP reliable data transfer
                                                                                                                                                                        • TCP sender events
                                                                                                                                                                         • TCP sender (simplified)
                                                                                                                                                                        • TCP retransmission scenarios
                                                                                                                                                                        • TCP retransmission scenarios (more)
                                                                                                                                                                        • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                        • More on Sender Policies
                                                                                                                                                                        • Fast Retransmit
                                                                                                                                                                        • Fast retransmit algorithm
                                                                                                                                                                        • TCP GBN or Selective Repeat
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • TCP Flow Control
                                                                                                                                                                        • TCP Flow Control
                                                                                                                                                                        • TCP segment structure
                                                                                                                                                                        • TCP Flow control how it works
                                                                                                                                                                        • Technical Issue
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • TCP Connection Management
                                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                                        • A few special cases
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • Principles of Congestion Control
                                                                                                                                                                         • Causes/costs of congestion: scenario 1
                                                                                                                                                                         • Causes/costs of congestion: scenario 2
                                                                                                                                                                         • Causes/costs of congestion: scenario 3
                                                                                                                                                                         • Causes/costs of congestion: scenario 3
                                                                                                                                                                        • Approaches towards congestion control
                                                                                                                                                                        • Case study ATM ABR congestion control
                                                                                                                                                                        • Case study ATM ABR congestion control
                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                        • TCP Congestion Control
                                                                                                                                                                        • TCP AIMD
                                                                                                                                                                        • TCP Slow Start
                                                                                                                                                                        • TCP Slow Start (more)
                                                                                                                                                                        • Summary TCP Congestion Control
                                                                                                                                                                        • The Big Picture
                                                                                                                                                                        • TCP sender congestion control
                                                                                                                                                                        • TCP throughput
                                                                                                                                                                        • TCP Futures
                                                                                                                                                                        • TCP Fairness
                                                                                                                                                                        • Why is TCP fair
                                                                                                                                                                        • Fairness (more)
                                                                                                                                                                        • TCP Latency Modeling
                                                                                                                                                                        • Fixed Congestion Window (W)
                                                                                                                                                                        • Fixed congestion window (1)
                                                                                                                                                                        • Fixed congestion window (2)
                                                                                                                                                                        • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                        • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                        • TCP Latency Modeling (3)
                                                                                                                                                                        • TCP Latency Modeling (4)
                                                                                                                                                                        • HTTP Modeling
                                                                                                                                                                        • Chapter 3 Summary

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Connection Management

Recall: TCP sender and receiver establish a "connection" before exchanging data segments
  initialize TCP variables:
    seq. #s
    buffers, flow control info (e.g. RcvWindow)
  client: connection initiator
    Socket clientSocket = new Socket(hostname, port);
  server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three-way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq. #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data

TCP Connection Management (cont.)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: handshake timeline]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
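
The three handshake steps can be traced with a small model of the header fields involved. This is only a sketch for following the seq/ack arithmetic, not how any real TCP stack is written: the class, field, and method names are hypothetical, the initial sequence numbers are arbitrary example values, and no packets are sent.

```java
// Illustrative model of the three segments of the TCP three-way handshake.
// Class and field names are hypothetical; no real networking is performed.
final class HandshakeSketch {

    /** The header fields discussed on the slides. */
    static final class Segment {
        final boolean syn;      // SYN flag
        final boolean ackFlag;  // true when the ack field is meaningful
        final long seq;         // sequence number
        final long ack;         // acknowledgment number (next byte expected)

        Segment(boolean syn, boolean ackFlag, long seq, long ack) {
            this.syn = syn;
            this.ackFlag = ackFlag;
            this.seq = seq;
            this.ack = ack;
        }

        @Override
        public String toString() {
            return "SYN=" + (syn ? 1 : 0) + " seq=" + seq
                    + (ackFlag ? " ack=" + ack : "");
        }
    }

    // Sequence numbers are 32 bits wide, so arithmetic wraps modulo 2^32.
    static long next(long n) {
        return (n + 1) & 0xFFFFFFFFL;
    }

    /** Step 1: client -> server, connection request. */
    static Segment syn(long clientIsn) {
        return new Segment(true, false, clientIsn, 0);
    }

    /** Step 2: server -> client, connection granted. */
    static Segment synAck(Segment syn, long serverIsn) {
        return new Segment(true, true, serverIsn, next(syn.seq));
    }

    /** Step 3: client -> server, final ACK (SYN=0). */
    static Segment finalAck(Segment syn, Segment synAck) {
        return new Segment(false, true, next(syn.seq), next(synAck.seq));
    }

    public static void main(String[] args) {
        Segment s1 = syn(1000);          // arbitrary example ISNs
        Segment s2 = synAck(s1, 5000);
        Segment s3 = finalAck(s1, s2);
        System.out.println("client -> server: " + s1);
        System.out.println("server -> client: " + s2);
        System.out.println("client -> server: " + s3);
    }
}
```

Each side acknowledges the other's ISN plus one because the SYN itself consumes one sequence number, even though it carries no application data.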

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: teardown timeline. client -> server: FIN; server -> client: ACK; server -> client: FIN; client -> server: ACK. Labels: close (client), close (server), timed wait, closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK
  enters "timed wait", during which it will respond with an ACK to received FINs (which might arrive if the ACK gets lost)
  closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, can handle simultaneous FINs.

[Figure: teardown timeline. client -> server: FIN; server -> client: ACK; server -> client: FIN; client -> server: ACK. Labels: closing (client), closing (server), timed wait, closed (client), closed (server).]

TCP Connection Management (cont.)

Example: TCP server lifecycle

Example: TCP client lifecycle
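
The lifecycle figures did not survive extraction, so here is a rough sketch of the client side as a finite-state machine, covering the handshake and the four teardown steps described above. The state names are the standard ones from RFC 793. The enum, event, and method names are hypothetical, and only the normal-case transitions are modelled (no resets, no simultaneous close).

```java
// Sketch of the client-side connection lifecycle as a finite-state machine.
// State names follow RFC 793; the enum and event names are hypothetical.
final class ClientLifecycleSketch {

    enum State { CLOSED, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT }

    enum Event { APP_OPEN, RCV_SYNACK, APP_CLOSE, RCV_ACK, RCV_FIN, TIMED_WAIT_EXPIRES }

    /** Returns the next state, or throws if the event is not handled in this sketch. */
    static State step(State s, Event e) {
        switch (s) {
            case CLOSED:
                if (e == Event.APP_OPEN) return State.SYN_SENT;      // send SYN
                break;
            case SYN_SENT:
                if (e == Event.RCV_SYNACK) return State.ESTABLISHED; // send ACK
                break;
            case ESTABLISHED:
                if (e == Event.APP_CLOSE) return State.FIN_WAIT_1;   // send FIN
                break;
            case FIN_WAIT_1:
                if (e == Event.RCV_ACK) return State.FIN_WAIT_2;
                break;
            case FIN_WAIT_2:
                if (e == Event.RCV_FIN) return State.TIME_WAIT;      // send ACK
                break;
            case TIME_WAIT:
                if (e == Event.RCV_FIN) return State.TIME_WAIT;      // our ACK was lost: re-ACK
                if (e == Event.TIMED_WAIT_EXPIRES) return State.CLOSED;
                break;
        }
        throw new IllegalStateException("unhandled event " + e + " in state " + s);
    }

    public static void main(String[] args) {
        State s = State.CLOSED;
        for (Event e : Event.values()) {   // the events are declared in lifecycle order
            s = step(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```

The TIME_WAIT self-loop corresponds to the timed wait in Step 3: a retransmitted FIN means the server never saw the final ACK, so the client acknowledges again instead of closing.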

A few special cases

We have not discussed what happens if both client and server decide to close the connection at the same time.

It is possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Principles of Congestion Control

Congestion:
  informally: "too many sources sending too much data too fast for the network to handle"
  different from flow control!
  manifestations:
    lost packets (buffer overflow at routers)
    long delays (queuing in router buffers)
  a top-10 problem!

Causes/costs of congestion: scenario 1
  two senders, two receivers
  one router, infinite buffers
  no retransmission
  send rate: 0 to C/2

  large delays when congested
  maximum achievable throughput
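
A rough way to see where the throughput ceiling comes from is a deterministic fluid model. It is a simplification that is not on the slide: it assumes both senders offer the same steady rate, and it ignores the burstiness that makes real queuing delay climb well before the offered load reaches C/2. The class name, method names, and the capacity value are hypothetical.

```java
// Fluid-model sketch of scenario 1: two senders share one link of capacity C
// through a router with infinite buffers. Names are hypothetical.
final class Scenario1Sketch {

    /** Per-connection throughput: what goes in comes out, capped at the fair share C/2. */
    static double throughput(double lambdaIn, double capacity) {
        return Math.min(lambdaIn, capacity / 2.0);
    }

    /** Rate at which the router queue grows when both senders offer lambdaIn each. */
    static double backlogGrowth(double lambdaIn, double capacity) {
        return Math.max(0.0, 2.0 * lambdaIn - capacity);
    }

    public static void main(String[] args) {
        double c = 10.0; // link capacity in arbitrary units (example value)
        for (double lambdaIn = 1.0; lambdaIn <= 7.0; lambdaIn += 2.0) {
            System.out.println("lambdaIn=" + lambdaIn
                    + " throughput=" + throughput(lambdaIn, c)
                    + " backlogGrowth=" + backlogGrowth(lambdaIn, c));
        }
    }
}
```

In this model, any offered load above C/2 per sender leaves the throughput unchanged and only lengthens the queue. With infinite buffers nothing is dropped, so the cost of congestion shows up purely as delay.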

Causes/costs of congestion: scenario 2

  one router, finite buffers
  sender retransmission of lost packet

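Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ is the offered load (original data plus retransmissions), and $\lambda_{out}$ is the delivered rate.

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in the buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packets makes $\lambda'_{in}$ larger (than in the perfect case) for the same $\lambda_{out}$

"Costs" of congestion:
  (b) and (c): more work (retransmissions) for a given "goodput"
  (c): unneeded retransmissions: the link carries multiple copies of a packet

[Figure: panels (a), (b), (c)]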

Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit
Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3 (continued)
Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted.

                                                                                                                                                                          Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact (small exception: see next slide)

Case study: ATM ABR congestion control (continued)

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
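The ER and EFCI rules above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `RMCell` class, the function names, and the idea of describing each switch by a single supportable rate are hypothetical and are not taken from the ATM specification.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical, simplified resource-management cell."""
    er: float          # explicit rate requested by the sender
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def forward_rm_cell(cell, supportable_rates):
    """Pass the RM cell through each switch on the path.

    A switch that cannot support the current ER lowers it, so the
    cell ends up carrying the minimum supportable rate on the path.
    """
    for rate in supportable_rates:
        if rate < cell.er:
            cell.er = rate
    return cell


def return_rm_cell(cell, preceding_data_cell_efci):
    """Destination side: set CI if the preceding data cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


cell = forward_rm_cell(RMCell(er=100.0), [80.0, 45.0, 60.0])
cell = return_rm_cell(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)  # 45.0 True
```

The rates in the example are arbitrary numbers chosen to show that the sender learns the bottleneck rate (45.0), not the rate of the last switch.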

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control
end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion
w segments, each with MSS bytes, sent in one RTT:
$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
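As a quick numeric check of the throughput formula above, here is a minimal Python sketch. The function name and the sample values (w = 8, MSS = 1000 bytes, RTT = 0.25 s) are illustrative assumptions, not figures from the slides.

```python
def window_throughput(w, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent per RTT."""
    return w * mss_bytes / rtt_seconds


# Illustrative values only: 8 segments of 1000 bytes every 0.25 s.
print(window_throughput(8, 1000, 0.25))  # 32000.0
```

Doubling w doubles the throughput, and doubling the RTT halves it, which is why the window is the knob TCP turns to control its rate.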

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using
$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
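The sending constraint above can be read as "unacknowledged bytes in flight may not exceed CongWin". A minimal sketch, assuming byte counters kept by the sender; the function name `bytes_allowed` is hypothetical:

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes may be sent without violating
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


print(bytes_allowed(5000, 3000, 4000))  # 2000: room for 2000 more bytes
print(bytes_allowed(5000, 1000, 4000))  # 0: window already full
```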

TCP AIMD
multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance
[Figure: congestion window (ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
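The sawtooth shape of the AIMD window can be reproduced with a toy simulation. This sketch makes a strong simplifying assumption that is not in the slides: a loss event happens exactly when the window reaches a fixed size (`loss_at_mss`). Window sizes are counted in MSS units, and the function name is hypothetical.

```python
def aimd_trace(start_mss, loss_at_mss, rounds):
    """Window size (in MSS) at the start of each RTT under AIMD."""
    window = start_mss
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window >= loss_at_mss:
            # Multiplicative decrease: halve, but never below 1 MSS.
            window = max(1, window // 2)
        else:
            # Additive increase: one MSS per RTT.
            window += 1
    return trace


print(aimd_trace(4, 8, 10))  # [4, 5, 6, 7, 8, 4, 5, 6, 7, 8]
```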

                                                                                                                                                                          TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate
When connection begins, increase rate exponentially fast until first loss event
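The 20 kbps figure in the example follows from sending one MSS per RTT; the only step added here is the conversion from bytes to bits:

```latex
\text{initial rate} = \frac{MSS}{RTT}
  = \frac{500 \times 8\ \text{bits}}{0.2\ \text{s}}
  = 20{,}000\ \text{bits/s}
  = 20\ \text{kbps}
```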

                                                                                                                                                                          TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received
Summary: initial rate is slow but ramps up exponentially fast
[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments, one RTT apart]
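Why does adding one MSS per ACK double the window every RTT? A short sketch, assuming no loss and one ACK per segment sent (delayed ACKs are ignored); the function name is hypothetical and the window is counted in MSS units:

```python
def slow_start_windows(rtts, mss=1):
    """CongWin at the start of each RTT during slow start."""
    cong_win = mss
    history = [cong_win]
    for _ in range(rtts):
        # Every segment sent in this RTT is ACKed once ...
        acks = cong_win // mss
        for _ack in range(acks):
            # ... and every ACK grows the window by one MSS.
            cong_win += mss
        history.append(cong_win)
    return history


print(slow_start_windows(3))  # [1, 2, 4, 8]
```

A window of n segments produces n ACKs, hence n increments, so the window goes from n to 2n within one RTT.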

So far
  Slow-Start ramps up exponentially
  followed by AIMD sawtooth pattern

Reality (TCP Reno)
  introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

                                                                                                                                                                          Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                          The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
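The event/state table above maps almost directly onto a small state machine. The sketch below is a teaching aid, not a real TCP implementation: the class name `RenoSender`, its method names, and the choice to measure the window in units of one MSS are all assumptions made for illustration. It tracks only CongWin, Threshold, the state and the duplicate-ACK count.

```python
from dataclasses import dataclass

MSS = 1.0  # window is measured in units of one MSS (assumption)


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = float("inf")  # "initially very large"
    state: str = "SS"                # "SS" or "CA"
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Roughly +1 MSS per RTT, spread over the ACKs of one window.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            # CongWin will not drop below 1 MSS.
            self.cong_win = max(self.threshold, MSS)
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender()
for _ in range(7):
    sender.on_new_ack()          # slow start: 1 -> 8 MSS
for _ in range(3):
    sender.on_dup_ack()          # triple duplicate ACK
print(sender.cong_win, sender.threshold, sender.state)  # 4.0 4.0 CA
```

Calling `on_timeout()` instead of the three `on_dup_ack()` calls would leave the sender in slow start with a window of 1 MSS, which is the Tahoe-style reaction described earlier.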

                                                                                                                                                                          TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2·RTT)
  Average throughput: 0.75 W/RTT
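The factor 0.75 can be made explicit with one extra step. Assuming losses recur each time the window reaches W, and the window grows linearly from W/2 back to W in between (additive increase), the time-averaged window $\overline{w}$ is the midpoint of the two extremes:

```latex
\overline{w} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W
\qquad\Longrightarrow\qquad
\text{average throughput} = \frac{\overline{w}}{RTT} = 0.75\,\frac{W}{RTT}
```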

                                                                                                                                                                          TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:
$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$
This requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed
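Both numbers in this example can be recomputed from the stated inputs. The sketch below assumes throughput is measured in bits per second and MSS in bytes, and inverts the loss-rate formula for L; the function names are hypothetical.

```python
def required_window_segments(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Solve rate = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


w = required_window_segments(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(int(w))          # 83333
print(f"{loss:.1e}")   # 2.1e-10
```

A loss rate of about 2 in 10 billion segments is far below what real paths deliver, which is the motivation the slide gives for high-speed TCP variants.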

TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?
Two competing sessions:
  additive increase gives slope of 1 as throughput increases
  multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput versus Connection 1 throughput, each axis marked up to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
  Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  Nothing prevents an app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP connection, gets rate R/10
    new app asks for 11 TCP connections, gets about R/2 (exactly 11R/20)
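The parallel-connection example can be checked with a few lines of arithmetic. This is a minimal sketch that assumes every TCP connection on the link ends up with an equal share of R; the function name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    `new_connections` TCP connections on a link already carrying
    `existing_connections`, assuming an equal per-connection share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


# Link already supports 9 connections, as in the example above.
print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

The share is allocated per connection, not per application, which is why opening more connections yields a larger slice of the link.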


TCP Latency Modeling

Notation, assumptions
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  No retransmissions (no loss, no corruption)

Window size
  First assume: fixed congestion window, W segments
  Then: dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object
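The two cases can be folded into one small function. The sketch below is illustrative only: the function name and the sample numbers are assumptions, and K is taken to be the ceiling of O/(WS), i.e. the number of W-segment windows needed to cover the object.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds for an O-bit object with a fixed window of W segments.

    O and S are in bits, R in bits per second, RTT in seconds.
    """
    base = 2 * RTT + O / R
    # Time the server waits after each window before the first ACK arrives.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        return base                   # case 1: the server never idles
    K = math.ceil(O / (W * S))        # assumption: windows covering the object
    return base + (K - 1) * stall     # case 2: idle after each of K-1 windows


# Assumed values: S = 8000 bits, R = 1 Mbps, RTT = 100 ms, O = 10 segments.
print(fixed_window_latency(O=80_000, S=8_000, R=1_000_000, RTT=0.1, W=4))
print(fixed_window_latency(O=80_000, S=8_000, R=1_000_000, RTT=0.1, W=20))
```

With W = 4 the window drains in 32 ms, well before the first ACK returns, so the second case applies; with W = 20 the window is large enough to keep the link busy and only 2RTT + O/R remains.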


                                                                                                                                                                          Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


                                                                                                                                                                          Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Server waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                                                                                          TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object
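The closed-form expression can be evaluated directly once K and Q are known. The sketch below is one way to do that, under stated assumptions: K is found by counting windows of 1, 2, 4, ... segments, Q is found by counting the windows after which the server would still have to wait for an ACK, and the function name and sample values (S/R = 50 ms, RTT = 100 ms, O/S = 15) are chosen for illustration.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for slow start with no threshold and no loss.

    O and S are in bits, R in bits per second, RTT in seconds.
    """
    if O <= 0:
        raise ValueError("object size must be positive")
    segments = math.ceil(O / S)

    # K: number of windows (1, 2, 4, ... segments) that cover the object.
    K, covered = 0, 0
    while covered < segments:
        covered += 2 ** K
        K += 1

    # Q: idles for an infinite object; the server idles after window k
    # while S/R + RTT - 2^(k-1) S/R is still positive.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


K, Q, P, latency = slow_start_latency(O=15 * 8_000, S=8_000, R=160_000, RTT=0.1)
print(K, Q, P, round(latency, 3))  # 4 2 2 1.1
```

Here 2RTT + O/R accounts for 0.95 s, and the two idle periods (100 ms and 50 ms) add the remaining 0.15 s.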


                                                                                                                                                                          TCP Latency Modeling Slow Start (2)

[Figure: timing diagram with time at client and time at server, showing RTT, initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times


                                                                                                                                                                          TCP Latency Modeling (3)

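$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}$$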

[Figure: timing diagram with time at client and time at server, showing RTT, initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]
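The last step of the derivation uses the geometric sum 2^0 + 2^1 + ... + 2^(P-1) = 2^P - 1. A quick numerical check is sketched below: it sums the per-window idle times directly and compares the total with the closed form. The helper names and the sample values (S/R = 50 ms, RTT = 100 ms, K = 4) are assumptions for illustration.

```python
def idle_times(S, R, RTT, K):
    """Idle time after each of the first K-1 windows:
    [S/R + RTT - 2^(k-1) S/R]^+ for k = 1, ..., K-1."""
    return [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)]


def closed_form_idle(S, R, RTT, P):
    """P[RTT + S/R] - (2^P - 1) S/R, the idle term in the final line."""
    return P * (RTT + S / R) - (2 ** P - 1) * S / R


S, R, RTT, K = 8_000, 160_000, 0.1, 4   # assumed values, S/R = 50 ms
idles = idle_times(S, R, RTT, K)
P = sum(1 for t in idles if t > 0)      # windows after which the server idles

print(P, sum(idles), closed_form_idle(S, R, RTT, P))
```

Because the idle time shrinks as the window doubles, the positive idle times are exactly the first P of them, which is why the sum can stop at P.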


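TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.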
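The chain of equalities for K can be checked by computing it both ways. In this sketch the first function follows the first line of the derivation (keep adding windows until the object is covered) and the second uses the final closed form; both names are hypothetical, and the object size is given in whole segments.

```python
import math


def windows_by_counting(num_segments):
    """Smallest k with 2^0 + 2^1 + ... + 2^(k-1) >= num_segments."""
    k, covered = 0, 0
    while covered < num_segments:
        covered += 2 ** k
        k += 1
    return k


def windows_closed_form(num_segments):
    """ceil(log2(O/S + 1)), with O/S given as a whole number of segments."""
    return math.ceil(math.log2(num_segments + 1))


for n in (1, 3, 4, 15, 16):
    print(n, windows_by_counting(n), windows_closed_form(n))
```

An object of 15 segments fits exactly in windows of 1 + 2 + 4 + 8 segments, so K = 4; one more segment forces a fifth window.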


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
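The three response-time formulas differ only in their RTT term, which a short calculation makes visible. The sketch below deliberately leaves out the sum of idle times (it depends on slow start and is not the same for the three schemes), so the results are lower bounds rather than full response times. The function name is hypothetical, and 5 Kbytes is taken as 40,000 bits, which is an assumption about the unit.

```python
def response_times_without_idle(O, R, RTT, M, X):
    """Transmission and RTT terms of the three HTTP response-time formulas.

    O in bits, R in bits per second, RTT in seconds. The sum of idle
    times is NOT included. Returns (non_persistent, persistent, parallel).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R
    non_persistent = transmit + (M + 1) * 2 * RTT
    persistent = transmit + 3 * RTT
    parallel = transmit + (M // X + 1) * 2 * RTT
    return non_persistent, persistent, parallel


# Assumed values: O = 5 Kbytes (40,000 bits), RTT = 100 ms, M = 10, X = 5.
for rate in (28_000, 100_000, 1_000_000, 10_000_000):
    print(rate, response_times_without_idle(40_000, rate, 0.1, 10, 5))
```

At low rates the shared (M+1)O/R term dominates all three results; at high rates the RTT term dominates and the gap between the schemes becomes the main effect.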


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 20) at link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


[Figure: HTTP response time (in seconds, 0 to 70) versus link bandwidth (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent connections. Parameters: RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
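The comparison in the figure can be approximated with a small calculation. The sketch below is a simplification, not the full latency model: it ignores slow-start stalls and counts only connection setup, request round trips and transmission time. It assumes M is the number of objects referenced by the base page, X is the number of parallel connections, and 5 Kbytes means 5000 bytes; the class and method names are illustrative only.

```java
public class HttpResponseModel {
    // Times are in seconds, object size in bits, link rate in bits per second.
    // Assumption: slow-start stalls are ignored, so these are lower bounds.

    /** Base page plus M objects, each fetched on its own TCP connection. */
    public static double nonPersistent(double rtt, double objBits, int m, double rate) {
        // Per object: 1 RTT for the handshake, 1 RTT for request/response, plus transmission.
        return (m + 1) * (2 * rtt + objBits / rate);
    }

    /** One connection, with the M object requests pipelined after the base page. */
    public static double persistent(double rtt, double objBits, int m, double rate) {
        // 1 RTT handshake + 1 RTT for the base page + 1 RTT for the pipelined requests.
        return 3 * rtt + (m + 1) * objBits / rate;
    }

    /** Base page first, then the M objects fetched X at a time on parallel connections. */
    public static double parallelNonPersistent(double rtt, double objBits, int m, int x, double rate) {
        int rounds = (m + x - 1) / x; // ceil(M / X) batches of parallel connections
        return (rounds + 1) * 2 * rtt + (m + 1) * objBits / rate;
    }

    public static void main(String[] args) {
        double rtt = 1.0;            // RTT = 1 sec, as in the figure
        double objBits = 5000 * 8;   // O = 5 Kbytes, taken here as 5000 bytes
        int m = 10;
        int x = 5;
        double[] rates = {28e3, 100e3, 1e6, 10e6};
        for (double r : rates) {
            System.out.printf("R=%.0f bps: non-persistent=%.2f persistent=%.2f parallel=%.2f%n",
                    r,
                    nonPersistent(rtt, objBits, m, r),
                    persistent(rtt, objBits, m, r),
                    parallelNonPersistent(rtt, objBits, m, x, r));
        }
    }
}
```

At high bandwidth the transmission term becomes negligible and the RTT terms dominate, which is why the persistent connection's advantage grows with RTT.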


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary


                                                                                                                                                                            TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)

client: connection initiator
  Socket clientSocket = new Socket(hostname, portNumber);

server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  specifies client_isn, the initial seq. #
  no application data

Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  replies with client_isn+1 in ACK field to signal synchronization
  specifies server_isn
  no application data
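The two socket calls above can be exercised together on one machine. The following is a minimal sketch, assuming a loopback connection and an OS-chosen ephemeral port; the class name `LoopbackConnect` and its method are hypothetical, not part of the slides. The operating system performs the three-way handshake inside the `Socket` constructor, and `accept()` then hands the established connection to the server side.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class LoopbackConnect {
    /**
     * Opens a welcoming socket on the loopback interface, connects a client
     * to it, and reports whether both ends see the same established connection.
     */
    public static boolean connectOnLoopback() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; backlog 1 is enough for one client.
        try (ServerSocket welcomeSocket = new ServerSocket(0, 1, loopback)) {
            int port = welcomeSocket.getLocalPort();
            // The constructor returns once the OS has completed the handshake;
            // the connection then waits in the accept queue.
            try (Socket clientSocket = new Socket(loopback, port);
                 Socket connectionSocket = welcomeSocket.accept()) {
                return clientSocket.isConnected()
                        && connectionSocket.isConnected()
                        && connectionSocket.getPort() == clientSocket.getLocalPort();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected: " + connectOnLoopback());
    }
}
```

Note that the application never sees the SYN, SYNACK or ACK segments; it only observes that the connection exists once the calls return.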


                                                                                                                                                                            TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and server_isn+1
  allocates buffers
  can include application data
  SYN=0 signals that the connection is established
  server_isn+1 signals that the client is synchronized

[Figure: three-way handshake timeline between client and server]
  client -> server: Connection request (SYN=1, seq=client_isn)
  server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
  client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
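The field values in the three segments follow mechanically from the two initial sequence numbers. The sketch below models only the fields used in the handshake; the `Segment` class and method names are hypothetical. The wrap-around at 2^32 is not shown on the slide but reflects the 32-bit TCP sequence number field.

```java
public class ThreeWayHandshake {
    /** Minimal view of a TCP segment: only the fields the handshake uses. */
    public static final class Segment {
        public final boolean syn;
        public final boolean ack;
        public final long seq;
        public final long ackNum;

        Segment(boolean syn, boolean ack, long seq, long ackNum) {
            this.syn = syn;
            this.ack = ack;
            this.seq = seq;
            this.ackNum = ackNum;
        }
    }

    // Sequence numbers are 32-bit values and wrap around.
    private static final long MOD = 1L << 32;

    /** Step 1: client sends SYN carrying client_isn. */
    public static Segment syn(long clientIsn) {
        return new Segment(true, false, clientIsn % MOD, 0);
    }

    /** Step 2: server replies with SYNACK: seq=server_isn, ack=client_isn+1. */
    public static Segment synAck(Segment syn, long serverIsn) {
        return new Segment(true, true, serverIsn % MOD, (syn.seq + 1) % MOD);
    }

    /** Step 3: client replies with SYN=0, seq=client_isn+1, ack=server_isn+1. */
    public static Segment finalAck(Segment synAck) {
        return new Segment(false, true, synAck.ackNum, (synAck.seq + 1) % MOD);
    }

    public static void main(String[] args) {
        Segment s1 = syn(100);
        Segment s2 = synAck(s1, 300);
        Segment s3 = finalAck(s2);
        System.out.println("SYN    seq=" + s1.seq);
        System.out.println("SYNACK seq=" + s2.seq + " ack=" + s2.ackNum);
        System.out.println("ACK    seq=" + s3.seq + " ack=" + s3.ackNum);
    }
}
```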

                                                                                                                                                                            TCP Connection Management (cont)

                                                                                                                                                                            Closing a connection

client closes socket: clientSocket.close()

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

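[Figure: closing a connection. Client sends FIN; server replies with ACK and then sends its own FIN; client replies with ACK and enters timed wait before the connection is closed.]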

                                                                                                                                                                            TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK

Enters "timed wait", during which it will respond with ACK to received FINs (which might arrive if the ACK gets lost)

Closes down after timed wait

Step 4: server receives ACK. Connection closed.

Note: with a small modification, can handle simultaneous FINs

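[Figure: closing a connection, continued. After the FIN/ACK exchange in each direction, the client passes through closing and timed wait to closed; the server passes through closing to closed.]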
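The client side of the four closing steps can be sketched as a small state trace. The function name `client_close` and the event strings are hypothetical; the state names follow the figure labels (closing, timed wait, closed). Timers are reduced to a single "TIMEOUT" event, which is a simplification.

```python
def client_close(events):
    """Trace client-side states while closing a connection.

    `events` is a list of strings: "ACK" or "FIN" (segments received
    from the server) or "TIMEOUT" (the timed-wait timer expires).
    Returns (final_state, segments_sent_by_client).
    """
    state = "closing"   # Step 1: the client has already sent its FIN
    sent = ["FIN"]
    for event in events:
        if state == "closing" and event == "FIN":
            # Step 3: server's FIN arrives; ACK it and enter timed wait.
            sent.append("ACK")
            state = "timed wait"
        elif state == "timed wait" and event == "FIN":
            # Our ACK was lost and the server resent its FIN: ACK again.
            sent.append("ACK")
        elif state == "timed wait" and event == "TIMEOUT":
            state = "closed"
        # The server's ACK of our FIN needs no reply, so nothing changes.
    return state, sent


if __name__ == "__main__":
    print(client_close(["ACK", "FIN", "TIMEOUT"]))
    print(client_close(["ACK", "FIN", "FIN", "TIMEOUT"]))
```

The second call shows why timed wait exists: a repeated FIN from the server still gets an ACK before the client finally closes.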

                                                                                                                                                                            TCP Connection Management (cont)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

                                                                                                                                                                            A few special cases

We have not discussed what happens if both client and server decide to close down the connection at the same time

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment

                                                                                                                                                                            Principles of Congestion Control

Congestion
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem

Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

Here λ'in is the offered load: original data (λin) plus retransmissions.

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: panels (a), (b), (c)]

Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when packet dropped, any upstream transmission capacity used for that packet was wasted

                                                                                                                                                                            Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

ABR: available bit rate:
  "elastic service"
  if sender's path "underloaded":
    sender should use available bandwidth
  if sender's path congested:
    sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  Signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
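A minimal sketch of the two feedback rules above, under simplifying assumptions: rates are plain numbers in arbitrary units, and the function names `process_rm_cell` and `destination_ci` are hypothetical, not part of any ATM API.

```python
def process_rm_cell(initial_er, switch_rates):
    """Return the ER value the sender sees once the RM cell has crossed the path.

    initial_er: rate the sender writes into the ER field.
    switch_rates: rate each switch on the path can support, in path order.
    """
    er = initial_er
    for supportable in switch_rates:
        # A congested switch may lower ER; no switch ever raises it.
        er = min(er, supportable)
    return er


def destination_ci(ci_bit, efci_of_preceding_data_cell):
    """CI bit the destination places in the RM cell it returns to the source."""
    # If the data cell just before the RM cell had EFCI=1, set CI=1.
    return 1 if efci_of_preceding_data_cell == 1 else ci_bit


if __name__ == "__main__":
    # Example values are made up: three switches supporting 80, 50 and 120.
    print(process_rm_cell(100, [80, 50, 120]))
    print(destination_ci(ci_bit=0, efci_of_preceding_data_cell=1))
```

Because each switch can only lower the field, the value that returns to the sender is the minimum supportable rate along the path.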

TCP Congestion Control
end-end control (no network assistance)
transmission rate limited by congestion window size, Congwin, over segments
Congwin dynamically modified to reflect perceived congestion

[Figure: Congwin]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events
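The sender-side limit and the definition of a loss event can be written as two small checks. This is a sketch only: the function names `can_send` and `is_loss_event` are hypothetical, and RcvBuffer is assumed never to be the bottleneck, as stated above.

```python
def can_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """True if sending `nbytes` more keeps unacknowledged data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= cong_win


def is_loss_event(timed_out, duplicate_acks):
    """A loss event is a timeout or three duplicate ACKs."""
    return timed_out or duplicate_acks >= 3


if __name__ == "__main__":
    # 2000 bytes are in flight; the window is 4000 bytes (example values).
    print(can_send(3000, 1000, 4000, 1000))   # fits within the window
    print(can_send(3000, 1000, 4000, 2500))   # would exceed the window
    print(is_loss_event(timed_out=False, duplicate_acks=3))
```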

TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
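The sawtooth produced by AIMD can be reproduced with a toy trace. The sketch assumes, purely for illustration, that a loss event occurs whenever the window reaches a fixed size `loss_at_kbytes`; in a real network the loss point varies. The function name and the numbers are hypothetical.

```python
def aimd_trace(start_kbytes, mss_kbytes, loss_at_kbytes, rtts):
    """Return CongWin (in Kbytes) at the start of each RTT."""
    cong_win = start_kbytes
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at_kbytes:
            # Multiplicative decrease: halve the window after a loss event.
            cong_win = cong_win / 2
        else:
            # Additive increase: grow by one MSS per loss-free RTT.
            cong_win = cong_win + mss_kbytes
    return trace


if __name__ == "__main__":
    # Example: start at 8 Kbytes, MSS of 4 Kbytes, loss assumed at 16 Kbytes.
    print(aimd_trace(8, 4, 16, 6))
```

With these values the window climbs 8, 12, 16, drops back to 8, and repeats, which is the sawtooth shape of a long-lived connection.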

                                                                                                                                                                            TCP Slow Start

When connection begins, CongWin = 1 MSS

Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps

available bandwidth may be >> MSS/RTT

desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
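The 20 kbps figure follows from the throughput formula w × MSS / RTT with w = 1, once bytes are converted to bits. A short check, using a hypothetical helper `rate_bps`:

```python
def rate_bps(window_segments, mss_bytes, rtt_seconds):
    """Sending rate in bits per second for a window of `window_segments`."""
    return window_segments * mss_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    # One 500-byte segment per 200 msec RTT: 4000 bits every 0.2 seconds.
    initial = rate_bps(1, 500, 0.2)
    print(f"{initial / 1000:.1f} kbps")   # prints "20.0 kbps"
```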

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:

double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment, then two segments, then four segments to Host B in successive RTTs]
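Why one increment per ACK doubles the window each RTT can be seen in a small sketch. It assumes no loss and that every segment sent in an RTT is ACKed within that RTT; `slow_start_windows` is a hypothetical name.

```python
def slow_start_windows(rtts, mss=1):
    """Return CongWin at the start of each RTT (in segments when mss=1)."""
    cong_win = mss
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        segments_in_flight = cong_win // mss
        # Every segment sent this RTT is ACKed (no loss assumed), and
        # each ACK increases CongWin by one MSS.
        for _ack in range(segments_in_flight):
            cong_win += mss
    return windows


if __name__ == "__main__":
    print(slow_start_windows(4))   # prints [1, 2, 4, 8]
```

A window of w segments yields w ACKs, so the window grows from w to 2w in one RTT.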

So far:
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno):
  Introduce new variable: threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold=CongWin (now in standard AIMD)
    Timeout: set threshold=CongWin/2, CongWin=1 and switch to Slow-Start


Reason for treating 3 dup ACKs differently than a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol TCP Tahoe treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)


                                                                                                                                                                            The Big Picture


TCP sender congestion control

Each entry lists: Event / State / TCP Sender Action / Commentary

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
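The event table can be sketched as a small state machine. This is an illustration only, not a real TCP implementation: the class name `RenoSender` and its method names are hypothetical, the window is measured in units of one MSS, and resetting the duplicate-ACK counter on a new ACK is an assumption the table does not spell out.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: window measured in units of one MSS


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = float("inf")   # "threshold initially very large"
    state: str = "SS"                 # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0             # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: only count it; the third one is a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_dup_ack()

    def on_triple_dup_ack(self):
        """Fast recovery: halve the window, stay in congestion avoidance."""
        self.threshold = max(self.cong_win / 2, MSS)  # never below 1 MSS
        self.cong_win = self.threshold
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        """Timeout: fall back to slow start from 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender()
for _ in range(3):
    sender.on_new_ack()
print(sender.state, sender.cong_win)   # SS 4.0
for _ in range(3):
    sender.on_dup_ack()
print(sender.state, sender.cong_win)   # CA 2.0
```

A TCP Tahoe variant of this sketch would call `on_timeout()` from the third duplicate ACK as well.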


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
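The 0.75 factor follows from averaging the sawtooth between W/2 and W. A minimal numeric check, assuming the window grows by one segment per RTT and W is an even number of segments (the function name is hypothetical):

```python
def average_sawtooth_throughput(w, rtt):
    """Average of window/RTT over one sawtooth cycle from w/2 up to w.

    Assumes w is an even number of segments and the window grows by
    one segment per RTT (congestion avoidance), ignoring slow start.
    """
    windows = range(w // 2, w + 1)
    mean_window = sum(windows) / len(windows)   # midpoint of w/2 and w
    return mean_window / rtt


# W = 100 segments, RTT = 100 ms: approximately 0.75 * 100 / 0.1 segments per second
print(average_sawtooth_throughput(100, 0.1))
```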


TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This requires L = 2·10^-10. Wow!
New versions of TCP for high-speed needed
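The two numbers in this example can be reproduced with a short calculation. The sketch assumes throughput = W · MSS / RTT for the window size and inverts the loss-rate formula throughput = 1.22 · MSS / (RTT · √L); the variable names are illustrative.

```python
segment_bits = 1500 * 8      # 1500-byte segments
rtt = 0.100                  # 100 ms
target_bps = 10e9            # 10 Gbps

# Window needed so that W * MSS / RTT reaches the target rate
window_segments = target_bps * rtt / segment_bits

# Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L
loss_rate = (1.22 * segment_bits / (rtt * target_bps)) ** 2

print(int(window_segments))   # 83333
print(f"{loss_rate:.1e}")     # 2.1e-10
```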


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis marked at R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
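The convergence toward the equal-share line can be illustrated with a toy simulation. It assumes both connections have the same RTT, increase by one unit per step, and see every loss at the same time (whenever their combined rate exceeds the capacity); the function name `aimd_two_flows` is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Simulate two AIMD flows sharing one bottleneck of the given capacity."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Loss: both flows cut their rate by a factor of 2,
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both add the same amount (slope 1),
            # so the gap between them is unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = aimd_two_flows(10, 70, capacity=100, steps=500)
print(abs(a - b) < 1)   # True: the rates have become nearly equal
```

The gap stays fixed during additive increase and halves at every loss, so the two rates approach each other regardless of the starting point.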


Fairness (more)

Fairness and UDP

Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control

Instead use UDP: pump audio/video at constant rate, tolerate packet loss

Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections

Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections

new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11R/20)
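The arithmetic behind this example, assuming the link is split equally among all open TCP connections (the helper name `app_share` is hypothetical):

```python
from fractions import Fraction


def app_share(existing, new):
    """Fraction of link rate R obtained by an app opening `new` connections
    on a link that already carries `existing` connections, assuming every
    connection gets an equal share."""
    return Fraction(new, existing + new)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```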


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
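Both cases can be folded into one function. This is a sketch under the model's assumptions (no loss, one link of rate R); it additionally assumes K is the number of windows that cover the object, K = ⌈O/(WS)⌉, and the function name is hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with MSS of S bits, link rate R (bps),
    round-trip time RTT (s), and a fixed window of W segments."""
    K = math.ceil(O / (W * S))            # assumption: windows needed to cover the object
    stall = S / R + RTT - W * S / R       # server idle time after each window
    latency = 2 * RTT + O / R             # case 1: server never stalls
    if stall > 0:                         # case 2: server waits for ACKs
        latency += (K - 1) * stall
    return latency


# 10 segments of 8000 bits over a 1 Mbps link with a 100 ms RTT
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, RTT=0.1, W=20))  # case 1
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, RTT=0.1, W=2))   # case 2
```

With W = 20 the ACK returns before the window is exhausted and the latency is 2RTT + O/R; with W = 2 the server stalls after each of the first K-1 = 4 windows.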


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timeline with time at client and time at server. The client initiates the TCP connection and requests the object (one RTT each). The server sends first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and the object is delivered.]

Example
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
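The example values K = 4, Q = 2, P = 2 can be reproduced by counting windows directly. The sketch below assumes windows of 1, 2, 4, ... segments, and that the server idles after the kth window whenever S/R + RTT - 2^(k-1) S/R is positive; the concrete S, R, and RTT values are chosen only so that Q comes out as 2, and the function name is hypothetical.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start latency model."""
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object
    segments = O / S
    K, covered = 0, 0
    while covered < segments:
        covered += 2 ** K
        K += 1

    # Q: number of windows after which the server would idle if the object
    # were infinite; idle time after window k is S/R + RTT - 2**(k-1) * S/R
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# 15 segments of 8000 bits, 80 kbps link (S/R = 0.1 s), RTT = 0.2 s
K, Q, P, latency = slow_start_latency(O=15 * 8_000, S=8_000, R=80_000, RTT=0.2)
print(K, Q, P)   # 4 2 2
```

For these values the idle times after the first two windows are 0.2 s and 0.1 s, so the latency is 2RTT + O/R plus 0.3 s of idling.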


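TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$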

                                                                                                                                                                            RSk

[Figure: slow-start timing diagram with "time at client" and "time at server" axes. Labelled events: initiate TCP connection; request object; RTT; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]
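The delay expression above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, O and S are in bits, R in bits per second and RTT in seconds. It assumes the window doubles every round (1, 2, 4, ... segments) and that the server idles only after the first K-1 windows. One function sums the idle times term by term, the other uses the closed form with P taken as the number of strictly positive idle periods.

```python
def windows_needed(O, S):
    """Count slow-start windows (1, 2, 4, ... segments) needed to cover O bits."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S  # window k+1 carries 2^k segments of S bits
        k += 1
    return k


def idle_after_window(k, S, R, RTT):
    """[S/R + RTT - 2^(k-1) S/R]^+ : idle time after the k-th window."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def delay_by_summation(O, S, R, RTT):
    """O/R + 2 RTT plus the idle time after each of the first K-1 windows."""
    K = windows_needed(O, S)
    idle = sum(idle_after_window(k, S, R, RTT) for k in range(1, K))
    return O / R + 2 * RTT + idle


def delay_closed_form(O, S, R, RTT):
    """O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R."""
    K = windows_needed(O, S)
    # P = number of windows followed by a strictly positive idle period
    P = sum(1 for k in range(1, K) if idle_after_window(k, S, R, RTT) > 0)
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Illustrative values (assumed, not from the slides): S/R = 1 s, RTT = 3 s, O = 15 segments
print(delay_by_summation(15000, 1000, 1000, 3))  # 26.0
print(delay_closed_form(15000, 1000, 1000, 3))   # 26.0
```

With these values the object needs K = 4 windows, the idle times after the first three windows are 3 s, 2 s and 0 s, and both routes give 15 + 6 + 5 = 26 seconds.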


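TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

The calculation of Q, the number of idles for an infinite-size object, is similar.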
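As a quick check of the closed form for K, the sketch below compares it with a direct count of windows. The function names are hypothetical. The expression used for Q is not given on this slide: it is an assumption derived from the idle-time term, taking Q as the largest k for which S/R + RTT - 2^(k-1) S/R is still non-negative.

```python
import math


def K_closed_form(O, S):
    """K = ceil(log2(O/S + 1)): windows needed to cover an object of O bits."""
    return math.ceil(math.log2(O / S + 1))


def K_by_counting(O, S):
    """Smallest k with 2^0 S + 2^1 S + ... + 2^(k-1) S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S
        k += 1
    return k


def Q_closed_form(S, R, RTT):
    """Assumed form: Q = floor(log2(1 + RTT / (S/R))) + 1."""
    return math.floor(math.log2(1 + RTT / (S / R))) + 1


# Illustrative values (assumed): S = 1000 bits, R = 1000 bps, RTT = 3 s
K = K_closed_form(15000, 1000)    # 15 segments fit in windows of 1, 2, 4, 8
Q = Q_closed_form(1000, 1000, 3)
P = min(Q, K - 1)                 # idle periods actually incurred
print(K, Q, P)                    # 4 3 3
```

The third idle period here has length zero, so counting it in P does not change the total delay.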


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
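The three response-time expressions can be compared side by side. This is a rough sketch under stated assumptions: the function name is hypothetical, the "sum of idle times" term is left out (so the results are lower bounds on the modelled response time), M/X is assumed to be an integer, and 1 Kbyte is taken as 8000 bits.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, excluding the slow-start idle-time term.

    O: object size in bits, R: link rate in bits/s, RTT in seconds,
    M: number of images, X: number of parallel connections (M/X integer).
    """
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * RTT,
    }


# Parameters from the slides: RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5
O_bits = 5 * 8000  # assumes 1 Kbyte = 1000 bytes
for rate in (28e3, 100e3, 1e6, 10e6):
    times = http_response_times(O_bits, rate, 0.1, 10, 5)
    print(rate, {name: round(t, 2) for name, t in times.items()})
```

Because the transmission term (M+1)O/R is common to all three variants, the differences come only from the RTT multipliers: 2(M+1), 3 and 2(M/X + 1).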


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give important improvement, particularly in high delay-bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
  multiplexing/demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                            • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • Transport services and protocols
                                                                                                                                                                            • Transport vs network layer
                                                                                                                                                                            • Transport-layer protocols
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • Multiplexingdemultiplexing
                                                                                                                                                                            • Multiplexingdemultiplexing
                                                                                                                                                                            • How demultiplexing works
                                                                                                                                                                            • Connectionless demultiplexing
                                                                                                                                                                            • Connectionless demux (cont)
                                                                                                                                                                            • Connection-oriented demux
                                                                                                                                                                            • Connection-oriented demux (cont)
                                                                                                                                                                            • Connection-oriented demux Threaded Web Server
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                            • UDP more
                                                                                                                                                                            • UDP checksum
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • Principles of Reliable data transfer
                                                                                                                                                                            • Reliable data transfer getting started
                                                                                                                                                                            • Reliable data transfer getting started
                                                                                                                                                                            • Incremental Improvements
                                                                                                                                                                            • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                            • Rdt20 channel with bit errors
                                                                                                                                                                            • rdt20 FSM specification
                                                                                                                                                                            • rdt20 operation with no errors
                                                                                                                                                                            • rdt20 error scenario
                                                                                                                                                                            • rdt20 has a fatal flaw
                                                                                                                                                                            • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                            • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                            • rdt21 discussion
                                                                                                                                                                            • rdt22 a NAK-free protocol
                                                                                                                                                                            • rdt22 sender receiver fragments
                                                                                                                                                                            • rdt30 channels with errors and loss
                                                                                                                                                                            • rdt30 sender
                                                                                                                                                                            • rdt30 in action
                                                                                                                                                                            • rdt30 in action
                                                                                                                                                                            • Performance of rdt30
                                                                                                                                                                            • rdt30 stop-and-wait operation
                                                                                                                                                                            • Pipelined protocols
                                                                                                                                                                            • Pipelined protocols
                                                                                                                                                                            • Pipelining increased utilization
                                                                                                                                                                            • Go-Back-N
                                                                                                                                                                            • GBN Sender
                                                                                                                                                                            • GBN sender extended FSM
                                                                                                                                                                            • GBN receiver extended FSM
                                                                                                                                                                            • More on receiver
                                                                                                                                                                            • GBN inaction
                                                                                                                                                                            • Selective Repeat
                                                                                                                                                                            • Selective repeat sender receiver windows
                                                                                                                                                                            • Selective repeat
                                                                                                                                                                            • Selective repeat in action
                                                                                                                                                                            • Selective repeat dilemma
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                            • More TCP Details
                                                                                                                                                                            • Even More TCP Details
                                                                                                                                                                            • TCP segment structure
                                                                                                                                                                            • TCP seq rsquos and ACKs
                                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                                            • Example RTT estimation
                                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • TCP reliable data transfer
                                                                                                                                                                            • TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • TCP Flow Control
                                                                                                                                                                            • TCP Flow Control
                                                                                                                                                                            • TCP segment structure
• TCP Flow control: how it works
                                                                                                                                                                            • Technical Issue
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • TCP Connection Management
                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                            • A few special cases
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                            • TCP Congestion Control
                                                                                                                                                                            • TCP AIMD
                                                                                                                                                                            • TCP Slow Start
                                                                                                                                                                            • TCP Slow Start (more)
• Summary: TCP Congestion Control
                                                                                                                                                                            • The Big Picture
                                                                                                                                                                            • TCP sender congestion control
                                                                                                                                                                            • TCP throughput
                                                                                                                                                                            • TCP Futures
                                                                                                                                                                            • TCP Fairness
• Why is TCP fair?
                                                                                                                                                                            • Fairness (more)
                                                                                                                                                                            • TCP Latency Modeling
                                                                                                                                                                            • Fixed Congestion Window (W)
                                                                                                                                                                            • Fixed congestion window (1)
                                                                                                                                                                            • Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
                                                                                                                                                                            • TCP Latency Modeling (3)
                                                                                                                                                                            • TCP Latency Modeling (4)
                                                                                                                                                                            • HTTP Modeling
                                                                                                                                                                            • Chapter 3 Summary


                                                                                                                                                                              TCP Connection Management (cont)

Step 3: client end system receives SYNACK, replies with SYN=0 and ack=server_isn+1

Allocates buffers
Can include application data

SYN=0 signals that the connection is established
ack=server_isn+1 signals that the sequence number is synchronized

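[Figure: three-way handshake between client and server]
client -> server: Connection request (SYN=1, seq=client_isn)
server -> client: Connection granted (SYN=1, seq=server_isn, ack=client_isn+1)
client -> server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)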
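
The sequence and acknowledgment numbers carried by the three handshake segments can be traced with a short sketch. This is illustrative Python rather than a socket implementation: the `Segment` type and the `three_way_handshake` function are hypothetical names, the initial sequence numbers are arbitrary, and 32-bit wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """Only the header fields the slides mention (hypothetical type)."""
    syn: int
    seq: int
    ack: Optional[int] = None  # None: segment carries no acknowledgment


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    # Step 1: client -> server, connection request.
    request = Segment(syn=1, seq=client_isn)
    # Step 2: server -> client, connection granted (SYNACK).
    granted = Segment(syn=1, seq=server_isn, ack=request.seq + 1)
    # Step 3: client -> server, ACK with SYN=0.
    final_ack = Segment(syn=0, seq=client_isn + 1, ack=granted.seq + 1)
    return [request, granted, final_ack]


if __name__ == "__main__":
    segments = three_way_handshake(client_isn=100, server_isn=300)
    for step, seg in enumerate(segments, start=1):
        print(f"step {step}: {seg}")
```

Each side acknowledges the other's initial sequence number plus one, which is why the SYN is said to consume one sequence number.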


                                                                                                                                                                              TCP Connection Management (cont)

                                                                                                                                                                              Closing a connection

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

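[Figure: client-server timeline for closing a connection. Segments: FIN (client), ACK and FIN (server), ACK (client). Labels: close (client), close (server), timed wait, closed.]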


                                                                                                                                                                              TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.

Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost)

Closes down after timed-wait

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

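[Figure: client-server timeline for closing a connection. Segments: FIN (client), ACK and FIN (server), ACK (client). Labels: closing (client), closing (server), timed wait, closed (client), closed (server).]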
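
The four closing steps, including the reason for the timed wait, can be replayed as a small trace. This is a minimal sketch under stated assumptions: `close_connection` and its `lose_final_ack` flag are hypothetical names, the state labels are the ones used on the slides, and timers are not modeled (the timed wait simply ends once the trace is complete).

```python
from typing import List, Tuple

Trace = List[Tuple[str, str]]  # (sender, segment type)


def close_connection(lose_final_ack: bool = False) -> Tuple[Trace, str, str]:
    """Replay a client-initiated close; return (trace, client state, server state)."""
    trace: Trace = []
    client, server = "established", "established"

    # Step 1: client sends FIN.
    trace.append(("client", "FIN"))
    client = "closing"

    # Step 2: server receives FIN, replies with ACK, then sends its own FIN.
    trace.append(("server", "ACK"))
    trace.append(("server", "FIN"))
    server = "closing"

    # Step 3: client receives FIN, replies with ACK, enters timed wait.
    trace.append(("client", "ACK"))
    client = "timed wait"

    if lose_final_ack:
        # The server never saw the ACK, so it resends FIN. Because the
        # client is still in timed wait, it can answer with another ACK.
        trace.append(("server", "FIN"))
        trace.append(("client", "ACK"))

    # Step 4: server receives ACK; its side of the connection is closed.
    server = "closed"
    # The client closes down once the timed wait ends.
    client = "closed"
    return trace, client, server


if __name__ == "__main__":
    for sender, segment in close_connection(lose_final_ack=True)[0]:
        print(f"{sender} sends {segment}")
```

Without the timed wait, the client would already be gone when the retransmitted FIN arrived, and the server would have nobody to acknowledge it.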


                                                                                                                                                                              TCP Connection Management (cont)

[Figure: Example TCP server lifecycle]

[Figure: Example TCP client lifecycle]


                                                                                                                                                                              A few special cases

                                                                                                                                                                              Have not discussed what happens if both client and server decide to close down connection at same time

                                                                                                                                                                              It is possible that first ACK (from server) and second FIN (also from server) are sent in same segment


                                                                                                                                                                              Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


                                                                                                                                                                              Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
  lost packets (buffer overflow at routers)
  long delays (queuing in router buffers)
a top-10 problem!


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


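Here $\lambda_{in}$ is the rate of original data, $\lambda'_{in}$ is the rate of original data plus retransmissions, and $\lambda_{out}$ is the rate delivered to the receiver.

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "Magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retrans) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels (a), (b), (c)]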
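
The "more work for given goodput" cost can be made concrete with a little arithmetic. The sketch below is an illustration only: `goodput` is a hypothetical helper, the offered load is expressed as a fraction of the link capacity C, and the average number of transmissions per delivered packet is an assumed input, not a figure taken from the slides.

```python
def goodput(offered_load: float, transmissions_per_packet: float) -> float:
    """Useful delivery rate when each delivered packet costs several transmissions.

    offered_load: original data plus retransmissions, as a fraction of C.
    transmissions_per_packet: assumed average number of times a packet is
        sent before one copy is delivered (1.0 means no retransmission).
    """
    if transmissions_per_packet < 1.0:
        raise ValueError("a delivered packet is transmitted at least once")
    return offered_load / transmissions_per_packet


if __name__ == "__main__":
    load = 0.5  # sender pushes C/2 into the router
    for copies in (1.0, 1.5, 2.0):
        print(f"{copies} transmissions per packet -> goodput {goodput(load, copies):.3f} C")
```

With the same offered load, every extra copy sent on the link lowers the share of capacity that carries new data.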


Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when packet dropped, any upstream transmission capacity used for that packet was wasted!


                                                                                                                                                                              Approaches towards congestion control

                                                                                                                                                                              Two broad approaches towards congestion control

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate
"elastic service"
if sender's path "underloaded":
  sender should use available bandwidth
if sender's path congested:
  sender throttled to minimum guaranteed rate

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
  NI bit: no increase in rate (mild congestion)
  CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
  small exception: see next page


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate is thus the minimum of the rates supportable by the switches on the path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
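The ER and EFCI/CI rules above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the `RMCell` class, the function names, and the rate values are hypothetical and are not part of the ATM specification or of these notes.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical, simplified view of an RM cell (not the real cell layout)."""
    er: float          # explicit rate carried in the cell (arbitrary units)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def switch_update(cell: RMCell, supportable_rate: float) -> None:
    """A congested switch may lower the ER value; it never raises it."""
    cell.er = min(cell.er, supportable_rate)


def destination_return(cell: RMCell, preceding_data_efci: bool) -> RMCell:
    """Destination sets CI=1 if the data cell preceding the RM cell had EFCI=1."""
    if preceding_data_efci:
        cell.ci = True
    return cell


# The sender asks for rate 100; three switches on the path support 80, 55 and 90.
cell = RMCell(er=100.0)
for rate in (80.0, 55.0, 90.0):
    switch_update(cell, rate)
destination_return(cell, preceding_data_efci=True)
print(cell.er, cell.ci)  # ER ends up as the smallest supportable rate on the path
```

Because each switch can only lower ER, the value that returns to the sender is the minimum over the path, which is the rate the sender then uses.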


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

  $\text{throughput} = \frac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  $\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 and then grows exponentially
  conservative after timeout events


TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window of a long-lived TCP connection versus time; y-axis ticks at 8, 16, and 24 Kbytes]
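A minimal sketch of the AIMD rule, assuming a loss occurs whenever the window reaches a fixed ceiling. The 24-Kbyte ceiling and 1-Kbyte MSS are illustrative values chosen to echo the figure's axis, and `aimd_trace` is a hypothetical helper, not code from any TCP implementation.

```python
def aimd_trace(start_kb, loss_at_kb, mss_kb, rtts):
    """Return CongWin (in Kbytes) at the start of each RTT under pure AIMD.

    Assumption: a loss event happens exactly when the window reaches loss_at_kb.
    """
    window = start_kb
    trace = []
    for _ in range(rtts):
        trace.append(window)
        if window >= loss_at_kb:
            window = window / 2        # multiplicative decrease after a loss
        else:
            window = window + mss_kb   # additive increase: 1 MSS per RTT
    return trace


trace = aimd_trace(start_kb=12, loss_at_kb=24, mss_kb=1, rtts=40)
print(trace)
```

The printed values climb one MSS per RTT from 12 to 24 Kbytes, drop back to 12, and repeat.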


                                                                                                                                                                              TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
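The initial-rate figure in the example follows from rate = window × MSS / RTT with a window of one segment. A quick check, where `rate_bps` is a hypothetical helper name:

```python
def rate_bps(window_segments, mss_bytes, rtt_seconds):
    """Sending rate in bits per second when window_segments are sent per RTT."""
    return window_segments * mss_bytes * 8 / rtt_seconds


# Slide example: one 500-byte segment per 200 msec round trip.
initial = rate_bps(window_segments=1, mss_bytes=500, rtt_seconds=0.200)
print(initial / 1000, "kbps")  # about 20 kbps
```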


                                                                                                                                                                              TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
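To see why one increment per ACK doubles the window every RTT, here is a small sketch. It assumes every segment sent in a round is ACKed within that round and that no loss occurs; `slow_start_rounds` is a hypothetical helper.

```python
def slow_start_rounds(rounds, mss=1):
    """CongWin (in MSS units) at the start of each RTT during slow start.

    Assumption: every segment sent in an RTT is ACKed before the next RTT.
    """
    congwin = mss
    history = []
    for _round in range(rounds):
        history.append(congwin)
        acks = congwin // mss          # one ACK per segment sent this RTT
        for _ack in range(acks):
            congwin += mss             # +1 MSS for every ACK received
    return history


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```

A window of w segments produces w ACKs, each adding one MSS, so the next window is 2w.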


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable: threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
  Two different types of loss events:
    3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                              The Big Picture


TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
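The event table can be read as a small state machine. The sketch below is one possible rendering in Python, working in units of one MSS. The class and method names are hypothetical, the initial threshold of 64 MSS stands in for "very large", and resetting the duplicate-ACK counter on a new ACK is an assumption the table does not spell out.

```python
from dataclasses import dataclass

MSS = 1.0  # work in units of one MSS


@dataclass
class Sender:
    congwin: float = MSS
    threshold: float = 64 * MSS   # assumption: arbitrary "very large" start value
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0         # assumption: counter restarts on new data ACK
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += MSS * (MSS / self.congwin)

    def on_dup_ack(self):
        """Duplicate ACK: only the counter changes until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_dup_ack()

    def on_triple_dup_ack(self):
        """Loss detected by triple duplicate ACK (Reno behaviour)."""
        self.threshold = self.congwin / 2
        self.congwin = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        """Timeout: fall back to slow start."""
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0


s = Sender(congwin=8.0, threshold=16.0, state="CA")
for _ in range(3):
    s.on_dup_ack()
print(s.congwin, s.threshold, s.state)
s.on_timeout()
print(s.congwin, s.threshold, s.state)
```

A Tahoe-style sender could be approximated by having `on_triple_dup_ack` call `on_timeout` instead, since Tahoe treats both loss events the same way.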


                                                                                                                                                                              TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2 RTT)
  Average throughput: 0.75 W/RTT
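The 0.75 factor is the mean of a window that ramps linearly from W/2 up to W between losses. A numerical check under that idealized sawtooth assumption, where `average_window` is a hypothetical helper:

```python
def average_window(w, steps=1000):
    """Mean of a window that ramps linearly from w/2 to w (one sawtooth tooth)."""
    samples = [w / 2 + (w / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


print(average_window(100))  # close to 75, i.e. 0.75 * W
```

Dividing the mean window by RTT gives the average throughput of 0.75 W/RTT.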


                                                                                                                                                                              TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

  $\text{throughput} = \frac{1.22 \cdot \text{MSS}}{\text{RTT} \sqrt{L}}$

This requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed
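Both numbers in this example can be reproduced directly. The sketch below assumes MSS is converted to bits and inverts the loss-rate formula for L; the function names are hypothetical.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight: W = throughput * RTT / MSS."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to solve for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.100, 1500)
loss = required_loss_rate(10e9, 0.100, 1500)
print(int(w))   # 83333
print(loss)     # roughly 2e-10
```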


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
  Additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
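The convergence argument can be checked with a toy simulation. It assumes both sessions have the same RTT, each adds one unit per round, and both see a loss in the same round whenever their combined rate exceeds the capacity. These are simplifications, and `two_flow_aimd` is a hypothetical helper.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return final rates.

    Assumption: losses are synchronized (both flows halve in the same round).
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # multiplicative decrease for both
        else:
            x1, x2 = x1 + 1, x2 + 1    # additive increase, slope 1
    return x1, x2


x1, x2 = two_flow_aimd(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(x1, x2)
```

Additive increase leaves the gap between the two rates unchanged, while each halving cuts the gap in half, so the rates drift toward the equal-share line.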


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
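The parallel-connection example is simple share arithmetic, assuming every TCP connection on the link gets an equal share of R. `app_share` is a hypothetical helper.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming every TCP connection on the link gets an equal share."""
    return Fraction(new_connections, existing_connections + new_connections)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```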


TCP Latency Modeling

Notation, assumptions
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent.
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent.
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Here K is the number of windows that cover the object.
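The two cases can be sketched as a small Python function. This is only an illustration of the formulas above: the function name is hypothetical, and it assumes K = ceil(O / (W·S)), a definition the slide does not spell out.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds to fetch an O-bit object with a fixed window of W segments.

    O: object size (bits), S: segment size / MSS (bits),
    R: link rate (bits per second), RTT: round-trip time (seconds).
    """
    # Assumption (not stated on the slide): K windows of W segments cover the object.
    K = math.ceil(O / (W * S))
    base = 2 * RTT + O / R
    # Time the server waits after each window before the first ACK arrives.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        return base                  # case 1: ACK returns before the window is sent
    return base + (K - 1) * stall    # case 2: server idles after each of K-1 windows
```

For instance, with S/R = 1 s, RTT = 2 s and an 8-segment object, a window of W = 2 stalls after each of the first three windows, while W = 4 never stalls.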


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

We will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size
- and K is the number of windows that cover the object
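A minimal Python sketch of this formula follows. The function name is hypothetical, and K and Q are computed by direct counting from their definitions (windows of 1, 2, 4, ... segments; idle periods that are strictly positive) rather than from any closed form.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for an O-bit object under idealized slow start.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K, covered = 0, 0
    while covered * S < O:
        covered += 2 ** K
        K += 1
    # Q: number of windows after which the server would idle for an infinite
    # object, i.e. the count of k with S/R + RTT - 2^(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P
```

With O/S = 15 segments, S/R = 1 s and RTT = 2 s, this gives K = 4, Q = 2 and P = 2: the server idles after the first and second windows only.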


TCP Latency Modeling: Slow Start (2)

[Figure: timeline with "time at client" and "time at server" axes. Labels: RTT, initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit the object
- time the server idles due to slow start

Server idles P = min{K-1, Q} times.


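TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives the acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

The last step uses the geometric sum $\sum_{k=1}^{P} 2^{k-1} = 2^P - 1$.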

[Figure: client/server timeline showing TCP connection initiation, the object request, and windows with transmission times S/R, 2S/R, 4S/R, and 8S/R until the object is delivered.]
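As a sanity check on the algebra in this derivation, the term-by-term sum of idle times can be compared numerically with the closed form. This is a sketch only: both function names are hypothetical, and P is passed in directly rather than computed.

```python
def latency_by_sum(O, S, R, RTT, P):
    """Second line of the derivation: add the first P idle periods one by one."""
    idle = sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))
    return O / R + 2 * RTT + idle

def latency_closed_form(O, S, R, RTT, P):
    """Third line of the derivation: the geometric sum collapsed to 2^P - 1."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R

# The two expressions should agree for any P (up to floating-point rounding).
for P in range(6):
    a = latency_by_sum(15000, 1000, 1000, 2, P)
    b = latency_closed_form(15000, 1000, 1000, 2, P)
    assert abs(a - b) < 1e-9
```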


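TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

The calculation of Q, the number of idles for an infinite-size object, is similar.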
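The first and last lines of this derivation can be compared directly in Python. The function names are hypothetical. The loop follows the definition of K, and the closed form uses floating-point `math.log2`, which may be off by one very close to an exact power of two for large O/S.

```python
import math

def windows_by_definition(O, S):
    """Smallest k with 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k, sent = 0, 0
    while sent < O:
        sent += 2 ** k * S   # the next window carries 2^k segments
        k += 1
    return k

def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))
```

For O/S = 15 segments both give K = 4 (windows of 1, 2, 4 and 8 segments), and a 16th segment pushes K to 5.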


HTTP Modeling

Assume a Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive the base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X is an integer
- 1 TCP connection for the base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
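The three response-time expressions can be put side by side in a short Python sketch. The function name is hypothetical. The "sum of idle times" term is deliberately left out because the slide does not spell it out for each case, so the returned values are only the transmission and RTT parts of each formula.

```python
def http_response_times(O, R, RTT, M, X):
    """Response time (seconds) per HTTP mode, excluding slow-start idle times.

    O: object size (bits), R: link rate (bits/s), RTT: seconds,
    M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R   # base page plus M images, each O bits
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }
```

With M = 10 and X = 5, the RTT terms are 22 RTT, 3 RTT and 6 RTT respectively, which is why the gap between the modes widens as RTT grows relative to O/R.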


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay·bandwidth networks.


Chapter 3: Summary

- principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- instantiation and implementation in the Internet:
  - UDP
  - TCP

Next:
- leaving the network "edge" (application, transport layers)
- into the network "core"



                                                                                                                                                                                TCP Connection Management (cont)

                                                                                                                                                                                Closing a connection

client closes socket: clientSocket.close()

Step 1: client end system sends TCP FIN control segment to server.

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

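[Figure: client/server timeline for closing a connection. Segments exchanged: FIN (client), ACK (server), FIN (server), ACK (client). Labels: close, close, timed wait, closed.]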
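From the application's point of view, the close is a single socket call; the FIN and ACK segments are generated by the operating system and are not visible to the program. The following is a minimal sketch in Python over the loopback interface, not the course's own code. The function name `demo_close` and the payload are illustrative assumptions. It shows the one effect an application can observe: once the client has closed, the server's `recv()` returns an empty byte string.

```python
import socket

def demo_close():
    """Client sends data, then closes; server reads until end-of-stream."""
    # Server side: listen on a loopback port chosen by the OS (port 0).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    # Client side: connect, send a few bytes, then close the socket.
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client_socket.connect(listener.getsockname())
    conn, _ = listener.accept()
    client_socket.sendall(b"hi")
    client_socket.close()   # the OS sends the FIN on our behalf (Step 1)

    # Server side: recv() returning b"" means the client's FIN has arrived.
    received = b""
    while True:
        chunk = conn.recv(1024)
        if not chunk:
            break
        received += chunk

    conn.close()            # server closes its side; the OS sends its FIN
    listener.close()
    return received
```

Calling `demo_close()` returns the bytes the server read before it saw end-of-stream.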


                                                                                                                                                                                TCP Connection Management (cont)

Step 3: client receives FIN, replies with ACK.

Enters "timed wait", during which it will respond with ACK to received FINs (that might arrive if ACK gets lost).

Closes down after timed wait.

Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

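[Figure: client/server timeline for closing a connection. Segments exchanged: FIN (client), ACK (server), FIN (server), ACK (client). Labels: closing, closing, timed wait, closed, closed.]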
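The four steps, and the reason for the timed wait, can be traced with a toy model. This is a sketch only: the function `close_trace` and its `lose_final_ack` flag are hypothetical names, and loss is modelled simply by marking the client's first final ACK as lost so that the server retransmits its FIN.

```python
def close_trace(lose_final_ack=False):
    """Return the events of a client-initiated TCP close (toy model)."""
    trace = []
    # Step 1: client sends FIN.
    trace.append("client -> server: FIN")
    # Step 2: server ACKs the FIN, then sends its own FIN.
    trace.append("server -> client: ACK")
    trace.append("server -> client: FIN")
    # Step 3: client ACKs the server's FIN and enters timed wait.
    trace.append("client -> server: ACK" + (" (lost)" if lose_final_ack else ""))
    trace.append("client: enters timed wait")
    if lose_final_ack:
        # The server never saw the ACK, so it retransmits its FIN.
        # The client is still in timed wait and answers with another ACK.
        trace.append("server -> client: FIN (retransmitted)")
        trace.append("client -> server: ACK")
    # Step 4: server receives the ACK and closes.
    trace.append("server: closed")
    # The client closes once the timed wait expires.
    trace.append("client: closed after timed wait")
    return trace
```

Comparing `close_trace()` with `close_trace(lose_final_ack=True)` shows why the client cannot close immediately after sending its ACK: if it did, nothing would be left to answer the retransmitted FIN.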


                                                                                                                                                                                TCP Connection Management (cont)

Example: TCP server lifecycle

Example: TCP client lifecycle
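The lifecycle diagrams did not survive extraction. As a stand-in, the two lifecycles can be written as small transition tables. This is a sketch: the state names are the standard ones from the TCP specification, while the event strings, the table names and the `walk` helper are illustrative assumptions. Only the simple path (client opens, client closes first) is covered.

```python
# (current state, event) -> next state, for a client that opens and closes first.
CLIENT_FSM = {
    ("CLOSED", "send SYN"): "SYN_SENT",
    ("SYN_SENT", "recv SYN+ACK, send ACK"): "ESTABLISHED",
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "timed wait expires"): "CLOSED",
}

# Matching table for the server side of the same connection.
SERVER_FSM = {
    ("CLOSED", "listen"): "LISTEN",
    ("LISTEN", "recv SYN, send SYN+ACK"): "SYN_RCVD",
    ("SYN_RCVD", "recv ACK"): "ESTABLISHED",
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

def walk(fsm, start, events):
    """Follow the events through the table and return the states visited."""
    state = start
    path = [state]
    for event in events:
        state = fsm[(state, event)]  # KeyError means the event is not allowed here
        path.append(state)
    return path
```

For example, `walk(CLIENT_FSM, "CLOSED", [event for _, event in CLIENT_FSM])` visits every client state in order and ends back in `CLOSED`.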


                                                                                                                                                                                A few special cases

We have not discussed what happens if both client and server decide to close the connection at the same time.

It is possible that the first ACK (from server) and the second FIN (also from server) are sent in the same segment.

Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control
manifestations:
    lost packets (buffer overflow at routers)
    long delays (queueing in router buffers)
a top-10 problem

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput
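
A minimal sketch of why scenario 1 shows both a throughput ceiling and large delays. The throughput cap of C/2 per sender comes from the slide; the delay formula is an assumption (an M/M/1 queue, which the slide does not specify), and the function names are hypothetical.

```python
def throughput(lam_in: float, capacity: float) -> float:
    """Per-connection throughput: two senders share capacity C, so each is capped at C/2."""
    return min(lam_in, capacity / 2)


def mean_delay(lam_in: float, capacity: float) -> float:
    """Mean delay under an ASSUMED M/M/1 model with total arrival rate 2 * lam_in."""
    total = 2 * lam_in
    if total >= capacity:
        return float("inf")  # infinite buffer: the queue grows without bound
    return 1.0 / (capacity - total)


C = 10.0  # hypothetical link capacity, in packets per second
for lam in (1.0, 4.0, 4.9, 4.99):
    print(lam, throughput(lam, C), mean_delay(lam, C))
```

Under this assumed model the delay climbs steeply as the send rate nears C/2, while throughput never exceeds C/2.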

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "Magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda'_{in}$ is the offered load: original data plus retransmissions.

"costs" of congestion:
(b) and (c): more work (retransmissions) for given "goodput"
(c): unneeded retransmissions; link carries multiple copies of a packet

[Figure: three panels, (a), (b) and (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

                                                                                                                                                                                Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate:
    "elastic service"
    if sender's path "underloaded": sender should use available bandwidth
    if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
    congested switch may lower ER value in cell
    sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
    signals congestion
    if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
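
The ER and EFCI rules can be sketched as follows. This is an illustration only, not the ATM cell format: the `RMCell` class, its field names, the function names and the rate units are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate requested by the sender (units assumed)
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit (severe congestion)


def pass_through_switches(cell: RMCell, supportable_rates: list[float]) -> RMCell:
    """Each switch may lower ER to what it can support, so ER ends as the path minimum."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell: RMCell, preceding_data_efci: bool) -> RMCell:
    """Destination sets CI=1 if the data cell preceding the RM cell had EFCI=1."""
    if preceding_data_efci:
        cell.ci = True
    return cell


cell = pass_through_switches(RMCell(er=100.0), [80.0, 120.0, 60.0])
cell = destination_return(cell, preceding_data_efci=True)
print(cell.er, cell.ci)
```

Taking the minimum at every hop is what makes the returned ER the lowest supportable rate on the path.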

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window, labelled CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
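
As a quick check of the formula, the sketch below plugs in the figures from the slow-start example later in this chapter (one segment, MSS = 500 bytes, RTT = 200 msec). The function name is hypothetical.

```python
def window_throughput(w_segments: float, mss_bytes: float, rtt_sec: float) -> float:
    """Bytes per second when w segments of MSS bytes each are sent in one RTT."""
    return w_segments * mss_bytes / rtt_sec


bytes_per_sec = window_throughput(1, 500, 0.2)
print(bytes_per_sec * 8 / 1000, "kbps")  # roughly 20 kbps
```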

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
    loss event = timeout or 3 duplicate ACKs
    TCP sender reduces rate (CongWin) after loss event

three mechanisms:
    AIMD = Additive Increase, Multiplicative Decrease
    slow start = CongWin set to 1 and then grows exponentially
    conservative after timeout events
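
The sender-side limit can be read as a simple check before each transmission. A minimal sketch, with a hypothetical function name and byte counters passed in as plain integers:

```python
def bytes_allowed(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """Bytes that may still be sent while keeping
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


print(bytes_allowed(last_byte_sent=3000, last_byte_acked=1000, cong_win=4000))
```

When the unacknowledged data already fills the window, the function returns 0 and the sender must wait for ACKs.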

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
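
The AIMD rules produce the familiar sawtooth. The sketch below is a toy model, not a TCP implementation: it assumes a loss occurs whenever the window reaches a fixed size `loss_at`, which stands in for the unknown point where the network starts dropping packets. The function and parameter names are hypothetical.

```python
from __future__ import annotations


def aimd_trace(start_mss: int, loss_at: int, rtts: int) -> list[float]:
    """CongWin (in units of MSS) at each RTT under AIMD.

    ASSUMPTION: a loss event happens exactly when CongWin reaches loss_at.
    """
    trace = []
    cong_win = float(start_mss)
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at:
            cong_win = max(1.0, cong_win / 2)  # multiplicative decrease
        else:
            cong_win += 1.0                    # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(start_mss=4, loss_at=8, rtts=10))
```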

                                                                                                                                                                                TCP Slow Start

When connection begins, CongWin = 1 MSS
    Example: MSS = 500 bytes & RTT = 200 msec
    initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                                                                                                                                TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A and Host B timeline; Host A sends one segment, then two segments, then four segments in successive RTTs]
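
To see why a per-ACK increment doubles the window each RTT, here is a minimal sketch. It assumes no losses and one ACK per segment; the function name is hypothetical.

```python
from __future__ import annotations


def slow_start_rounds(rounds: int, mss: int = 1) -> list[int]:
    """CongWin at the start of each RTT during slow start.

    ASSUMPTIONS: no loss, and the receiver sends one ACK per segment.
    """
    cong_win = mss
    history = []
    for _ in range(rounds):
        history.append(cong_win)
        acks = cong_win // mss       # one ACK for each segment sent this RTT
        for _ack in range(acks):
            cong_win += mss          # increment CongWin by 1 MSS per ACK
    return history


print(slow_start_rounds(4))
```

Each RTT, a window of n segments draws n ACKs, and n increments of one MSS take the window from n to 2n segments.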

So far:
    Slow-Start ramps up exponentially
    Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
    Introduce new variable, threshold
    threshold initially very large
    Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
    Two different types of loss events:
        • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
        • Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
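
The difference between Tahoe and Reno comes down to one branch. The sketch below follows the simplified rules on these slides rather than any real TCP stack; the function name, the string labels and the use of MSS as the window unit are assumptions.

```python
from __future__ import annotations


def on_loss(event: str, cong_win: float, variant: str = "reno") -> tuple[float, float, str]:
    """Return (new_cong_win, new_threshold, next_phase) after a loss event.

    event is "timeout" or "3dupack"; windows are in units of MSS.
    """
    if event not in ("timeout", "3dupack"):
        raise ValueError(f"unknown loss event: {event}")
    threshold = cong_win / 2
    if event == "3dupack" and variant == "reno":
        # Reno: halve the window and stay in AIMD (fast recovery).
        return max(threshold, 1.0), threshold, "congestion_avoidance"
    # Timeout, or any loss under Tahoe: restart from 1 MSS in slow start.
    return 1.0, threshold, "slow_start"


print(on_loss("3dupack", 16, "reno"))
print(on_loss("3dupack", 16, "tahoe"))
print(on_loss("timeout", 16, "reno"))
```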

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                                The Big Picture

                                                                                                                                                                                TCP sender congestion controlEvent State TCP Sender Action Commentary

                                                                                                                                                                                ACK receipt for previously unackeddata

                                                                                                                                                                                Slow Start (SS)

                                                                                                                                                                                CongWin = CongWin + MSS If (CongWin gt Threshold)

                                                                                                                                                                                set state to ldquoCongestion Avoidancerdquo

                                                                                                                                                                                Resulting in a doubling of CongWin every RTT

                                                                                                                                                                                ACK receipt for previously unackeddata

                                                                                                                                                                                CongestionAvoidance (CA)

                                                                                                                                                                                CongWin = CongWin+MSS (MSSCongWin)

                                                                                                                                                                                Additive increase resulting in increase of CongWin by 1 MSS every RTT

State: SS or CA
Event: loss event detected by triple duplicate ACK
Action: Threshold = CongWin/2, CongWin = Threshold, set state to "Congestion Avoidance"
Commentary: fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

State: SS or CA
Event: timeout
Action: Threshold = CongWin/2, CongWin = 1 MSS, set state to "Slow Start"
Commentary: enter slow start

State: SS or CA
Event: duplicate ACK
Action: increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
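The event/action rules in this table can be sketched as a small state machine. This is a minimal illustration, not a real TCP implementation: the class name `TcpSender`, the MSS value, the initial threshold, and the choice to reset the duplicate-ACK counter after a loss event are all assumptions made for the example. The slow-start branch (CongWin grows by 1 MSS per ACK until it exceeds Threshold) is the standard rule and is included so the sketch is self-contained.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; illustrative value, not taken from the slides


@dataclass
class TcpSender:
    """Hypothetical sender tracking only congestion-control state."""
    cong_win: float = MSS          # congestion window in bytes
    threshold: float = 64 * 1024   # assumed initial threshold
    state: str = "SS"              # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK received for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS                   # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Duplicate ACK: only the counter changes until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_duplicate_ack()

    def on_triple_duplicate_ack(self):
        """Loss detected by triple duplicate ACK: multiplicative decrease."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)   # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0                          # assumption: restart counting

    def on_timeout(self):
        """Timeout: collapse the window and re-enter slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Driving the object with a sequence of `on_new_ack`, `on_duplicate_ack` and `on_timeout` calls reproduces the familiar sawtooth in `cong_win`.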

                                                                                                                                                                                TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When the window is W, throughput is W/RTT
Just after loss, the window drops to W/2 and throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
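The 0.75 W/RTT figure can be checked numerically. The sketch below assumes the idealized sawtooth implied by the bullets: the window ramps linearly from W/2 up to W and then drops back. The function name and the sampling approach are illustrative choices.

```python
def average_sawtooth_throughput(w_max, rtt, steps=1000):
    """Average throughput over one sawtooth cycle.

    Assumes the window grows linearly from w_max/2 to w_max,
    sampled at steps+1 evenly spaced points.
    """
    total = 0.0
    for i in range(steps + 1):
        window = w_max / 2 + (w_max / 2) * i / steps
        total += window / rtt          # instantaneous throughput W/RTT
    return total / (steps + 1)


if __name__ == "__main__":
    w, rtt = 100, 0.1                  # segments, seconds (example values)
    print(average_sawtooth_throughput(w, rtt), 0.75 * w / rtt)
```

Because the ramp is linear, the average is simply the midpoint between W/(2 RTT) and W/RTT.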

                                                                                                                                                                                TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed networks are needed
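The two numbers in this example follow directly from W = throughput × RTT / MSS and from solving the loss-rate formula for L. A short sketch of the arithmetic (function names are illustrative):

```python
def required_window(throughput_bps, rtt_s, segment_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, segment_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


if __name__ == "__main__":
    # 1500-byte segments, 100 ms RTT, 10 Gbps target
    print(required_window(10e9, 0.1, 1500))
    print(required_loss_rate(10e9, 0.1, 1500))
```

The window comes out at about 83,333 segments and the tolerable loss rate at roughly 2 × 10^-10, matching the slide.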

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to R, with the equal bandwidth share line. The trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by factor of 2).]
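The convergence toward the equal-share line can be reproduced with a toy simulation. The sketch below assumes two synchronized flows that both add 1 unit per round while the link has room and both halve when their combined rate exceeds the capacity. The function name, the starting rates and the synchronized-loss assumption are illustrative, not part of the slides.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Simulate two AIMD flows sharing one bottleneck.

    Returns the list of (x1, x2) rates after each round.
    """
    history = []
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease for both flows.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase (slope 1).
            x1, x2 = x1 + 1, x2 + 1
        history.append((x1, x2))
    return history


if __name__ == "__main__":
    final = aimd_two_flows(80, 10, capacity=100, rounds=500)[-1]
    print(final)
```

Additive increase leaves the gap x1 - x2 unchanged while each multiplicative decrease halves it, so the two rates approach each other regardless of where they start.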

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections, i.e. 11R/20)
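The shares in this example follow from assuming that every TCP connection on the link gets an equal slice of R. A small sketch of that arithmetic (the function name is illustrative):

```python
from fractions import Fraction


def new_app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by the new app,
    assuming all connections share the link equally."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


if __name__ == "__main__":
    print(new_app_share(9, 1))    # one connection among ten
    print(new_app_share(9, 11))   # eleven connections among twenty
```

With 11 parallel connections the new app holds 11 of 20 connections, slightly more than half of R.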

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
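The two cases can be folded into one function. The sketch below assumes that K, the number of windows that cover the object, is computed as the ceiling of O/(WS); that rounding and the function and parameter names are assumptions made for the example.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments.

    O, S in bits, R in bits per second, RTT in seconds.
    """
    K = math.ceil(O / (W * S))            # windows covering the object (assumed)
    stall = S / R + RTT - W * S / R       # idle time after each window
    if stall <= 0:
        # Case 1: ACKs return before the window is exhausted.
        return 2 * RTT + O / R
    # Case 2: the server waits `stall` seconds after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * stall


if __name__ == "__main__":
    # Example values: 100 kbit object, 10 kbit segments, 1 Mbps link, 100 ms RTT
    print(fixed_window_latency(100_000, 10_000, 1_000_000, 0.1, W=4))
    print(fixed_window_latency(100_000, 10_000, 1_000_000, 0.1, W=20))
```

A small window (W = 4) makes the server stall between windows, while a large one (W = 20) keeps the link busy and leaves only 2RTT + O/R.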

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

                                                                                                                                                                                TCP Latency Modeling Slow Start (2)

[Figure: slow-start timing diagram with time at client and time at server: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
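The example can be reproduced with a short calculation. The sketch below computes K as the smallest number of doubling windows whose total covers the object, and Q by counting windows with a strictly positive idle time S/R + RTT - 2^(k-1) S/R. The function names, the strict-positivity convention for Q, and the concrete values S = R = 1 and RTT = 2 (chosen so that Q = 2, as in the example) are assumptions made for illustration.

```python
def windows_to_cover(O, S):
    """K: smallest k with (2^k - 1) * S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def idle_periods_infinite_object(S, R, RTT):
    """Q: number of windows followed by a strictly positive idle time."""
    q = 0
    # Idle time after window k = q + 1 is S/R + RTT - 2^(k-1) * S/R.
    while S / R + RTT - 2 ** q * S / R > 0:
        q += 1
    return q


def slow_start_latency(O, S, R, RTT):
    """Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R."""
    K = windows_to_cover(O, S)
    Q = idle_periods_infinite_object(S, R, RTT)
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    # O/S = 15 segments; S/R = 1 time unit and RTT = 2 time units (assumed)
    print(windows_to_cover(15, 1), idle_periods_infinite_object(1, 1, 2))
    print(slow_start_latency(15, 1, 1, 2))
```

With these values the server idles for 2 time units after the first window and 1 after the second, so the total is 2RTT + O/R plus 3 time units of idling.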

                                                                                                                                                                                TCP Latency Modeling (3)

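$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$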

                                                                                                                                                                                RSk

[Figure: slow-start timing diagram with time at client and time at server: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

                                                                                                                                                                                TCP Latency Modeling (4)Recall K = number of windows that cover object

                                                                                                                                                                                How do we calculate K

                                                                                                                                                                                ⎥⎥⎤

                                                                                                                                                                                ⎢⎢⎡ +=

                                                                                                                                                                                +ge=

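$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
$$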
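As a quick numerical check of the window count K from the latency model, the sketch below computes K in two ways: by direct search for the smallest k with (2^k - 1)S >= O, and by the closed form ceil(log2(O/S + 1)). It takes O to be the object size and S the segment size (MSS), both in bits. The function names and the example values (a 100,000-bit object, 536-byte segments) are illustrative assumptions, not taken from the slides.

```python
import math


def windows_to_cover(object_bits: int, segment_bits: int) -> int:
    """Smallest k with (2^k - 1) * S >= O, found by direct search."""
    k = 1
    while (2 ** k - 1) * segment_bits < object_bits:
        k += 1
    return k


def windows_closed_form(object_bits: int, segment_bits: int) -> int:
    """Closed form K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


# Hypothetical example values: 100,000-bit object, 536-byte segments.
O = 100_000
S = 536 * 8

print(windows_to_cover(O, S), windows_closed_form(O, S))  # prints: 5 5
```

The direct search uses only integer arithmetic, so it is the safer of the two when O/S + 1 lands exactly on a power of two.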

Calculation of Q, the number of idles for an infinite-size object, is similar.


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
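The three response-time formulas can be evaluated side by side. The sketch below is a minimal illustration that deliberately leaves out the "sum of idle times" term, so its results are lower bounds rather than the full model. The function name and the unit conversions (1 Kbyte = 1000 bytes, 28 Kbps = 28,000 bits/sec) are assumptions made for the example.

```python
def http_response_times(O_bits, R_bps, rtt_s, M, X):
    """Response times in seconds for the three HTTP variants.

    The slow-start idle-time term of the model is NOT included.
    """
    transmit = (M + 1) * O_bits / R_bps  # (M+1) O/R
    return {
        "non_persistent": transmit + (M + 1) * 2 * rtt_s,
        "persistent": transmit + 3 * rtt_s,
        "parallel_non_persistent": transmit + (M / X + 1) * 2 * rtt_s,
    }


# Assumed example: O = 5 Kbytes, R = 28 Kbps, RTT = 100 msec, M = 10, X = 5.
times = http_response_times(O_bits=5 * 1000 * 8, R_bps=28_000,
                            rtt_s=0.1, M=10, X=5)
for name, seconds in times.items():
    print(name, round(seconds, 3))
# non_persistent 17.914
# persistent 16.014
# parallel_non_persistent 16.314
```

With these values the transmission term (M+1)O/R is about 15.7 seconds, so the RTT terms change the total only slightly.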


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 20 seconds) of non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (0 to 70 seconds) of non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay-bandwidth networks.
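The contrast between the two parameter sets can be made concrete by asking what fraction of the parallel non-persistent response time a persistent connection saves. The sketch below again drops the idle-time term, so it only captures the connection-setup part of the effect and will not reproduce the plotted values exactly. The function name and unit conversions (1 Kbyte = 1000 bytes) are assumptions.

```python
def persistent_saving(O_bits, R_bps, rtt_s, M, X):
    """Fraction of the parallel non-persistent response time saved by
    persistent HTTP, with the idle-time term of the model ignored."""
    transmit = (M + 1) * O_bits / R_bps
    persistent = transmit + 3 * rtt_s
    parallel = transmit + (M / X + 1) * 2 * rtt_s
    return (parallel - persistent) / parallel


O_BITS = 5 * 1000 * 8  # O = 5 Kbytes, assuming 1 Kbyte = 1000 bytes

# Low bandwidth, small RTT: transmission time dominates.
print(round(persistent_saving(O_BITS, 28_000, 0.1, M=10, X=5), 3))      # 0.018

# High bandwidth, large RTT: round trips dominate.
print(round(persistent_saving(O_BITS, 10_000_000, 1.0, M=10, X=5), 3))  # 0.496
```

Under these assumptions the saving is under 2% at 28 Kbps with RTT = 100 msec, and close to half at 10 Mbps with RTT = 1 sec.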


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                                • Chapter 3: Transport Layer (last revised 16/03/05)
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • Transport services and protocols
                                                                                                                                                                                • Transport vs network layer
                                                                                                                                                                                • Transport-layer protocols
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                                                                • How demultiplexing works
                                                                                                                                                                                • Connectionless demultiplexing
                                                                                                                                                                                • Connectionless demux (cont)
                                                                                                                                                                                • Connection-oriented demux
                                                                                                                                                                                • Connection-oriented demux (cont)
                                                                                                                                                                                • Connection-oriented demux: Threaded Web Server
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • UDP: User Datagram Protocol [RFC 768]
                                                                                                                                                                                • UDP: more
                                                                                                                                                                                • UDP checksum
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • Principles of Reliable data transfer
                                                                                                                                                                                • Reliable data transfer: getting started
                                                                                                                                                                                • Reliable data transfer: getting started
                                                                                                                                                                                • Incremental Improvements
                                                                                                                                                                                • Rdt1.0: reliable transfer over a reliable channel
                                                                                                                                                                                • Rdt2.0: channel with bit errors
                                                                                                                                                                                • rdt2.0: FSM specification
                                                                                                                                                                                • rdt2.0: operation with no errors
                                                                                                                                                                                • rdt2.0: error scenario
                                                                                                                                                                                • rdt2.0 has a fatal flaw
                                                                                                                                                                                • rdt2.1: sender, handles garbled ACK/NAKs
                                                                                                                                                                                • rdt2.1: receiver, handles garbled ACK/NAKs
                                                                                                                                                                                • rdt2.1: discussion
                                                                                                                                                                                • rdt2.2: a NAK-free protocol
                                                                                                                                                                                • rdt2.2: sender, receiver fragments
                                                                                                                                                                                • rdt3.0: channels with errors and loss
                                                                                                                                                                                • rdt3.0 sender
                                                                                                                                                                                • rdt3.0 in action
                                                                                                                                                                                • rdt3.0 in action
                                                                                                                                                                                • Performance of rdt3.0
                                                                                                                                                                                • rdt3.0: stop-and-wait operation
                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                • Pipelining: increased utilization
                                                                                                                                                                                • Go-Back-N
                                                                                                                                                                                • GBN Sender
                                                                                                                                                                                • GBN: sender extended FSM
                                                                                                                                                                                • GBN: receiver extended FSM
                                                                                                                                                                                • More on receiver
                                                                                                                                                                                • GBN in action
                                                                                                                                                                                • Selective Repeat
                                                                                                                                                                                • Selective repeat: sender, receiver windows
                                                                                                                                                                                • Selective repeat
                                                                                                                                                                                • Selective repeat in action
                                                                                                                                                                                • Selective repeat dilemma
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                • More TCP Details
                                                                                                                                                                                • Even More TCP Details
                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                • TCP seq. #'s and ACKs
                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                • Example RTT estimation
                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • TCP reliable data transfer
                                                                                                                                                                                • TCP sender events
                                                                                                                                                                                • TCP sender (simplified)
                                                                                                                                                                                • TCP retransmission scenarios
                                                                                                                                                                                • TCP retransmission scenarios (more)
                                                                                                                                                                                • TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                                • More on Sender Policies
                                                                                                                                                                                • Fast Retransmit
                                                                                                                                                                                • Fast retransmit algorithm
                                                                                                                                                                                • TCP: GBN or Selective Repeat?
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                • TCP Flow control: how it works
                                                                                                                                                                                • Technical Issue
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • TCP Connection Management
                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                • A few special cases
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • Principles of Congestion Control
                                                                                                                                                                                • Causes/costs of congestion: scenario 1
                                                                                                                                                                                • Causes/costs of congestion: scenario 2
                                                                                                                                                                                • Causes/costs of congestion: scenario 3
                                                                                                                                                                                • Causes/costs of congestion: scenario 3
                                                                                                                                                                                • Approaches towards congestion control
                                                                                                                                                                                • Case study: ATM ABR congestion control
                                                                                                                                                                                • Case study: ATM ABR congestion control
                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                • TCP Congestion Control
                                                                                                                                                                                • TCP AIMD
                                                                                                                                                                                • TCP Slow Start
                                                                                                                                                                                • TCP Slow Start (more)
                                                                                                                                                                                • Summary: TCP Congestion Control
                                                                                                                                                                                • The Big Picture
                                                                                                                                                                                • TCP sender congestion control
                                                                                                                                                                                • TCP throughput
                                                                                                                                                                                • TCP Futures
                                                                                                                                                                                • TCP Fairness
                                                                                                                                                                                • Why is TCP fair
                                                                                                                                                                                • Fairness (more)
                                                                                                                                                                                • TCP Latency Modeling
                                                                                                                                                                                • Fixed Congestion Window (W)
                                                                                                                                                                                • Fixed congestion window (1)
                                                                                                                                                                                • Fixed congestion window (2)
                                                                                                                                                                                • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                • TCP Latency Modeling (3)
                                                                                                                                                                                • TCP Latency Modeling (4)
                                                                                                                                                                                • HTTP Modeling
                                                                                                                                                                                • Chapter 3 Summary

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.

Enters "timed wait", during which it responds with an ACK to any received FIN (the server may resend its FIN if the client's ACK gets lost).

Closes down after timed wait.

Step 4: server receives ACK. Connection closed.

Note: with a small modification, this can handle simultaneous FINs.

[Figure: client/server timeline of connection teardown, showing the FIN and ACK segments exchanged and the closing, timed wait, and closed states.]
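The client side of the close sequence can be sketched as a small state machine. This is a simplified illustration, not a full TCP implementation: the class name `ClosingClient`, the state names, and the string segments `"FIN"`/`"ACK"` are all assumptions made for the sketch, and the real FIN-wait substates are collapsed into one.

```python
from enum import Enum, auto


class ClientState(Enum):
    ESTABLISHED = auto()
    FIN_WAIT = auto()    # FIN sent; waiting for the server's ACK and FIN (simplified)
    TIMED_WAIT = auto()  # final ACK sent; still answering repeated FINs
    CLOSED = auto()


class ClosingClient:
    """Hypothetical model of the client half of TCP connection teardown."""

    def __init__(self):
        self.state = ClientState.ESTABLISHED
        self.sent = []  # segments this client has transmitted, in order

    def close(self):
        # Client initiates teardown by sending FIN.
        self.sent.append("FIN")
        self.state = ClientState.FIN_WAIT

    def receive(self, segment):
        # Step 3: on FIN, reply with ACK and enter (or stay in) timed wait.
        # A repeated FIN during timed wait means our ACK was lost, so ACK again.
        if segment == "FIN" and self.state in (ClientState.FIN_WAIT, ClientState.TIMED_WAIT):
            self.sent.append("ACK")
            self.state = ClientState.TIMED_WAIT
        # An ACK of our own FIN needs no reply in this simplified model.

    def timed_wait_expired(self):
        # Close down once the timed-wait period is over.
        if self.state is ClientState.TIMED_WAIT:
            self.state = ClientState.CLOSED


client = ClosingClient()
client.close()
client.receive("ACK")   # server acknowledges the client's FIN
client.receive("FIN")   # server's FIN arrives
client.receive("FIN")   # server resends FIN because the client's ACK was lost
client.timed_wait_expired()
```

The point of timed wait shows up in the second `receive("FIN")`: the client is still around to send the ACK again.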

TCP Connection Management (cont.)

[Figure: example TCP server lifecycle]

[Figure: example TCP client lifecycle]

A few special cases

We have not discussed what happens if both client and server decide to close the connection at the same time.

It is possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.


Principles of Congestion Control

Congestion:
informally: "too many sources sending too much data too fast for network to handle"
different from flow control!
manifestations:
• lost packets (buffer overflow at routers)
• long delays (queuing in router buffers)

a top-10 problem!

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
Send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)

(a) Magic transmission: only send when there's space in the buffer

(b) "Perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$, where $\lambda'_{in}$ is the offered load (original data plus retransmissions)

(c) Retransmission of delayed (not lost) packets makes $\lambda'_{in}$ larger (than in the perfect case) for the same $\lambda_{out}$

"Costs" of congestion:
(b) and (c): more work (retransmissions) for a given "goodput"
(c): unneeded retransmissions; link carries multiple copies of a packet

[Figure: three plots, panels (a), (b), and (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ (original and offered load) increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
• single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
• explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
"elastic service"
if sender's path "underloaded": sender should use available bandwidth
if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
• NI bit: no increase in rate (mild congestion)
• CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
• small exception: see next slide

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
congested switch may lower ER value in cell
sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by congested switch
signals congestion
if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
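A minimal sketch of the two rules above, assuming each switch knows a single "supportable rate" for the flow. The function names and the numeric rates are illustrative only and do not come from the ATM specification.

```python
def er_after_path(initial_er, supportable_rates):
    """Return the ER value carried by an RM cell after crossing every switch.

    Each switch may only lower the ER field, never raise it, so the result
    is the minimum supportable rate on the path (capped by the initial ER).
    """
    er = initial_er
    for rate in supportable_rates:
        er = min(er, rate)
    return er


def ci_bit_on_return(ci_bit, preceding_data_cell_efci):
    """CI bit in the RM cell as the destination returns it to the source.

    The destination sets CI=1 if the data cell that preceded the RM cell
    arrived with EFCI=1; otherwise the bit is returned unchanged.
    """
    return 1 if (ci_bit == 1 or preceding_data_cell_efci == 1) else 0


# Hypothetical path of three switches, rates in arbitrary units.
rate = er_after_path(100.0, [80.0, 40.0, 60.0])
```

Here `rate` ends up as 40.0, the rate of the most constrained switch.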


TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: a congestion window (CongWin) spanning w segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = $\frac{w \cdot MSS}{RTT}$ bytes/sec
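As a quick worked example of the throughput formula, the sketch below plugs in sample numbers. The function name and the values (10 segments, 1460-byte MSS, 0.5 s RTT) are assumptions chosen for illustration.

```python
def tcp_throughput(w, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w * mss_bytes / rtt_seconds


# 10 segments of 1460 bytes every 0.5 s
example = tcp_throughput(10, 1460, 0.5)
print(example)  # 29200.0 bytes/sec
```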

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase, Multiplicative Decrease
slow start = CongWin set to 1 and then grows exponentially
conservative after timeout events
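The sender-side limit LastByteSent - LastByteAcked ≤ CongWin can be read as "how many more bytes may go out right now". A small sketch, with a hypothetical helper name and byte counts picked for illustration:

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """Bytes the sender may still transmit without exceeding CongWin.

    Unacknowledged (in-flight) data is LastByteSent - LastByteAcked;
    whatever is left of the window may be sent. Assumes the receive
    buffer is never the bottleneck, as on the slide.
    """
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


room = bytes_allowed(last_byte_sent=3000, last_byte_acked=1000, cong_win=4000)
print(room)  # 2000
```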

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
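The AIMD rule can be traced RTT by RTT. The sketch below is an illustration only: it counts CongWin in whole MSS units, uses integer halving, and takes the set of RTTs in which a loss event occurs as an input, all of which are simplifying assumptions.

```python
def aimd_trace(start_win, rtts, loss_rtts=frozenset()):
    """Return CongWin (in MSS) at the end of each RTT under AIMD.

    Additive increase: +1 MSS per RTT without loss.
    Multiplicative decrease: halve the window in an RTT with a loss event.
    """
    win = start_win
    trace = []
    for rtt in range(rtts):
        if rtt in loss_rtts:
            win = max(1, win // 2)  # never drop below 1 MSS
        else:
            win += 1
        trace.append(win)
    return trace


trace = aimd_trace(start_win=8, rtts=6, loss_rtts={3})
print(trace)  # [9, 10, 11, 5, 6, 7]
```

Repeating this over many loss events gives the up-and-down window behavior of a long-lived connection.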

TCP Slow Start

When connection begins, CongWin = 1 MSS

Example: MSS = 500 bytes and RTT = 200 msec, so initial rate = 20 kbps

available bandwidth may be >> MSS/RTT

desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event
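The 20 kbps figure in the example follows from sending one MSS per RTT. A short check of the arithmetic, with a helper name made up for this sketch:

```python
def initial_rate_bps(mss_bytes, rtt_ms):
    """Sending rate in bits/sec when exactly one MSS is sent per RTT."""
    bits_per_rtt = mss_bytes * 8
    return bits_per_rtt * 1000 / rtt_ms  # convert per-millisecond RTT to per-second


rate_bps = initial_rate_bps(mss_bytes=500, rtt_ms=200)
print(rate_bps)  # 20000.0 bits/sec, i.e. 20 kbps
```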

                                                                                                                                                                                  3 Transport Layer 107Comp 361 Spring 2005

                                                                                                                                                                                  TCP Slow Start (more)

                                                                                                                                                                                  When connection begins increase rate exponentially until first loss event

                                                                                                                                                                                  double CongWin every RTTdone by incrementing CongWin for every ACK received

                                                                                                                                                                                  Summary initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B, one RTT per round: one segment, then two segments, then four segments]
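The doubling can be illustrated with a short sketch. It assumes CongWin is counted in segments, that every segment sent in an RTT is ACKed within that RTT, and that there is no loss; the function name `slow_start_trace` is hypothetical.

```python
def slow_start_trace(rtts, initial_cong_win=1):
    """Return CongWin (in segments) at the start of each RTT during slow start.

    Assumes no loss and that every segment sent in an RTT is ACKed in that RTT.
    """
    cong_win = initial_cong_win
    trace = [cong_win]
    for _ in range(rtts):
        acks_this_rtt = cong_win      # one ACK per segment in flight
        for _ in range(acks_this_rtt):
            cong_win += 1             # increment CongWin for every ACK received
        trace.append(cong_win)
    return trace


print(slow_start_trace(3))  # [1, 2, 4, 8]
```

Adding one segment per ACK is what produces the doubling: a window of n segments yields n ACKs, so the next window is 2n.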


So far:
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno):
Introduce a new variable, threshold
threshold is initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold to the new CongWin (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss event the same and always went to slow start with CongWin = 1 after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery
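The difference between the two variants can be sketched as a single function. This is a simplified illustration, not a real TCP implementation: CongWin is in units of MSS, and the function name, the event labels "3dupack" and "timeout", and the returned tuple layout are all hypothetical.

```python
def react_to_loss(cong_win, event, variant="reno"):
    """Return (new_cong_win, new_threshold, next_phase) after a loss event.

    cong_win is in units of MSS; event is "3dupack" or "timeout".
    """
    if event not in ("3dupack", "timeout"):
        raise ValueError(f"unknown loss event: {event}")
    threshold = cong_win / 2
    if event == "3dupack" and variant == "reno":
        # Fast recovery: skip slow start and continue in AIMD from half the window.
        return max(threshold, 1), threshold, "congestion avoidance"
    # Timeout (either variant), or any loss event in Tahoe: back to slow start.
    return 1, threshold, "slow start"


print(react_to_loss(16, "3dupack", "reno"))   # (8.0, 8.0, 'congestion avoidance')
print(react_to_loss(16, "3dupack", "tahoe"))  # (1, 8.0, 'slow start')
```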


Summary: TCP Congestion Control

When CongWin is below Threshold, the sender is in the slow-start phase; the window grows exponentially

When CongWin is above Threshold, the sender is in the congestion-avoidance phase; the window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)
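A rough per-RTT view of the first two rules, with no loss events, might look like the following. CongWin is in units of MSS. Capping the slow-start doubling at Threshold is an assumption of this sketch, and `window_trace` is a hypothetical name.

```python
def window_trace(threshold, rounds, cong_win=1):
    """CongWin (in MSS) per transmission round, with no loss events."""
    trace = [cong_win]
    for _ in range(rounds):
        if cong_win < threshold:
            # Slow start: double, but do not overshoot Threshold (assumption).
            cong_win = min(2 * cong_win, threshold)
        else:
            # Congestion avoidance: grow by 1 MSS per RTT.
            cong_win += 1
        trace.append(cong_win)
    return trace


print(window_trace(threshold=8, rounds=6))  # [1, 2, 4, 8, 9, 10, 11]
```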


                                                                                                                                                                                  The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
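The sender actions above can be sketched as a small event-driven class. This is a toy model under stated assumptions: sizes are in bytes, the class and method names are hypothetical, the third duplicate ACK is what triggers the triple-duplicate-ACK action, and the duplicate count is reset on every new ACK or loss event.

```python
class RenoSender:
    """Toy model of the sender actions listed above; sizes are in bytes."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss           # start with one segment
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_dup_ack()

    def on_triple_dup_ack(self):
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, self.mss)   # not below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)  # CA 5000
```

In the usage above, four ACKs in slow start take CongWin from 1000 to 5000 bytes, which exceeds Threshold and moves the sender to congestion avoidance.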


                                                                                                                                                                                  TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When the window is W, throughput is W/RTT
Just after loss, the window drops to W/2 and throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
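The 0.75 factor follows from averaging over one sawtooth cycle, assuming the throughput climbs linearly from W/(2 RTT) to W/RTT between loss events (the idealized AIMD pattern, with slow start ignored):

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\,\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
\]
```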


                                                                                                                                                                                  TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

This requires L = 2 · 10^-10. Wow!
New versions of TCP for high-speed needed
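The two numbers in the example can be checked with a few lines of arithmetic. The sketch below assumes MSS equals the 1500-byte segment size and inverts the relation throughput = 1.22 · MSS / (RTT · sqrt(L)) for L; the variable names are illustrative only.

```python
SEGMENT_BITS = 1500 * 8      # 1500-byte segments
RTT = 0.100                  # seconds
TARGET_BPS = 10e9            # 10 Gbps

# Window needed to keep the pipe full: segments in flight per RTT at the target rate.
window_segments = TARGET_BPS * RTT / SEGMENT_BITS

# Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to get the loss rate L.
loss_rate = (1.22 * SEGMENT_BITS / (RTT * TARGET_BPS)) ** 2

print(f"W = {window_segments:.0f} segments")   # W = 83333 segments
print(f"L = {loss_rate:.1e}")                  # L = 2.1e-10
```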


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
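The convergence toward the equal bandwidth share can be illustrated with a small simulation. This sketch assumes two flows that see every loss event at the same time, share a link of the given capacity, and use the same additive step; `aimd_two_flows` and its parameters are hypothetical names.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Simulate two synchronized AIMD flows sharing one bottleneck link."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Loss: both flows decrease their window by a factor of 2.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both flows add the same increment.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


x1, x2 = aimd_two_flows(80.0, 10.0, capacity=100.0, steps=1000)
print(abs(x1 - x2) < 1e-6)  # True
```

Additive increase leaves the difference between the two rates unchanged, while each multiplicative decrease halves it, so the gap shrinks toward zero regardless of the starting point.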


Fairness (more)

Fairness and UDP

Multimedia apps often do not use TCP

they do not want their rate throttled by congestion control

Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss

Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections

Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections

new app asks for 1 TCP connection, gets rate R/10
new app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections)
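The shares in the example follow from dividing the link equally among all connections. A minimal sketch, assuming every TCP connection on the link gets the same rate (the function name `app_share` is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app opening new_connections,
    assuming every TCP connection on the link gets an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```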


TCP Latency Modeling

Notation, assumptions:

Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
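Both cases can be folded into one function. The sketch below assumes K is the number of windows needed to cover the object, computed as O/(WS) rounded up; the function name and argument order are hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an object of O bits with a fixed window.

    S: segment size in bits, R: link rate in bits/s, W: window in segments.
    """
    K = math.ceil(O / (W * S))          # windows covering the object (assumed definition)
    stall = S / R + RTT - W * S / R     # idle time per window; <= 0 means no stalling
    latency = 2 * RTT + O / R
    if stall > 0:
        # Case 2: the server waits for an ACK after each of the first K-1 windows.
        latency += (K - 1) * stall
    return latency


# S/R = 1 s and RTT = 2 s, so RTT + S/R = 3 s.
print(fixed_window_latency(8000, 1000, 1000, 2, W=4))  # 12.0 (case 1: WS/R = 4 s)
print(fixed_window_latency(8000, 1000, 1000, 2, W=2))  # 15.0 (case 2: WS/R = 2 s)
```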


                                                                                                                                                                                  Fixed congestion window (1)

First case: WS/R > RTT + S/R, so the ACK for the first segment in the window returns before a window's worth of data has been sent

latency = 2RTT + O/R


                                                                                                                                                                                  Fixed congestion window (2)

Second case: WS/R < RTT + S/R, so the server waits for an ACK after sending a window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                                                                                                  TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at the server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server would idle if the object were of infinite size

- and K is the number of windows that cover the object
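The formula can be evaluated with a short sketch. It assumes the windows grow as 1, 2, 4, ... segments, that K is the smallest count of such windows covering the object, and that Q counts the windows k for which S/R + RTT - 2^(k-1) S/R is non-negative; the function name and the choice to return K, Q, and P alongside the latency are illustrative.

```python
def slow_start_latency(O, S, R, RTT):
    """Closed-form latency under slow start with no threshold and no loss."""
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of times the server would idle for an infinite object, i.e. the
    # count of windows k with S/R + RTT - 2^(k-1) * S/R >= 0 (assumed definition).
    Q = 0
    while S / R + RTT - 2 ** Q * S / R >= 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# 15 segments of 1000 bits on a 1000 bit/s link (S/R = 1 s) with RTT = 2 s.
print(slow_start_latency(15000, 1000, 1000, 2))  # (22.0, 4, 2, 2)
```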


                                                                                                                                                                                  TCP Latency Modeling Slow Start (2)

[Figure: timing diagram with time at client and time at server: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
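The three delay components can also be added up window by window. This sketch assumes the server may resume sending S/R + RTT after it starts transmitting a window (when the ACK for the window's first segment returns), and it picks S/R = 1 s and RTT = 2 s as illustrative values that give Q = 2 for the 15-segment example; `slow_start_timeline` is a hypothetical name.

```python
def slow_start_timeline(O, S, R, RTT):
    """Return ([(segments, transmit_time, idle_time), ...], total_delay)."""
    remaining = -(-O // S)            # segments to send (ceiling division)
    window = 1
    rows = []
    while remaining > 0:
        sent = min(window, remaining)
        remaining -= sent
        transmit = sent * S / R
        # After a non-final window the server idles until the first ACK arrives,
        # unless that ACK already came back while the window was being sent.
        idle = max(0.0, S / R + RTT - transmit) if remaining > 0 else 0.0
        rows.append((sent, transmit, idle))
        window *= 2
    total = 2 * RTT + sum(t + i for _, t, i in rows)
    return rows, total


rows, total = slow_start_timeline(15000, 1000, 1000, 2)
for segments, transmit, idle in rows:
    print(segments, transmit, idle)
print(total)  # 22.0
```

With these values there are four windows (1, 2, 4, and 8 segments), and only the first two are followed by idle time, matching K = 4 and P = 2 in the example.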


                                                                                                                                                                                  TCP Latency Modeling (3)

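\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[
\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
             &= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
             &= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]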

                                                                                                                                                                                  =

                                                                                                                                                                                  sum

                                                                                                                                                                                  sum

                                                                                                                                                                                  th window after the timeidle 2 1 kRSRTT

                                                                                                                                                                                  RS k =⎥⎦

                                                                                                                                                                                  ⎤⎢⎣⎡ minus+

                                                                                                                                                                                  +minus

                                                                                                                                                                                  window kth the transmit totime2 1 =minus

                                                                                                                                                                                  RSk

                                                                                                                                                                                  RTT

                                                                                                                                                                                  initiate TCPconnection

                                                                                                                                                                                  requestobject

                                                                                                                                                                                  first window= SR

                                                                                                                                                                                  second window= 2SR

                                                                                                                                                                                  third window= 4SR

                                                                                                                                                                                  fourth window= 8SR

                                                                                                                                                                                  completetransmissionobject

                                                                                                                                                                                  delivered

                                                                                                                                                                                  time atclient

                                                                                                                                                                                  time atserver


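TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.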
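The latency model above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names `num_windows` and `slow_start_delay` are hypothetical, and the idle time is summed window by window using the [S/R + RTT - 2^(k-1) S/R]^+ term rather than the closed form in P. It assumes O and S are in bits, R in bits per second, and RTT in seconds.

```python
def num_windows(O, S):
    """K: smallest k such that the first k slow-start windows cover the object,
    i.e. (2^k - 1) * S >= O. An integer loop avoids floating-point log issues."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def slow_start_delay(O, S, R, rtt):
    """Delay = 2*RTT + O/R + sum of server idle times.

    The server can idle only after windows 1..K-1; the idle time after the
    kth window is max(0, S/R + RTT - 2^(k-1) * S/R).
    """
    K = num_windows(O, S)
    idle = sum(max(0.0, S / R + rtt - 2 ** (k - 1) * S / R) for k in range(1, K))
    return 2 * rtt + O / R + idle


# Illustrative values (assumptions): 15 segments of 1000 bits, 10 kbps link, RTT = 0.25 s
print(num_windows(15000, 1000))
print(slow_start_delay(15000, 1000, 10000, 0.25))
```

With these illustrative values the server idles after the first and second windows only, so the window-by-window sum agrees with the closed form for P = 2.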


HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X is an integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
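To compare the three response-time formulas side by side, they can be evaluated directly. This is a rough sketch under stated assumptions: the function name `http_response_times` is hypothetical, the sum of slow-start idle times is passed in as a single `idle` parameter (defaulting to 0, which understates the real delay), and 1 Kbyte is taken as 8000 bits.

```python
def http_response_times(O, R, rtt, M, X, idle=0.0):
    """Response times in seconds for the three HTTP models.

    O: object size in bits, R: link rate in bits/s, rtt: round-trip time in s,
    M: number of images, X: number of parallel connections (M/X assumed integer),
    idle: sum of slow-start idle times in s (assumption: supplied by the caller).
    """
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt + idle,
        "persistent": transmit + 3 * rtt + idle,
        "parallel non-persistent": transmit + (M / X + 1) * 2 * rtt + idle,
    }


# Illustrative values: RTT = 100 msec, O = 5 Kbytes (40000 bits), M = 10, X = 5, R = 1 Mbps
for model, seconds in http_response_times(40000, 1_000_000, 0.1, 10, 5).items():
    print(f"{model}: {seconds:.2f} s")
```

Ignoring idle times, the only term that differs between the models is the RTT multiplier, which is why the gap between them grows with RTT.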


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis 0 to 20) at link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis 0 to 70) at link rates 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay·bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                  • Transport services and protocols
                                                                                                                                                                                  • Transport vs network layer
                                                                                                                                                                                  • Transport-layer protocols
                                                                                                                                                                                  • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                                                                  • How demultiplexing works
                                                                                                                                                                                  • Connectionless demultiplexing
                                                                                                                                                                                  • Connectionless demux (cont)
                                                                                                                                                                                  • Connection-oriented demux
                                                                                                                                                                                  • Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
                                                                                                                                                                                  • Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
                                                                                                                                                                                  • UDP checksum
                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                  • Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                                                                                                                  • Pipelined protocols
                                                                                                                                                                                  • Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN: Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
                                                                                                                                                                                  • Selective repeat
                                                                                                                                                                                  • Selective repeat in action
                                                                                                                                                                                  • Selective repeat dilemma
                                                                                                                                                                                  • Chapter 3 outline
• TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                  • More TCP Details
                                                                                                                                                                                  • Even More TCP Details
                                                                                                                                                                                  • TCP segment structure
                                                                                                                                                                                  • TCP seq. #'s and ACKs
                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                  • Example RTT estimation
                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                  • TCP reliable data transfer
                                                                                                                                                                                  • TCP sender events
                                                                                                                                                                                  • TCP sender (simplified)
                                                                                                                                                                                  • TCP retransmission scenarios
                                                                                                                                                                                  • TCP retransmission scenarios (more)
                                                                                                                                                                                  • TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                                  • More on Sender Policies
                                                                                                                                                                                  • Fast Retransmit
                                                                                                                                                                                  • Fast retransmit algorithm
                                                                                                                                                                                  • TCP: GBN or Selective Repeat?
                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                  • TCP Flow Control
                                                                                                                                                                                  • TCP Flow Control
                                                                                                                                                                                  • TCP segment structure
                                                                                                                                                                                  • TCP Flow control: how it works
                                                                                                                                                                                  • Technical Issue
                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                  • TCP Connection Management
                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                  • A few special cases
                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                  • Principles of Congestion Control
                                                                                                                                                                                  • Causes/costs of congestion: scenario 1
                                                                                                                                                                                  • Causes/costs of congestion: scenario 2
                                                                                                                                                                                  • Causes/costs of congestion: scenario 3
                                                                                                                                                                                  • Causes/costs of congestion: scenario 3
                                                                                                                                                                                  • Approaches towards congestion control
                                                                                                                                                                                  • Case study: ATM ABR congestion control
                                                                                                                                                                                  • Case study: ATM ABR congestion control
                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                  • TCP Congestion Control
                                                                                                                                                                                  • TCP AIMD
                                                                                                                                                                                  • TCP Slow Start
                                                                                                                                                                                  • TCP Slow Start (more)
                                                                                                                                                                                  • Summary TCP Congestion Control
                                                                                                                                                                                  • The Big Picture
                                                                                                                                                                                  • TCP sender congestion control
                                                                                                                                                                                  • TCP throughput
                                                                                                                                                                                  • TCP Futures
                                                                                                                                                                                  • TCP Fairness
                                                                                                                                                                                  • Why is TCP fair?
                                                                                                                                                                                  • Fairness (more)
                                                                                                                                                                                  • TCP Latency Modeling
                                                                                                                                                                                  • Fixed Congestion Window (W)
                                                                                                                                                                                  • Fixed congestion window (1)
                                                                                                                                                                                  • Fixed congestion window (2)
                                                                                                                                                                                  • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                  • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                  • TCP Latency Modeling (3)
                                                                                                                                                                                  • TCP Latency Modeling (4)
                                                                                                                                                                                  • HTTP Modeling
                                                                                                                                                                                  • Chapter 3 Summary

                                                                                                                                                                                    TCP Connection Management (cont)

[Figures: example TCP server lifecycle; example TCP client lifecycle]

                                                                                                                                                                                    A few special cases

• We have not discussed what happens if both client and server decide to close the connection at the same time.

• It is possible for the first ACK (from server) and the second FIN (also from server) to be sent in the same segment.

Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for network to handle"
• different from flow control!
• manifestations:
    • lost packets (buffer overflow at routers)
    • long delays (queuing in router buffers)
• a top-10 problem!

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2

• large delays when congested
• maximum achievable throughput
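A minimal sketch of scenario 1, assuming two senders that each offer `lambda_in` to a shared link of capacity `capacity`. The throughput cap of C/2 per connection comes from the slide. The delay formula is an assumption borrowed from an M/M/1 queue, used here only to show how delay blows up as the send rate approaches C/2; the function names are hypothetical.

```python
def per_connection_throughput(lambda_in: float, capacity: float) -> float:
    """Delivered rate for one of two senders sharing a link of rate `capacity`.

    With infinite buffers and no retransmission nothing is lost, so each
    connection gets what it sends, up to its half of the link (C/2).
    """
    return min(lambda_in, capacity / 2)


def avg_queueing_delay(lambda_in: float, capacity: float) -> float:
    """ASSUMPTION: M/M/1-style delay 1 / (C - total arrival rate).

    The slide only says delays become large when congested; this formula
    is one common model of that behaviour, not something the slide states.
    """
    total_arrivals = 2 * lambda_in  # two senders feed the same router
    if total_arrivals >= capacity:
        return float("inf")
    return 1.0 / (capacity - total_arrivals)


if __name__ == "__main__":
    C = 1.0
    for rate in (0.10, 0.25, 0.40, 0.45, 0.49):
        print(rate, per_connection_throughput(rate, C), avg_queueing_delay(rate, C))
```

Under this model the throughput curve flattens at C/2 while the delay curve rises without bound, which matches the two observations on the slide.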

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

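• (a), (b) and (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
• (a) magic transmission: only send when there's space in buffer
• (b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
• (c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

Here $\lambda_{in}$ is the rate of original data and $\lambda'_{in}$ is the offered load (original data plus retransmissions).

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for given "goodput"
• (c): unneeded retransmissions, so link carries multiple copies of packet

[Figure: three plots, panels (a), (b) and (c)]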
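The three cases can be compared with a small calculation. This is a sketch under stated assumptions, not a model taken from the slides: each transmission is dropped independently with probability `loss_prob`, and in case (c) a fraction `dup_prob` of the delivered packets are unneeded duplicates. The function name and both parameters are hypothetical.

```python
def offered_load(goodput: float, loss_prob: float = 0.0, dup_prob: float = 0.0) -> float:
    """Offered load (original data plus retransmissions) needed for `goodput`.

    ASSUMPTION: of everything sent, a fraction (1 - loss_prob) survives the
    router, and of that a fraction (1 - dup_prob) is useful, so
        goodput = offered_load * (1 - loss_prob) * (1 - dup_prob).
    """
    if not (0.0 <= loss_prob < 1.0 and 0.0 <= dup_prob < 1.0):
        raise ValueError("probabilities must be in [0, 1)")
    return goodput / ((1.0 - loss_prob) * (1.0 - dup_prob))


if __name__ == "__main__":
    g = 0.3
    print("(a) no loss:              ", offered_load(g))
    print("(b) loss only:            ", offered_load(g, loss_prob=0.25))
    print("(c) loss plus duplicates: ", offered_load(g, loss_prob=0.25, dup_prob=0.5))
```

For the same goodput, the load the sender must offer grows from case (a) to (b) to (c), which is the "more work for given goodput" cost.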

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when packet dropped, any "upstream" transmission capacity used for that packet was wasted!
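A rough way to quantify this cost, assuming a packet dropped at hop k (counting from 1) has already crossed k - 1 upstream links. The function name and the example path are hypothetical.

```python
def wasted_link_transmissions(drop_hops: list) -> int:
    """Count upstream link transmissions wasted by dropped packets.

    `drop_hops` holds, for each dropped packet, the 1-based hop at which it
    was dropped. A packet dropped at hop k was already carried over k - 1
    links, and all of that work is thrown away.
    """
    if any(k < 1 for k in drop_hops):
        raise ValueError("hop numbers start at 1")
    return sum(k - 1 for k in drop_hops)


if __name__ == "__main__":
    # Three packets dropped at the first, second and third router of a path.
    print(wasted_link_transmissions([1, 2, 3]))
```

The later on the path a packet is dropped, the more upstream capacity it has consumed for nothing.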

                                                                                                                                                                                    Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
    • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
    • NI bit: no increase in rate (mild congestion)
    • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
    • small exception: see next slide

Case study: ATM ABR congestion control (cont)

Two-byte ER (explicit rate) field in RM cell:
• congested switch may lower ER value in cell
• sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
• signals congestion
• if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
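The RM-cell handling described above can be sketched as follows. This is an illustrative model only: the class names, the `congestion` labels and the `supportable_rate` attribute are hypothetical, and real ATM switches follow the ATM Forum specification rather than this simplified logic.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate carried in the cell
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit (severe congestion)


@dataclass
class Switch:
    supportable_rate: float    # rate this switch can sustain for the flow
    congestion: str = "none"   # hypothetical labels: "none", "mild", "severe"


def forward_rm_cell(cell: RMCell, path: list) -> RMCell:
    """Pass an RM cell through every switch on the path."""
    for switch in path:
        # A switch may only lower ER, so ER ends up as the path minimum.
        cell.er = min(cell.er, switch.supportable_rate)
        if switch.congestion == "mild":
            cell.ni = True
        elif switch.congestion == "severe":
            cell.ci = True
    return cell


def return_rm_cell(cell: RMCell, preceding_data_cell_efci: bool) -> RMCell:
    """Destination returns the RM cell, setting CI if the data cell that
    preceded it arrived with EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


if __name__ == "__main__":
    path = [Switch(80.0), Switch(50.0, "mild"), Switch(70.0)]
    cell = forward_rm_cell(RMCell(er=100.0), path)
    print(return_rm_cell(cell, preceding_data_cell_efci=False))
```

In this run the sender learns an explicit rate of 50.0 (the minimum on the path) and sees NI set by the mildly congested switch.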

TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

• w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
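As a quick worked example of the throughput relation (w segments of MSS bytes delivered per RTT), here is a minimal sketch. The function name and the sample values are illustrative assumptions.

```python
def tcp_throughput(w: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    if rtt_seconds <= 0:
        raise ValueError("RTT must be positive")
    return w * mss_bytes / rtt_seconds


if __name__ == "__main__":
    # 8 segments of 1000 bytes every 250 ms.
    print(tcp_throughput(w=8, mss_bytes=1000, rtt_seconds=0.25))  # 32000.0
```

Doubling the window or halving the RTT doubles the throughput, which is why the window size is the knob TCP uses to control its sending rate.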

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events
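The sender-side window check and the duplicate-ACK loss signal can be sketched as below. This is a simplified illustration, not an implementation of a real TCP stack: the function names are hypothetical, timeouts are not modelled, and ACKs are represented as a plain list of cumulative ACK numbers.

```python
from typing import Optional


def can_send(last_byte_sent: int, last_byte_acked: int,
             cong_win: int, nbytes: int) -> bool:
    """True if sending `nbytes` more keeps unacknowledged data within CongWin,
    i.e. LastByteSent - LastByteAcked <= CongWin still holds afterwards."""
    return (last_byte_sent + nbytes) - last_byte_acked <= cong_win


def third_duplicate_ack_index(acks: list) -> Optional[int]:
    """Index in `acks` at which the third duplicate ACK arrives, else None.

    The first ACK for a given number is the original; each repeat after it
    counts as a duplicate.
    """
    last_ack, duplicates = None, 0
    for i, ack in enumerate(acks):
        if ack == last_ack:
            duplicates += 1
            if duplicates == 3:
                return i  # loss event inferred here
        else:
            last_ack, duplicates = ack, 0
    return None


if __name__ == "__main__":
    print(can_send(5000, 3000, cong_win=4000, nbytes=1000))
    print(third_duplicate_ack_index([100, 200, 200, 200, 200]))
```

Here 1000 more bytes fit within the window, and the loss event is signalled by the fourth ACK carrying number 200 (the original plus three duplicates).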

TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
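The AIMD rule can be traced with a short simulation. This is a sketch under a simplifying assumption that is not in the slides: a loss event happens whenever the window reaches a fixed size `loss_at_mss`. The function name and parameters are made up for illustration, and the window is counted in whole MSS units.

```python
def aimd_trace(start_mss: int, loss_at_mss: int, rounds: int) -> list[int]:
    """Window size (in MSS) at the start of each RTT round.

    Assumption: a loss event occurs whenever the window reaches loss_at_mss.
    """
    trace = []
    w = start_mss
    for _ in range(rounds):
        trace.append(w)
        if w >= loss_at_mss:
            w = max(1, w // 2)  # multiplicative decrease: cut the window in half
        else:
            w += 1              # additive increase: 1 MSS per RTT
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rounds=10))
```

The printed sequence climbs by one each round and drops to half at each loss, which is the sawtooth shape of a long-lived connection.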


                                                                                                                                                                                    TCP Slow Start

When connection begins, CongWin = 1 MSS

Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps

available bandwidth may be >> MSS/RTT

desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
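The 20 kbps figure follows from sending one MSS per RTT. The sketch below reproduces that arithmetic and counts how many doublings would be needed to reach a target rate. The 1 Mbps target and the function names are assumptions chosen for the example.

```python
def initial_rate_bps(mss_bytes: int, rtt_ms: int) -> float:
    """Rate in bits/sec when exactly one MSS is sent per RTT."""
    return mss_bytes * 8 * 1000 / rtt_ms


def rtts_to_reach(target_bps: float, start_bps: float) -> int:
    """Doublings (one per RTT, assuming no loss) until the rate reaches target_bps."""
    rate, rtts = start_bps, 0
    while rate < target_bps:
        rate *= 2
        rtts += 1
    return rtts


start = initial_rate_bps(mss_bytes=500, rtt_ms=200)
print(start)                            # 500 bytes * 8 bits / 0.2 s
print(rtts_to_reach(1_000_000, start))  # hypothetical target of 1 Mbps
```

Growing by one MSS per RTT from the same starting point would need far more round trips, which is why an exponential ramp-up is desirable at the start.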


                                                                                                                                                                                    TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:

double CongWin every RTT
done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
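The doubling comes from the per-ACK increment: every segment in a window is acknowledged, and every ACK adds one MSS. The sketch below assumes one ACK per segment (no delayed ACKs) and no loss. The function name is hypothetical.

```python
def slow_start_rounds(mss: int, rounds: int) -> list[int]:
    """Number of segments sent in each RTT round during slow start."""
    cong_win = mss  # connection begins with CongWin = 1 MSS
    sizes = []
    for _ in range(rounds):
        segments = cong_win // mss
        sizes.append(segments)
        # Assumption: each segment sent in this round is ACKed exactly once.
        for _ in range(segments):
            cong_win += mss  # increment CongWin for every ACK received
    return sizes


print(slow_start_rounds(mss=500, rounds=4))
```

The output matches the one, two, four segment pattern of the timeline and continues with eight in the fourth round.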


So far:
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno):
Introduce new variable threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


Reason for treating 3 dup ACKs differently than timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol TCP Tahoe treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event

TCP Reno's skipping of the slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)
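The four rules above can be traced round by round to compare Tahoe and Reno. The sketch below counts the window in whole MSS units and makes two assumptions that are not in the slides: slow start is capped at Threshold rather than overshooting it, and a single loss event is injected at a chosen round. The function name and parameters are made up for illustration.

```python
def window_trace(variant: str, threshold: int, loss_round: int,
                 loss_kind: str, rounds: int) -> list[int]:
    """CongWin (in MSS) at each transmission round, with one loss event.

    variant:   "reno" or "tahoe"
    loss_kind: "dupack" (triple duplicate ACK) or "timeout"
    """
    w = 1
    trace = []
    for r in range(1, rounds + 1):
        trace.append(w)
        if r == loss_round:
            threshold = max(w // 2, 1)      # Threshold set to CongWin/2
            if loss_kind == "timeout" or variant == "tahoe":
                w = 1                       # back to slow start
            else:
                w = threshold               # Reno after 3 dup ACKs
        elif w < threshold:
            w = min(2 * w, threshold)       # slow start (capped: an assumption)
        else:
            w += 1                          # congestion avoidance
    return trace


print(window_trace("reno", threshold=8, loss_round=8, loss_kind="dupack", rounds=12))
print(window_trace("tahoe", threshold=8, loss_round=8, loss_kind="dupack", rounds=12))
```

Both traces are identical up to the loss. Afterwards Reno resumes linear growth from half the window, while Tahoe restarts from 1 MSS and slow-starts up to the new Threshold.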


                                                                                                                                                                                    The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS · (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
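The event/action rules listed above map naturally onto a small event-driven class. This is a toy model with sizes in bytes, not a real TCP stack. The class and method names are hypothetical, and resetting the duplicate-ACK counter on a new ACK or a timeout is an assumption the slides do not spell out.

```python
class RenoSender:
    """Toy model of the sender rules: states "SS" and "CA", sizes in bytes."""

    def __init__(self, mss: int, threshold: float):
        self.mss = mss
        self.threshold = threshold
        self.cong_win = float(mss)  # slow start begins at 1 MSS
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: a new ACK resets the duplicate count
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Adds up to roughly 1 MSS per RTT across a window's worth of ACKs.
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self) -> None:
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self) -> None:
        """Timeout: halve Threshold and re-enter slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = float(self.mss)
        self.state = "SS"
        self.dup_acks = 0  # assumption


sender = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```

With these hypothetical numbers the fourth ACK pushes CongWin past Threshold, so the sender leaves slow start and switches to congestion avoidance.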


                                                                                                                                                                                    TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2·RTT)
Average throughput: 0.75 W/RTT
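The 0.75 factor is the midpoint of the two extremes. The short derivation below assumes that the window grows linearly from W/2 back to W between consecutive loss events, so that its time average is the mean of the endpoints.

```latex
% Assumption: between two loss events the window grows linearly from W/2 to W,
% so the time-averaged throughput is the midpoint of the two extremes.
\[
\overline{\text{throughput}}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\,\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
\]
```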


                                                                                                                                                                                    TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

→ $L = 2 \cdot 10^{-10}$ Wow!
New versions of TCP for high-speed needed
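Both numbers in this example can be checked directly: the window from W = throughput · RTT / MSS, and the loss rate by solving the throughput formula 1.22 · MSS / (RTT · sqrt(L)) for L. The function names below are hypothetical.

```python
def required_window_segments(rate_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """In-flight segments needed to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve rate = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


w = required_window_segments(rate_bps=10e9, rtt_s=0.1, mss_bytes=1500)
loss = required_loss_rate(rate_bps=10e9, rtt_s=0.1, mss_bytes=1500)
print(int(w))          # whole in-flight segments
print(f"{loss:.1e}")   # tolerable loss rate
```

The resulting loss rate is on the order of two losses per ten billion segments, which is why the slide calls for new TCP versions at these speeds.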


TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running up to R, with the equal bandwidth share line; the trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by factor of 2)]
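The convergence argument can be illustrated numerically. The sketch below is an idealized model under assumptions that go beyond the slides: both sessions have the same RTT, both increase by one unit per round, and both see a loss in the same round whenever their combined throughput exceeds the capacity. The function name and the starting values are made up.

```python
def compete(x1: float, x2: float, capacity: float, rounds: int) -> tuple[float, float]:
    """Throughputs of two AIMD sessions after a number of rounds.

    Assumption: synchronized losses whenever x1 + x2 exceeds capacity.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2  # multiplicative decrease halves the gap
        else:
            x1, x2 = x1 + 1, x2 + 1  # additive increase keeps the gap unchanged
    return x1, x2


a, b = compete(x1=10.0, x2=70.0, capacity=100.0, rounds=500)
print(abs(a - b))  # far smaller than the initial gap of 60
```

Additive increase moves both sessions along a line of slope 1, so the difference between them stays fixed, while each halving cuts that difference in half. Repeating the two steps drives the pair toward the equal bandwidth share.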


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: do not want rate throttled by congestion control
Instead use UDP: pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
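The parallel-connection example is simple proportional arithmetic. The sketch assumes that every TCP connection on the link gets an equal share of R, which is the fairness goal rather than a guarantee. The function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections: int, opened_by_app: int) -> Fraction:
    """Fraction of link rate R the new app receives.

    Assumption: all connections share the link equally.
    """
    return Fraction(opened_by_app, existing_connections + opened_by_app)


print(app_share(existing_connections=9, opened_by_app=1))   # one of 10 connections
print(app_share(existing_connections=9, opened_by_app=11))  # 11 of 20 connections
```

Fairness is per connection, not per application, so an app that opens more connections claims a larger share of the link.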


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
where K is the number of windows that cover the object


                                                                                                                                                                                    Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                                                                                                                                    Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
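Both fixed-window cases fit in one function: the bracketed term is the time the server stalls after each window, and it only contributes when it is positive. The sketch takes K as the number of windows of W segments needed to cover the object. The function name, parameter names and sample values are assumptions.

```python
def fixed_window_latency(obj_bits: int, seg_bits: int, rate_bps: float,
                         rtt_s: float, window_segs: int) -> float:
    """Latency with a fixed congestion window of W segments, no loss."""
    seg_time = seg_bits / rate_bps                 # S/R
    window_time = window_segs * seg_time           # WS/R
    base = 2 * rtt_s + obj_bits / rate_bps         # 2RTT + O/R
    stall = seg_time + rtt_s - window_time         # S/R + RTT - WS/R
    if stall <= 0:
        return base                                # first case: server never idles
    k = -(-obj_bits // (window_segs * seg_bits))   # ceiling division: windows covering O
    return base + (k - 1) * stall                  # second case


# Hypothetical numbers: O = 80,000 bits, S = 8,000 bits, R = 1 Mbps, RTT = 100 ms.
print(fixed_window_latency(80_000, 8_000, 1_000_000, 0.1, window_segs=2))
print(fixed_window_latency(80_000, 8_000, 1_000_000, 0.1, window_segs=20))
```

With W = 2 the server idles after each of the first K-1 = 4 windows. With W = 20 the first ACK returns before the window is exhausted and the stall term vanishes.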


                                                                                                                                                                                    TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server: $P = \min\{Q, K-1\}$

- where Q is the number of times the server would idle if the object were of infinite size

- and K is the number of windows that cover the object


                                                                                                                                                                                    TCP Latency Modeling Slow Start (2)

[Figure: slow-start timing diagram with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times
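The example values can be worked through by hand. This is a sketch that assumes the closed-form latency expression of the slow-start model, Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R, and takes O/S = 15 and Q = 2 as given in the example:

```latex
\begin{aligned}
2^0 + 2^1 + 2^2 &= 7 < 15, \qquad 2^0 + 2^1 + 2^2 + 2^3 = 15 \ge 15
  \;\Rightarrow\; K = 4 \\
P &= \min\{K-1,\; Q\} = \min\{3,\; 2\} = 2 \\
\text{Latency} &= 2\,RTT + 15\,\frac{S}{R} + 2\left[RTT + \frac{S}{R}\right] - (2^{2} - 1)\frac{S}{R}
  = 4\,RTT + 14\,\frac{S}{R}
\end{aligned}
```

Of the total, 2RTT + 15 S/R is connection setup plus transmission, and the remaining 2RTT - S/R is the time the server spends idle.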

                                                                                                                                                                                    TCP Latency Modeling (3)

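$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$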
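The step from the sum of per-window idle times to the closed form can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names (`idle_times`, `delay_from_sum`, `delay_closed_form`) are hypothetical, K is supplied by the caller, and P is taken as the number of windows before the last one that leave the server idle.

```python
def idle_times(S, R, rtt, num_windows):
    """Idle time after each of the first num_windows windows.

    Window k carries 2**(k-1) segments; the server idles for
    [S/R + RTT - 2**(k-1) * S/R]^+ after sending it.
    """
    seg_time = S / R
    return [max(0.0, seg_time + rtt - 2 ** (k - 1) * seg_time)
            for k in range(1, num_windows + 1)]


def delay_from_sum(O, S, R, rtt, K):
    """delay = O/R + 2 RTT + sum of idle times (none after the last window)."""
    return O / R + 2 * rtt + sum(idle_times(S, R, rtt, K - 1))


def delay_closed_form(O, S, R, rtt, K):
    """Same delay via P[RTT + S/R] - (2**P - 1) S/R."""
    # Idle times shrink as k grows, so the positive ones form a prefix;
    # their count is P = min(Q, K - 1).
    P = sum(1 for t in idle_times(S, R, rtt, K - 1) if t > 0)
    seg_time = S / R
    return O / R + 2 * rtt + P * (rtt + seg_time) - (2 ** P - 1) * seg_time


if __name__ == "__main__":
    # Illustrative values only: 15 segments of 1000 bits, 10 kbps, RTT 0.2 s.
    args = dict(O=15_000, S=1_000, R=10_000, rtt=0.2, K=4)
    print(delay_from_sum(**args), delay_closed_form(**args))
```

The two functions agree because the geometric series 2^0 + 2^1 + ... + 2^(P-1) sums to 2^P - 1.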

[Figure: the same client/server slow-start timing diagram, with windows of S/R, 2S/R, 4S/R and 8S/R; axes: time at client, time at server.]

TCP Latency Modeling (4)

Recall K = number of windows that cover the object

How do we calculate K?

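$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$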

Calculation of Q, the number of idles for an infinite-size object, is similar
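The closed form for K can be compared against a direct search over window counts. This is a minimal sketch; the function names are hypothetical, and the floating-point logarithm is only trusted here for modest values of O/S.

```python
import math


def windows_needed(O, S):
    """K = ceil(log2(O/S + 1)), the closed form for K."""
    # Note: relies on floating-point log2, so very large O/S may misround.
    return math.ceil(math.log2(O / S + 1))


def windows_needed_by_search(O, S):
    """Smallest k with 2**0 * S + 2**1 * S + ... + 2**(k-1) * S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S  # window k+1 carries 2**k segments
        k += 1
    return k


if __name__ == "__main__":
    # Illustrative sizes in bits: 15 segments of 1000 bits each.
    print(windows_needed(15_000, 1_000), windows_needed_by_search(15_000, 1_000))
```

The search version mirrors the first line of the derivation, while the logarithm version mirrors the last.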

HTTP Modeling

Assume Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X is an integer
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
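The three response-time expressions can be compared side by side. The sketch below is an illustration, not the course's own code: the function name and the `idle` argument are assumptions, and the slow-start idle time is treated as a caller-supplied number per scheme (defaulting to zero), so the results are lower bounds rather than the full model.

```python
def http_response_times(O, R, rtt, M, X, idle=(0.0, 0.0, 0.0)):
    """Response times (seconds) for the three HTTP variants.

    O    : size of each object in bits (base page and every image)
    R    : link rate in bits per second
    rtt  : round-trip time in seconds
    M    : number of images
    X    : number of parallel connections (M/X must be an integer)
    idle : assumed sum of slow-start idle times for
           (non-persistent, persistent, parallel), in seconds
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt + idle[0],
        "persistent": transmit + 3 * rtt + idle[1],
        "parallel non-persistent": transmit + (M // X + 1) * 2 * rtt + idle[2],
    }


if __name__ == "__main__":
    # Illustrative values: 5 Kbytes taken as 40,000 bits, 1 Mbps, RTT 100 msec.
    for name, t in http_response_times(40_000, 1_000_000, 0.1, M=10, X=5).items():
        print(f"{name}: {t:.2f} s (idle time ignored)")
```

With idle time left at zero, only the RTT terms differ between the schemes, which makes the (M+1)2RTT, 3RTT and (M/X + 1)2RTT contributions easy to compare.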

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.

Chapter 3 Summary

principles behind transport layer services:
- multiplexing, demultiplexing
- reliable data transfer
- flow control
- congestion control

instantiation and implementation in the Internet:
- UDP
- TCP

Next:
- leaving the network "edge" (application, transport layers)
- into the network "core"

                                                                                                                                                                                    • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • Transport services and protocols
                                                                                                                                                                                    • Transport vs network layer
                                                                                                                                                                                    • Transport-layer protocols
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • Multiplexingdemultiplexing
                                                                                                                                                                                    • Multiplexingdemultiplexing
                                                                                                                                                                                    • How demultiplexing works
                                                                                                                                                                                    • Connectionless demultiplexing
                                                                                                                                                                                    • Connectionless demux (cont)
                                                                                                                                                                                    • Connection-oriented demux
                                                                                                                                                                                    • Connection-oriented demux (cont)
                                                                                                                                                                                    • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                                    • UDP more
                                                                                                                                                                                    • UDP checksum
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • Principles of Reliable data transfer
                                                                                                                                                                                    • Reliable data transfer getting started
                                                                                                                                                                                    • Reliable data transfer getting started
                                                                                                                                                                                    • Incremental Improvements
                                                                                                                                                                                     • rdt1.0: reliable transfer over a reliable channel
                                                                                                                                                                                     • rdt2.0: channel with bit errors
                                                                                                                                                                                     • rdt2.0: FSM specification
                                                                                                                                                                                     • rdt2.0: operation with no errors
                                                                                                                                                                                     • rdt2.0: error scenario
                                                                                                                                                                                     • rdt2.0 has a fatal flaw
                                                                                                                                                                                     • rdt2.1: sender handles garbled ACK/NAKs
                                                                                                                                                                                     • rdt2.1: receiver handles garbled ACK/NAKs
                                                                                                                                                                                     • rdt2.1: discussion
                                                                                                                                                                                     • rdt2.2: a NAK-free protocol
                                                                                                                                                                                     • rdt2.2: sender, receiver fragments
                                                                                                                                                                                     • rdt3.0: channels with errors and loss
                                                                                                                                                                                     • rdt3.0 sender
                                                                                                                                                                                     • rdt3.0 in action
                                                                                                                                                                                     • rdt3.0 in action
                                                                                                                                                                                     • Performance of rdt3.0
                                                                                                                                                                                     • rdt3.0: stop-and-wait operation
                                                                                                                                                                                    • Pipelined protocols
                                                                                                                                                                                    • Pipelined protocols
                                                                                                                                                                                    • Pipelining increased utilization
                                                                                                                                                                                    • Go-Back-N
                                                                                                                                                                                    • GBN Sender
                                                                                                                                                                                    • GBN sender extended FSM
                                                                                                                                                                                    • GBN receiver extended FSM
                                                                                                                                                                                    • More on receiver
                                                                                                                                                                                     • GBN in action
                                                                                                                                                                                    • Selective Repeat
                                                                                                                                                                                    • Selective repeat sender receiver windows
                                                                                                                                                                                    • Selective repeat
                                                                                                                                                                                    • Selective repeat in action
                                                                                                                                                                                    • Selective repeat dilemma
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                     • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                    • More TCP Details
                                                                                                                                                                                    • Even More TCP Details
                                                                                                                                                                                    • TCP segment structure
                                                                                                                                                                                     • TCP seq. #'s and ACKs
                                                                                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                                                                                    • Example RTT estimation
                                                                                                                                                                                    • TCP Round Trip Time and Timeout
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • TCP reliable data transfer
                                                                                                                                                                                    • TCP sender events
                                                                                                                                                                                    • TCP sender(simplified)
                                                                                                                                                                                    • TCP retransmission scenarios
                                                                                                                                                                                    • TCP retransmission scenarios (more)
                                                                                                                                                                                    • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                    • More on Sender Policies
                                                                                                                                                                                    • Fast Retransmit
                                                                                                                                                                                    • Fast retransmit algorithm
                                                                                                                                                                                    • TCP GBN or Selective Repeat
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • TCP Flow Control
                                                                                                                                                                                    • TCP Flow Control
                                                                                                                                                                                    • TCP segment structure
                                                                                                                                                                                    • TCP Flow control how it works
                                                                                                                                                                                    • Technical Issue
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • TCP Connection Management
                                                                                                                                                                                    • TCP Connection Management (cont)
                                                                                                                                                                                    • TCP Connection Management (cont)
                                                                                                                                                                                    • TCP Connection Management (cont)
                                                                                                                                                                                    • TCP Connection Management (cont)
                                                                                                                                                                                    • A few special cases
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • Principles of Congestion Control
                                                                                                                                                                                     • Causes/costs of congestion: scenario 1
                                                                                                                                                                                     • Causes/costs of congestion: scenario 2
                                                                                                                                                                                     • Causes/costs of congestion: scenario 3
                                                                                                                                                                                     • Causes/costs of congestion: scenario 3
                                                                                                                                                                                    • Approaches towards congestion control
                                                                                                                                                                                    • Case study ATM ABR congestion control
                                                                                                                                                                                    • Case study ATM ABR congestion control
                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                    • TCP Congestion Control
                                                                                                                                                                                    • TCP AIMD
                                                                                                                                                                                    • TCP Slow Start
                                                                                                                                                                                    • TCP Slow Start (more)
                                                                                                                                                                                    • Summary TCP Congestion Control
                                                                                                                                                                                    • The Big Picture
                                                                                                                                                                                    • TCP sender congestion control
                                                                                                                                                                                    • TCP throughput
                                                                                                                                                                                    • TCP Futures
                                                                                                                                                                                    • TCP Fairness
                                                                                                                                                                                    • Why is TCP fair
                                                                                                                                                                                    • Fairness (more)
                                                                                                                                                                                    • TCP Latency Modeling
                                                                                                                                                                                    • Fixed Congestion Window (W)
                                                                                                                                                                                    • Fixed congestion window (1)
                                                                                                                                                                                    • Fixed congestion window (2)
                                                                                                                                                                                    • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                    • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                    • TCP Latency Modeling (3)
                                                                                                                                                                                    • TCP Latency Modeling (4)
                                                                                                                                                                                    • HTTP Modeling
                                                                                                                                                                                    • Chapter 3 Summary

                                                                                                                                                                                      A few special cases

                                                                                                                                                                                       We have not yet discussed what happens if both client and server decide to close the connection at the same time.

                                                                                                                                                                                       It is possible for the first ACK (from the server) and the second FIN (also from the server) to be sent in the same segment.

                                                                                                                                                                                      Chapter 3 outline

                                                                                                                                                                                       3.1 Transport-layer services
                                                                                                                                                                                       3.2 Multiplexing and demultiplexing
                                                                                                                                                                                       3.3 Connectionless transport: UDP
                                                                                                                                                                                       3.4 Principles of reliable data transfer
                                                                                                                                                                                       3.5 Connection-oriented transport: TCP
                                                                                                                                                                                         segment structure
                                                                                                                                                                                         reliable data transfer
                                                                                                                                                                                         flow control
                                                                                                                                                                                         connection management
                                                                                                                                                                                       3.6 Principles of congestion control
                                                                                                                                                                                       3.7 TCP congestion control

                                                                                                                                                                                      Principles of Congestion Control

                                                                                                                                                                                       Congestion:
                                                                                                                                                                                         informally: "too many sources sending too much data too fast for network to handle"
                                                                                                                                                                                         different from flow control!
                                                                                                                                                                                         manifestations:
                                                                                                                                                                                           lost packets (buffer overflow at routers)
                                                                                                                                                                                           long delays (queuing in router buffers)
                                                                                                                                                                                         a top-10 problem!

                                                                                                                                                                                       Causes/costs of congestion: scenario 1

                                                                                                                                                                                       two senders, two receivers
                                                                                                                                                                                       one router, infinite buffers
                                                                                                                                                                                       no retransmission
                                                                                                                                                                                       send rate: 0 to C/2

                                                                                                                                                                                       large delays when congested
                                                                                                                                                                                       maximum achievable throughput
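The behaviour in scenario 1 can be illustrated numerically. The sketch below is not from the slides: it assumes two identical senders sharing a link of capacity C, so each can get at most C/2, and it borrows the M/M/1 queueing formula purely as an assumed stand-in for how delay grows near capacity. The function names are hypothetical.

```python
import math

def per_sender_throughput(lam_in: float, capacity: float) -> float:
    """Delivered rate for one of two senders sharing a link of rate `capacity`."""
    return min(lam_in, capacity / 2)

def assumed_mm1_delay(lam_in: float, capacity: float) -> float:
    """Average delay under an ASSUMED M/M/1 model (an illustration, not the slides' model).

    Both senders offer lam_in, so the aggregate arrival rate is 2 * lam_in.
    """
    total = 2 * lam_in
    if total >= capacity:
        return math.inf  # infinite buffers: the queue grows without bound
    return 1.0 / (capacity - total)

if __name__ == "__main__":
    C = 1.0  # link capacity in packets per unit time (arbitrary units)
    for lam in (0.10, 0.25, 0.40, 0.49, 0.50):
        print(lam, per_sender_throughput(lam, C), assumed_mm1_delay(lam, C))
```

Throughput is capped at C/2 no matter how fast the senders transmit, while the modelled delay grows without bound as the send rate approaches C/2.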

                                                                                                                                                                                       Causes/costs of congestion: scenario 2

                                                                                                                                                                                       one router, finite buffers
                                                                                                                                                                                       sender retransmission of lost packet

                                                                                                                                                                                       (λin: original data; λ'in: original data plus retransmitted data)

                                                                                                                                                                                       (a), (b) & (c): always λin = λout (goodput)
                                                                                                                                                                                       (a) "magic" transmission: only send when there's space in buffer
                                                                                                                                                                                       (b) "perfect" retransmission, only when loss: λ'in > λout
                                                                                                                                                                                       (c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

                                                                                                                                                                                       "costs" of congestion:
                                                                                                                                                                                         (b) and (c): more work (retransmissions) for given "goodput"
                                                                                                                                                                                         (c): unneeded retransmissions: link carries multiple copies of pkt

                                                                                                                                                                                       [Figure: panels (a), (b), (c)]
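The cost of retransmissions can be made concrete with a little arithmetic. The following is a minimal sketch under stated assumptions, not a model taken from the slides: the offered load (original data plus retransmissions, λ'in) is carried up to the sender's share C/2 of the link, and `copies_per_packet` is a hypothetical parameter giving the average number of times each original packet crosses the link.

```python
def goodput(offered_load: float, capacity: float,
            copies_per_packet: float = 1.0) -> float:
    """Goodput (useful delivered rate) for one sender, given its offered load.

    copies_per_packet is a HYPOTHETICAL knob: 1.0 means no retransmissions,
    2.0 means every packet is sent twice on average.
    """
    if copies_per_packet < 1.0:
        raise ValueError("each packet is transmitted at least once")
    carried = min(offered_load, capacity / 2)  # link carries at most this sender's share
    return carried / copies_per_packet

if __name__ == "__main__":
    C = 1.0
    print(goodput(0.5, C))        # no retransmissions
    print(goodput(0.5, C, 2.0))   # every packet forwarded twice
```

With the link fully used, doubling the number of copies per packet halves the goodput: the same link capacity is spent, but half of it carries duplicates.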

                                                                                                                                                                                       Causes/costs of congestion: scenario 3

                                                                                                                                                                                       four senders
                                                                                                                                                                                       multihop paths
                                                                                                                                                                                       timeout/retransmit

                                                                                                                                                                                       Q: what happens as λin and λ'in increase?

                                                                                                                                                                                       Causes/costs of congestion: scenario 3

                                                                                                                                                                                       Another "cost" of congestion:
                                                                                                                                                                                         when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted!

                                                                                                                                                                                      Approaches towards congestion control

                                                                                                                                                                                       Two broad approaches towards congestion control:

                                                                                                                                                                                       End-end congestion control:
                                                                                                                                                                                         no explicit feedback from network
                                                                                                                                                                                         congestion inferred from end-system observed loss, delay
                                                                                                                                                                                         approach taken by TCP

                                                                                                                                                                                       Network-assisted congestion control:
                                                                                                                                                                                         routers provide feedback to end systems
                                                                                                                                                                                           single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
                                                                                                                                                                                           explicit rate sender should send at

                                                                                                                                                                                       Case study: ATM ABR congestion control

                                                                                                                                                                                       ABR: available bit rate
                                                                                                                                                                                         "elastic service"
                                                                                                                                                                                         if sender's path "underloaded": sender should use available bandwidth
                                                                                                                                                                                         if sender's path congested: sender throttled to minimum guaranteed rate

                                                                                                                                                                                       RM (resource management) cells:
                                                                                                                                                                                         sent by sender, interspersed with data cells
                                                                                                                                                                                         bits in RM cell set by switches ("network-assisted")
                                                                                                                                                                                           NI bit: no increase in rate (mild congestion)
                                                                                                                                                                                           CI bit: severe congestion indicator
                                                                                                                                                                                         RM cells returned to sender by receiver, with bits intact (small exception: see next page)

Case study: ATM ABR congestion control

Two-byte ER (explicit rate) field in RM cell
• congested switch may lower ER value in cell
• sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by a congested switch
• signals congestion
• if the data cell preceding the RM cell has EFCI = 1, the destination sets CI bit = 1 before returning the RM cell to the source
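
The ER and EFCI rules above can be illustrated with a minimal sketch. The `RMCell` class, the function names and the list of per-switch supportable rates are hypothetical, chosen only for illustration; they are not part of any ATM library or of the ATM specification's data layout.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical model of the RM cell fields named on the slides."""
    er: float          # explicit rate field
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit (severe congestion)


def forward_through_switches(cell, supportable_rates):
    """Each switch may lower ER, so ER ends as the path minimum."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI if the data cell before the RM cell had EFCI = 1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


cell = forward_through_switches(RMCell(er=100.0), [80.0, 120.0, 50.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
```

In this sketch the sender would read `cell.er` from the returned cell and treat it as an upper bound on its send rate.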

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    • segment structure
    • reliable data transfer
    • flow control
    • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
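
As a quick numeric check of the throughput formula, here is a small sketch. The function name and the sample values are illustrative assumptions, not figures from the slides.

```python
def window_throughput(w, mss_bytes, rtt_s):
    """Bytes per second when w segments of mss_bytes are sent per RTT."""
    return w * mss_bytes / rtt_s


# Assumed example: 10 segments of 1000 bytes per 0.5 s round trip.
rate = window_throughput(10, 1000, 0.5)
```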

To simplify the presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: the sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does the sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after a loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative behavior after timeout events
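
The sending limit LastByteSent - LastByteAcked ≤ CongWin can be sketched as a small helper. The function name is hypothetical, and the sketch keeps the slide's simplifying assumption that the receive buffer never constrains the sender.

```python
def bytes_sendable(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes may be sent without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked  # unacknowledged bytes
    return max(0, cong_win - in_flight)
```

For example, with 2000 bytes in flight and a 4000-byte CongWin, another 2000 bytes may be sent; once in-flight data reaches CongWin the sender must wait for ACKs.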

TCP AIMD

• multiplicative decrease: cut CongWin in half after a loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
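
The AIMD behavior can be reproduced with a toy per-RTT simulation. This is a sketch under a strong assumption: a loss event is taken to occur whenever the window reaches a fixed size `loss_window`, which is a hypothetical stand-in for the real, unpredictable loss process. Window sizes are in units of MSS.

```python
def aimd_trace(cong_win, loss_window, rtts):
    """Return CongWin (in MSS) at the start of each RTT under AIMD."""
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_window:
            # Multiplicative decrease: halve after the (assumed) loss event.
            cong_win = max(1, cong_win // 2)
        else:
            # Additive increase: one MSS per RTT.
            cong_win += 1
    return trace


print(aimd_trace(6, 8, 8))  # [6, 7, 8, 4, 5, 6, 7, 8]
```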

TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes and RTT = 200 msec
• initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event
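
The 20 kbps figure in the example follows from sending one MSS per RTT; the arithmetic, worked out:

```latex
\text{initial rate} = \frac{MSS}{RTT}
  = \frac{500 \text{ bytes} \times 8 \text{ bits/byte}}{0.2 \text{ s}}
  = \frac{4000 \text{ bits}}{0.2 \text{ s}}
  = 20{,}000 \text{ bits/s} = 20 \text{ kbps}
```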

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
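
Why a per-ACK increment doubles the window each RTT can be seen in a short sketch. It assumes an idealized network in which every segment sent in one RTT is acknowledged within that RTT and nothing is lost; the function name is illustrative. Window sizes are in units of MSS.

```python
def slow_start_windows(rtts, cong_win=1):
    """Return CongWin (in MSS) at the start of each RTT during slow start."""
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks = cong_win          # one ACK per segment sent this RTT
        for _ in range(acks):
            cong_win += 1        # increment CongWin by 1 MSS per ACK
    return windows


print(slow_start_windows(4))  # [1, 2, 4, 8]
```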

So far
• Slow start ramps up exponentially
• followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable: threshold
• threshold initially very large
• Slow-start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to slow start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
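
The difference between Tahoe and Reno after a triple duplicate ACK can be made concrete with a per-RTT sketch. It rests on several simplifying assumptions that are not from the slides: window sizes are whole MSS units, slow-start doubling is capped at Threshold, and a single triple-duplicate-ACK event occurs at a chosen RTT index (`loss_rtt`, a hypothetical parameter).

```python
def window_trace(variant, rtts, loss_rtt, threshold=8, cong_win=1):
    """Per-RTT CongWin (in MSS); one triple duplicate ACK at loss_rtt."""
    trace = []
    for rtt in range(rtts):
        trace.append(cong_win)
        if rtt == loss_rtt:
            threshold = max(cong_win // 2, 1)
            # Reno continues from Threshold; Tahoe restarts slow start.
            cong_win = threshold if variant == "reno" else 1
        elif cong_win < threshold:
            cong_win = min(cong_win * 2, threshold)  # slow start
        else:
            cong_win += 1                            # congestion avoidance
    return trace


print(window_trace("reno", 10, 6))   # [1, 2, 4, 8, 9, 10, 11, 5, 6, 7]
print(window_trace("tahoe", 10, 6))  # [1, 2, 4, 8, 9, 10, 11, 1, 2, 4]
```

Both traces are identical up to the loss; afterwards Reno resumes linear growth from half the old window, while Tahoe climbs back from 1 MSS.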

                                                                                                                                                                                      The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
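
The event/state/action table maps naturally onto a small state machine. The sketch below is illustrative only: the class and method names are hypothetical, window sizes are in bytes, duplicate-ACK counting is left to the caller, and it is not a model of any real TCP stack.

```python
class RenoSender:
    """Toy model of the sender actions in the table (sizes in bytes)."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.cong_win = mss          # connection starts with 1 MSS
        self.threshold = threshold
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # About 1 MSS of growth per RTT, spread over the ACKs of a window.
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_triple_dup_ack(self):
        """Loss detected by triple duplicate ACK: multiplicative decrease."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, self.mss)  # never below 1 MSS
        self.state = "CA"

    def on_timeout(self):
        """Timeout: fall back to slow start from 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
```

A short usage example: with `RenoSender(mss=1000, threshold=4000)`, four new ACKs take CongWin from 1000 to 5000 bytes and switch the state to CA; a triple duplicate ACK then sets both Threshold and CongWin to 2500 bytes.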

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2·RTT)
• Average throughput: 0.75 W/RTT
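
The 0.75 factor can be justified in one step, assuming the window grows linearly from W/2 back to W between consecutive loss events (the idealized sawtooth), so that the average window is the midpoint of the two extremes:

```latex
\overline{W} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W
\qquad\Longrightarrow\qquad
\text{average throughput} = \frac{\overline{W}}{RTT} = 0.75\,\frac{W}{RTT}
```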

TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This requires $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed
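
The two numbers in this example can be checked with a few lines of arithmetic. The function names are illustrative; the second function simply solves the loss-rate throughput formula for L.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)        # about 83,333 segments
loss = required_loss_rate(10e9, 0.1, 1500)  # about 2.1e-10
```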

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
• additive increase gives slope of 1 as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
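
The convergence argument can be checked numerically. The sketch below assumes an idealized model that is not spelled out on the slides: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units.

```python
def share_gap(x1, x2, capacity, rounds):
    """Return |x1 - x2| after the given number of AIMD rounds."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both sessions halve, which also halves their difference.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow equally, difference unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return abs(x1 - x2)
```

Starting from an unequal split such as 10 and 70 on a link of capacity 100, the gap shrinks by half at every loss event and never grows in between, so the two rates drift toward the equal-share line.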

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets roughly R/2 (11R/20)
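
The parallel-connection example is simple proportional sharing. A sketch, assuming every TCP connection on the link receives an equal share of R (the function name is illustrative):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by the app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```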

TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server, of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


Fixed congestion window (1)

First case: $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent

latency $= 2\,RTT + O/R$


Fixed congestion window (2)

Second case: $WS/R < RTT + S/R$: the server must wait for an ACK after sending a window's worth of data

latency $= 2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$
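Both cases can be folded into one expression by clamping the per-window stall at zero. The sketch below is only an illustration: the function name and the sample numbers are made up, and it assumes K = ⌈O/(WS)⌉ (the number of fixed-size windows needed to cover the object), which these slides do not state explicitly for the fixed-window case.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window of W segments."""
    # Assumption: K is the number of W-segment windows needed to cover the object.
    K = math.ceil(O / (W * S))
    # Stall after each window: zero in case 1, S/R + RTT - WS/R in case 2.
    stall = max(0.0, S / R + RTT - W * S / R)
    return 2 * RTT + O / R + (K - 1) * stall

# Illustrative values: 80 000-bit object, 8 000-bit segments, 1 Mbps link, RTT = 100 ms.
case1 = fixed_window_latency(O=80_000, S=8_000, R=1_000_000, RTT=0.1, W=20)
case2 = fixed_window_latency(O=80_000, S=8_000, R=1_000_000, RTT=0.1, W=4)
print(f"case 1 (large window): {case1:.3f} s")
print(f"case 2 (small window): {case2:.3f} s")
```

With W = 20 the window never drains before the first ACK returns, so only 2 RTT + O/R remains; with W = 4 the server stalls between windows.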


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
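A minimal sketch of this formula follows. The helper names are hypothetical; K and Q are found by directly testing the defining inequalities (window sizes 1, 2, 4, ... segments, and a positive idle time S/R + RTT - 2^(k-1) S/R) rather than by closed forms. The sample values are chosen only so that S/R = 1 s and Q = 2.

```python
def windows_to_cover(O, S):
    """K: smallest k with S * (2^0 + ... + 2^(k-1)) >= O, i.e. (2^k - 1) * S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k

def idles_for_infinite_object(S, R, RTT):
    """Q: how many windows k = 1, 2, ... have a positive idle time after them."""
    q = 0
    # Window k = q + 1 takes 2^q * S/R to transmit.
    while S / R + RTT - (2 ** q) * S / R > 0:
        q += 1
    return q

def slow_start_latency(O, S, R, RTT):
    K = windows_to_cover(O, S)
    Q = idles_for_infinite_object(S, R, RTT)
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R

# Illustrative values (assumed): O/S = 15 segments, S/R = 1 s, RTT = 2 s.
latency = slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2.0)
print(f"latency = {latency:.1f} s")
```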


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes. Events: initiate TCP connection, request object, first window = $S/R$, second window = $2S/R$, third window = $4S/R$, fourth window = $8S/R$, complete transmission, object delivered.]

Example:
- $O/S$ = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection establishment and request
- $O/R$ to transmit object
- time server idles due to slow start

Server idles: P = min{K-1, Q} times
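The example can be replayed window by window instead of through the closed form. This is a sketch under stated assumptions: the function name is hypothetical, and the values S/R = 1 s and RTT = 2 s are not given on the slide; they are picked only because they produce Q = 2 for a 15-segment object.

```python
import math

def slow_start_timeline(O, S, R, RTT):
    """Walk the windows 1, 2, 4, ... and record (segments sent, idle time) per window."""
    remaining = math.ceil(O / S)        # segments still to send
    window, rows = 1, []
    while remaining > 0:
        sent = min(window, remaining)
        remaining -= sent
        # The server idles only if data is left and the first ACK of this window
        # has not yet returned when the window finishes transmitting.
        idle = max(0.0, S / R + RTT - window * S / R) if remaining else 0.0
        rows.append((sent, idle))
        window *= 2
    # 2 RTT for setup and request, O/R to transmit, plus all idle periods.
    total = 2 * RTT + O / R + sum(idle for _, idle in rows)
    return rows, total

rows, total = slow_start_timeline(O=15_000, S=1_000, R=1_000, RTT=2.0)
for i, (sent, idle) in enumerate(rows, start=1):
    print(f"window {i}: {sent} segments, idle {idle:.1f} s")
print(f"latency = {total:.1f} s")
```

The number of rows is K and the number of rows with a positive idle time is P, matching the three delay components listed on the slide.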


                                                                                                                                                                                      TCP Latency Modeling (3)

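$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}
$$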
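The last step of the derivation relies only on the geometric series. Spelled out, and noting that the $[\cdot]^{+}$ clamp can be dropped because every term with $k \le P \le Q$ is nonnegative:

```latex
\begin{aligned}
\sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]
  &= P\left[\frac{S}{R} + RTT\right] - \frac{S}{R}\sum_{k=1}^{P} 2^{k-1} \\
  &= P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R},
  \qquad \text{since } \sum_{k=1}^{P} 2^{k-1} = 2^{P}-1 .
\end{aligned}
```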



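TCP Latency Modeling (4)
Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2(O/S + 1) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.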
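As a quick sanity check, the closed form can be compared against a direct search over the first line of the derivation. The function names are hypothetical, and the argument is the object size expressed in segments (O/S).

```python
import math

def K_by_search(segments):
    """Smallest k such that 2^0 + 2^1 + ... + 2^(k-1) >= O/S."""
    k, covered = 0, 0
    while covered < segments:
        covered += 2 ** k   # window k+1 carries 2^k segments
        k += 1
    return k

def K_closed_form(segments):
    """K = ceil(log2(O/S + 1)); floating-point log2, so intended for modest sizes."""
    return math.ceil(math.log2(segments + 1))

for n in (1, 3, 4, 15, 16, 1000):
    assert K_by_search(n) == K_closed_form(n)
    print(f"O/S = {n:4d} segments -> K = {K_closed_form(n)}")
```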


HTTP Modeling
Assume Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time $= (M+1)\,O/R + (M+1)\,2\,RTT$ + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time $= (M+1)\,O/R + 3\,RTT$ + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X integer
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time $= (M+1)\,O/R + (M/X + 1)\,2\,RTT$ + sum of idle times
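The three formulas can be compared side by side. This is a rough sketch: the function name is hypothetical, the `idle` argument merely stands in for the "sum of idle times" term (which differs per scheme and is not modeled here), and the sample values assume 5 Kbytes = 40 000 bits.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times in seconds for the three schemes, idle-time term supplied by caller."""
    if M % X != 0:
        raise ValueError("the parallel formula assumes M/X is an integer")
    transmit = (M + 1) * O / R          # base page plus M images, each O bits
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }

# Illustrative values: O = 40 000 bits, R = 1 Mbps, RTT = 100 ms, M = 10, X = 5.
times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for scheme, seconds in times.items():
    print(f"{scheme:>24}: {seconds:.2f} s")
```

Ignoring idle times, the schemes differ only in how many RTTs they pay, so the gap between them widens as RTT grows relative to O/R.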


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
- multiplexing, demultiplexing
- reliable data transfer
- flow control
- congestion control

instantiation and implementation in the Internet:
- UDP
- TCP

Next:
- leaving the network “edge” (application, transport layers)
- into the network “core”

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3: Summary


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control


Principles of Congestion Control

Congestion:
    informally: "too many sources sending too much data too fast for network to handle"
    different from flow control
    manifestations:
        lost packets (buffer overflow at routers)
        long delays (queuing in router buffers)
    a top-10 problem


Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput
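The two observations for scenario 1 can be illustrated numerically. The sketch below is a toy model and not taken from the slides. It assumes the two senders share a link of capacity C equally, so per-connection throughput is capped at C/2. For delay it borrows an M/M/1-style formula, 1 / (C - 2·λin), purely as an assumed stand-in for "queuing delay grows without bound near capacity". The name `scenario1` is hypothetical.

```python
def scenario1(lambda_in, capacity):
    """Per-connection throughput and an illustrative queuing delay.

    lambda_in -- sending rate of each of the two senders
    capacity  -- capacity C of the shared outgoing link (same units)
    """
    share = capacity / 2  # two senders split the link evenly
    throughput = min(lambda_in, share)  # cannot exceed the fair share
    # Assumed M/M/1-style delay: blows up as total load 2*lambda_in nears C.
    if lambda_in >= share:
        delay = float("inf")
    else:
        delay = 1.0 / (capacity - 2 * lambda_in)
    return throughput, delay


for rate in (0.10, 0.25, 0.40, 0.49, 0.50):
    thr, d = scenario1(rate, capacity=1.0)
    print(f"lambda_in = {rate:.2f}: throughput = {thr:.2f}, delay = {d:.1f}")
```

Even with infinite buffers and no loss, throughput flattens at C/2 while the delay term keeps growing. That is the cost of congestion this scenario is meant to show.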


Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet


(a), (b) & (c): always λin = λout (goodput)
(a) Magic transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

Here λin is the original data sent by the application and λ'in is the offered load: original data plus retransmissions.

"costs" of congestion:
    (b) and (c): more work (retrans) for given "goodput"
    (c): unneeded retransmissions, link carries multiple copies of pkt

[Figure: three panels, (a), (b), (c)]
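The "more work for given goodput" cost can be made concrete with simple arithmetic: if each delivered packet is transmitted k times on average, goodput is the offered load (original data plus retransmissions) divided by k. The sketch below is illustrative only. The function name `goodput` and the sample values of k are assumptions, not figures from the slides.

```python
def goodput(offered_load, avg_transmissions):
    """Useful throughput when each packet is sent avg_transmissions times.

    offered_load      -- original data plus retransmissions entering the link
    avg_transmissions -- average number of times each packet is sent (>= 1)
    """
    if avg_transmissions < 1:
        raise ValueError("each packet is transmitted at least once")
    return offered_load / avg_transmissions


C = 1.0            # link capacity, normalised (assumption)
offered = C / 2    # each sender offers its full share of the link

print("(a) no retransmissions:       ", goodput(offered, 1.0))
print("(b) some loss, k = 1.5 (say): ", goodput(offered, 1.5))
print("(c) every packet sent twice:  ", goodput(offered, 2.0))
```

The link is equally busy in all three cases. Only the fraction of that work that counts as goodput changes.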


Causes/costs of congestion: scenario 3
• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when packet dropped, any "upstream" transmission capacity used for that packet was wasted


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact (small exception: see next page)


Case study: ATM ABR congestion control

Two-byte ER (explicit rate) field in RM cell:
• congested switch may lower ER value in cell
• sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
• signals congestion
• if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
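The ER and EFCI rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the `RMCell` class, the function names, and the rate values are hypothetical and are not part of the ATM specification or of the slides.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical model of a resource-management cell."""
    er: float          # explicit rate field (units are illustrative)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def forward_through_switches(cell, supportable_rates):
    """Each switch may lower ER to the rate it can support, so ER ends up
    as the minimum supportable rate on the path."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell preceding the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


# Assumed path with three switches supporting rates 80, 35, and 60.
cell = RMCell(er=100.0)
cell = forward_through_switches(cell, [80.0, 35.0, 60.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)  # 35.0 True
```

The sender would then read `er` and `ci` from the returned cell to pick its new rate; how it does so is outside this sketch.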


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP Congestion Control
• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

• w segments, each with MSS bytes, sent in one RTT:

$$\text{throughput} = \frac{w \cdot MSS}{RTT}\ \text{Bytes/sec}$$
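As a quick check of the formula above, a minimal sketch follows. The function name and the sample numbers (10 segments, 1000-byte MSS, 0.5 s RTT) are illustrative assumptions, not values from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Assumed example values: w = 10 segments, MSS = 1000 bytes, RTT = 0.5 s.
rate = window_throughput(w_segments=10, mss_bytes=1000, rtt_seconds=0.5)
print(rate)  # 20000.0 bytes/sec
```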


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

$$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$$

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events
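The sending constraint above can be sketched as a simple check. The function name and the byte counts are hypothetical, and the sketch ignores the receive window, in line with the simplifying assumption about RcvBuffer.

```python
def can_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """Return True if sending nbytes more keeps unacked data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= cong_win


# Assumed state: 2000 bytes in flight, CongWin = 3000 bytes.
print(can_send(4000, 2000, 3000, 1000))  # True: 3000 <= 3000
print(can_send(4000, 2000, 3000, 1500))  # False: 3500 > 3000
```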


TCP AIMD
• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window vs. time for a long-lived TCP connection; sawtooth pattern, with axis ticks at 8, 16, and 24 Kbytes]
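The sawtooth can be reproduced with a toy simulation. The loss model here is an assumption made only for illustration: a loss event is taken to occur whenever the window reaches a fixed size (`loss_at_mss`), which real networks do not guarantee. The window is counted in units of MSS.

```python
def aimd_trace(rtts, loss_at_mss, start_mss=1):
    """Record the window (in MSS) at the start of each RTT under AIMD.

    Assumed loss model: a loss event happens whenever the window
    reaches loss_at_mss.
    """
    window = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(window)
        if window >= loss_at_mss:
            window = max(1, window // 2)  # multiplicative decrease
        else:
            window += 1                   # additive increase: 1 MSS per RTT
    return trace


trace = aimd_trace(rtts=20, loss_at_mss=8, start_mss=4)
print(trace)  # the pattern 4, 5, 6, 7, 8 repeated four times
```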


                                                                                                                                                                                        TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
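The 20 kbps figure in the example follows from sending one MSS per RTT, converting bytes to bits:

```latex
\text{initial rate} = \frac{MSS}{RTT}
  = \frac{500 \times 8\ \text{bits}}{0.2\ \text{s}}
  = 20{,}000\ \text{bits/s}
  = 20\ \text{kbps}
```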


                                                                                                                                                                                        TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments, one batch per RTT]
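A short sketch shows why incrementing CongWin once per ACK doubles it every RTT. It assumes no loss and one ACK per segment (no delayed ACKs); the function name is hypothetical and the window is counted in units of MSS.

```python
def slow_start_windows(rtts, mss=1):
    """Return CongWin at the start of each RTT during slow start.

    Assumes every segment sent in an RTT is ACKed individually
    and that no loss occurs.
    """
    cong_win = mss
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks = cong_win // mss      # one ACK per segment sent this RTT
        for _ in range(acks):
            cong_win += mss         # increment CongWin for every ACK
    return windows


print(slow_start_windows(5))  # [1, 2, 4, 8, 16]
```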


So far:
• Slow-Start ramps up exponentially
• followed by AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


Reason for treating 3 dup ACKs differently than timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


                                                                                                                                                                                        Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                                        The Big Picture


TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
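The event table can be read as a small state machine. The sketch below follows the table's simplified Reno rules only; it is not a real TCP stack. The class name, method names, and the choice to count the window in units of MSS are assumptions made for illustration.

```python
MSS = 1  # assumption: measure CongWin and Threshold in units of one MSS


class RenoSender:
    """Toy model of the sender-side rules in the table above."""

    def __init__(self, threshold=64 * MSS):
        self.cong_win = 1 * MSS
        self.threshold = threshold
        self.state = "SS"       # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicate ACKs; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, MSS)  # never below 1 MSS
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = 1 * MSS
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(threshold=8)
for _ in range(8):
    s.on_new_ack()
after_slow_start = (s.cong_win, s.state)
for _ in range(3):
    s.on_duplicate_ack()
after_triple_dup = (s.cong_win, s.threshold, s.state)
s.on_timeout()
after_timeout = (s.cong_win, s.threshold, s.state)
print(after_slow_start, after_triple_dup, after_timeout)
```

In this run the window climbs from 1 to 9 MSS, crossing the threshold of 8 and entering congestion avoidance; the triple duplicate ACK halves it to 4.5, and the timeout resets it to 1 with the threshold at 2.25.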


                                                                                                                                                                                        TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is $W/RTT$
• Just after loss, window drops to $W/2$, throughput to $W/(2 \cdot RTT)$
• Average throughput: $0.75\,W/RTT$
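The 0.75 factor comes from averaging over one sawtooth cycle. Assuming the window grows linearly from $W/2$ back to $W$ between loss events, as additive increase implies, the mean window is the midpoint of the two:

```latex
\begin{aligned}
\text{average window} &= \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W \\
\text{average throughput} &= \frac{3}{4}\cdot\frac{W}{RTT} = 0.75\,\frac{W}{RTT}
\end{aligned}
```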


                                                                                                                                                                                        TCP Futures

• Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This gives $L = 2 \cdot 10^{-10}$. Wow
• New versions of TCP for high-speed needed
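Both numbers in the example can be checked directly. This sketch assumes MSS is converted to bits so that throughput is in bits per second, and it solves the loss-rate formula for L; the function names are hypothetical.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L (MSS in bits)."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(round(w))  # 83333
print(loss)      # about 2.1e-10
```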


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


                                                                                                                                                                                        Why is TCP fairTwo competing sessions

                                                                                                                                                                                        Additive increase gives slope of 1 as throughout increasesmultiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the equal bandwidth share line. The trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by factor of 2).]
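The convergence toward the equal-share line can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that the slide does not state: both sessions have the same RTT, see losses at the same instant, and increase by the same hypothetical `step` per round. The function name and the starting rates are made up for illustration.

```python
def aimd_two_flows(x1, x2, capacity, rounds=500, step=1.0):
    """Simulate two synchronized AIMD sessions sharing one bottleneck link."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss event: both sessions halve their rate (multiplicative decrease).
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both sessions add the same amount (additive increase).
            x1, x2 = x1 + step, x2 + step
    return x1, x2


# Start far from the fair share: session 1 has most of the bandwidth.
start_gap = abs(80.0 - 5.0)
x1, x2 = aimd_two_flows(x1=80.0, x2=5.0, capacity=100.0)
final_gap = abs(x1 - x2)
print(f"gap between sessions: {start_gap:.1f} -> {final_gap:.6f}")
```

Additive increase leaves the difference between the two rates unchanged, while every halving cuts that difference in half, so the gap shrinks toward zero over repeated loss events.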


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  Nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets more than R/2
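The arithmetic behind the parallel-connection example can be checked directly. The sketch below assumes the idealized case in which every TCP connection on the link receives an equal share of R; the function name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing, opened):
    """Fraction of link rate R obtained by an app that opens `opened`
    connections when `existing` connections already use the link,
    assuming every connection gets an equal share."""
    return Fraction(opened, existing + opened)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

With 11 connections out of 20 the new app obtains 11R/20, slightly more than half of the link, although it is only one of ten applications.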


TCP Latency Modeling

Notation, assumptions
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  No retransmissions (no loss, no corruption)

Window size
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object
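The two cases can be folded into one function. This is a minimal sketch, assuming that K is computed as the object size divided by the window size in bits, rounded up; the function name and the sample numbers are illustrative, not taken from the slides.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) for an object of O bits, segments of S bits,
    link rate R bits/s, and a fixed congestion window of W segments."""
    base = 2 * RTT + O / R
    # Time the server waits after each window before the first ACK arrives.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        # Case 1: ACKs return before the window is exhausted, no stalling.
        return base
    # Case 2: the server stalls after each window except the last one.
    K = math.ceil(O / (W * S))  # assumed: number of windows covering the object
    return base + (K - 1) * stall


# Hypothetical numbers: 120 kbit object, 8 kbit segments, 1 Mbps, RTT 100 ms.
small_window = fixed_window_latency(O=120_000, S=8_000, R=1_000_000, RTT=0.1, W=4)
large_window = fixed_window_latency(O=120_000, S=8_000, R=1_000_000, RTT=0.1, W=20)
print(small_window, large_window)
```

With W = 4 the server stalls three times (K = 4), while with W = 20 the window is large enough that only the 2RTT + O/R terms remain.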


Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Server waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
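The closed-form expression can be evaluated numerically. The sketch below is one possible reading of the definitions: K is found by searching for the smallest k whose windows 1, 2, 4, ... cover the object, and Q by counting the windows after which the server would still have to wait for an ACK. The function name and the sample values are assumptions chosen so that O/S = 15.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for the slow-start latency model."""
    # K: smallest k such that windows of 1, 2, ..., 2^(k-1) segments cover O.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would idle if the object
    # were infinite. The idle time after window k is S/R + RTT - 2^(k-1) S/R.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# Hypothetical values: 15 segments of 8 kbit, 1 Mbps link, RTT = 16 ms.
latency, K, Q, P = slow_start_latency(O=120_000, S=8_000, R=1_000_000, RTT=0.016)
print(latency, K, Q, P)
```

For these values K = 4, Q = 2 and P = 2, so the server idles twice before the window becomes large enough to keep the link busy.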


TCP Latency Modeling: Slow Start (2)

[Figure: timeline at client and at server. Client initiates TCP connection (one RTT), then requests object; server sends first window (S/R), second window (2S/R), third window (4S/R), fourth window (8S/R); transmission completes and the object is delivered.]

Example:
  O/S = 15 segments
  K = 4 windows
  Q = 2
  P = min{K-1, Q} = 2
  Server idles P = 2 times

Delay components:
  2 RTT for connection establishment and request
  O/R to transmit object
  time server idles due to slow start

Server idles P = min{K-1, Q} times


TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server timeline showing TCP connection setup (one RTT), the object request, and windows of S/R, 2S/R, 4S/R and 8S/R until the object is delivered.]
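The per-window idle times in this derivation can also be summed explicitly instead of using the closed form. This is a rough sketch; the helper name `idle_times` and the sample values (15 segments of 8 kbit, 1 Mbps, RTT = 16 ms, K = 4) are assumptions for illustration.

```python
def idle_times(S, R, RTT, K):
    """Idle time (seconds) after each of the first K-1 windows.
    The positive part [x]^+ is applied with max(0, x)."""
    return [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)]


S, R, RTT, O, K = 8_000, 1_000_000, 0.016, 120_000, 4
idles = idle_times(S, R, RTT, K)
delay = 2 * RTT + O / R + sum(idles)
print(idles, delay)
```

Only the first two windows leave the server waiting; after the third window the ACKs arrive while the server is still transmitting, so that term is clamped to zero. The total matches the closed-form expression with P = 2.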


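TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.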
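The first and last lines of this derivation can be compared numerically. The following sketch implements both the defining minimum and the closed form; the function names are hypothetical, and the closed form relies on floating-point `log2`, which could misround for extremely large ratios.

```python
import math


def windows_by_search(O, S):
    """K from the definition: smallest k with (2^k - 1) * S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def windows_closed_form(O, S):
    """K from the closed form: ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


for segments in (1, 15, 16, 100):
    print(segments, windows_by_search(segments, 1), windows_closed_form(segments, 1))
```

An object of 15 segments fits exactly into windows of 1 + 2 + 4 + 8 segments, so K = 4, while a 16th segment forces a fifth window.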


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
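The three response-time expressions can be compared side by side. The sketch below is a simplification: the slow-start idle times differ between the schemes, so they are passed in as one hypothetical `idle` parameter that defaults to zero. The function name, the 1 Mbps link rate and the reading of 5 Kbytes as 40,000 bits are assumptions for illustration.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response time (seconds) of the three HTTP models, excluding or
    including a caller-supplied sum of idle times."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # base page plus M images, each of O bits
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }


times = http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

With idle times ignored, the schemes differ only in how many round trips they pay: 22 for non-persistent, 3 for persistent, and 6 for parallel non-persistent connections.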


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time, 0 to 70 seconds, at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay×bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"


Principles of Congestion Control

Congestion:
• informally: "too many sources sending too much data too fast for the network to handle"
• different from flow control
• manifestations:
  • lost packets (buffer overflow at routers)
  • long delays (queuing in router buffers)
• a top-10 problem

Causes/costs of congestion: scenario 1
• two senders, two receivers
• one router, infinite buffers
• no retransmission
• send rate: 0 to C/2
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2
• one router, finite buffers
• sender retransmission of lost packet

Here λin is the rate of original data, λ'in the offered load (original data plus retransmissions), and λout the goodput.
• (a), (b) & (c): always λin = λout (goodput)
• (a) "magic" transmission: only send when there's space in the buffer
• (b) "perfect" retransmission, only when loss: λ'in > λout
• (c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for a given "goodput"
• (c): unneeded retransmissions; link carries multiple copies of a packet

[Figure: three panels labeled (a), (b), (c)]

Causes/costs of congestion: scenario 3
• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any upstream transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at

Case study: ATM ABR congestion control

ABR: available bit rate
• "elastic service"
• if sender's path is "underloaded":
  • sender should use available bandwidth
• if sender's path is congested:
  • sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next slide

Case study: ATM ABR congestion control

Two-byte ER (explicit rate) field in RM cell:
• congested switch may lower ER value in cell
• sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by a congested switch
• signals congestion
• if the data cell preceding an RM cell has EFCI = 1, destination sets CI bit = 1 before returning the RM cell to the source
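The two feedback paths above (ER lowered hop by hop, EFCI turned into CI at the destination) can be sketched in a few lines. This is only an illustration: the function name, its parameters, and the per-switch rate list are hypothetical and do not reflect real ATM cell formats.

```python
def returned_rm_cell(initial_er, switch_supportable_rates, efci_seen):
    """Return the (ER, CI) pair carried back to the source in an RM cell.

    initial_er: rate the sender wrote into the ER field (hypothetical units).
    switch_supportable_rates: rate each switch on the path can support.
    efci_seen: True if the data cell preceding the RM cell had EFCI = 1.
    """
    er = initial_er
    for rate in switch_supportable_rates:
        # A congested switch may lower ER, so ER ends up as the path minimum.
        er = min(er, rate)
    # Destination sets CI = 1 when the preceding data cell carried EFCI = 1.
    ci = 1 if efci_seen else 0
    return er, ci
```

For example, with an initial ER of 100 and switches supporting 80, 50, and 90, the returned ER is 50, the tightest rate on the path.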

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control
• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion
• w segments, each with MSS bytes, sent in one RTT:

  throughput = (w × MSS) / RTT bytes/sec
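As a quick check of the window throughput formula, the sketch below evaluates it directly. The function name and the sample values are illustrative assumptions, not taken from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes each are sent per RTT."""
    if rtt_seconds <= 0:
        raise ValueError("RTT must be positive")
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical values: 10 segments of 1000 bytes with a 0.5 s RTT.
print(window_throughput(10, 1000, 0.5))  # 20000.0
```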

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does the sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative after timeout events
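The sender-side limit can be expressed as a simple predicate. This is a minimal sketch under the slide's assumption that the receive buffer never constrains the sender; the function and parameter names are hypothetical.

```python
def can_send(last_byte_sent, last_byte_acked, cong_win, nbytes):
    """True if sending nbytes more keeps LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= cong_win
```

With 1000 bytes in flight and CongWin = 2000 bytes, one more 1000-byte segment fits but a 1001-byte segment does not.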

TCP AIMD

Multiplicative decrease: cut CongWin in half after a loss event

Additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
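The sawtooth produced by AIMD can be reproduced with a toy simulation. The loss model here is an assumption made purely for illustration: a loss event is taken to occur whenever the window reaches a fixed size. Window sizes are counted in units of MSS.

```python
def aimd_trace(start_mss, loss_at_mss, rtts):
    """CongWin (in MSS) at the start of each RTT under a toy loss model.

    Assumption: a loss event happens whenever CongWin reaches loss_at_mss.
    """
    cong_win = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            # Multiplicative decrease: halve the window, never below 1 MSS.
            cong_win = max(1, cong_win // 2)
        else:
            # Additive increase: one more MSS per RTT.
            cong_win += 1
    return trace


print(aimd_trace(4, 8, 10))  # [4, 5, 6, 7, 8, 4, 5, 6, 7, 8]
```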

TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes & RTT = 200 msec
• initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment, then two, then four to Host B in successive RTTs]
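The link between per-ACK increments and per-RTT doubling can be checked with a short sketch. It assumes one ACK per segment, no delayed ACKs, and no loss; the function names are hypothetical, and the 500-byte MSS and 200 ms RTT are the slide's example values.

```python
def slow_start_windows(rtts, mss_bytes=500):
    """CongWin in bytes at the start of each RTT during slow start."""
    cong_win = mss_bytes
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        # Assumption: every segment sent this RTT is ACKed individually.
        acks = cong_win // mss_bytes
        for _ in range(acks):
            cong_win += mss_bytes  # +1 MSS per ACK received
    return windows


def rate_bps(cong_win_bytes, rtt_ms):
    """Sending rate in bits per second for one window per RTT."""
    return cong_win_bytes * 8 * 1000 / rtt_ms


print(slow_start_windows(4))  # [500, 1000, 2000, 4000]
print(rate_bps(500, 200))     # 20000.0, i.e. the 20 kbps initial rate
```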

So far:
• Slow-Start ramps up exponentially
• Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• Introduce new variable: threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                                          The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
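The event table maps naturally onto a small state machine. The sketch below is a simplified illustration, not a real TCP implementation: windows are counted in units of one MSS, the initial threshold of 64 MSS is an arbitrary stand-in for "very large", and resetting the duplicate-ACK counter on a new ACK is an assumption the table does not spell out. The class and method names are hypothetical.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: measure CongWin and Threshold in units of one MSS


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64.0  # arbitrary stand-in for "initially very large"
    state: str = "SS"        # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: counter restarts on new data ACKed
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Adds up to roughly 1 MSS per RTT across a full window of ACKs.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_dup_ack()

    def on_triple_dup_ack(self):
        """Fast recovery: multiplicative decrease, stay out of slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        """Timeout: halve the threshold and restart from 1 MSS in slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

A TCP Tahoe variant of this sketch would call `on_timeout()` from `on_dup_ack()` instead of `on_triple_dup_ack()`, since Tahoe treats both loss events the same way.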

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

- Ignore slow start.
- Let W be the window size when loss occurs.
- When the window is W, throughput is W/RTT.
- Just after loss, the window drops to W/2 and throughput to W/(2 RTT).
- Average throughput: 0.75 W/RTT (the window ramps linearly from W/2 back up to W, so its mean is 0.75 W).
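
The 0.75 factor can be checked numerically by averaging the window over one sawtooth ramp. This is a minimal sketch: the function name and the sample values of W and RTT are hypothetical, and the linear ramp is the same idealisation the slide uses.

```python
def average_sawtooth_throughput(w_max: float, rtt: float, steps: int = 1000) -> float:
    """Average throughput (segments/s) while the window ramps linearly from W/2 to W."""
    # Sample the window at evenly spaced points along one ramp, endpoints included.
    windows = [w_max / 2 + (w_max / 2) * i / steps for i in range(steps + 1)]
    mean_window = sum(windows) / len(windows)
    return mean_window / rtt


W = 100.0   # hypothetical window at loss, in segments
RTT = 0.1   # hypothetical round-trip time, in seconds
print(average_sawtooth_throughput(W, RTT))  # numerical average
print(0.75 * W / RTT)                       # closed form from the slide
```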

                                                                                                                                                                                          TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

- This requires L = 2·10^-10. Wow!
- New versions of TCP are needed for high speed.
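
Both numbers in this example follow from the formulas on the slide. The sketch below uses hypothetical function names: it computes W as the bandwidth-delay product in segments, and inverts the loss-rate throughput formula to solve for L.

```python
def required_window(target_bps: float, rtt: float, mss_bits: float) -> float:
    """In-flight segments needed to sustain target_bps: W = throughput * RTT / MSS."""
    return target_bps * rtt / mss_bits


def required_loss_rate(target_bps: float, rtt: float, mss_bits: float) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bits / (rtt * target_bps)) ** 2


MSS_BITS = 1500 * 8   # 1500-byte segments
RTT = 0.100           # 100 ms
TARGET = 10e9         # 10 Gbps

print(round(required_window(TARGET, RTT, MSS_BITS)))      # about 83,333 segments
print(f"{required_loss_rate(TARGET, RTT, MSS_BITS):.2e}")  # about 2e-10
```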

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]

Why is TCP fair?

Two competing sessions:

- Additive increase gives a slope of 1 as throughput increases
- Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes marked at R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
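
The convergence argument can be illustrated with a small simulation. It makes simplifying assumptions that are not in the slides: both sessions see every loss at the same time, they have equal RTTs, and a loss occurs whenever their combined rate exceeds the link capacity. The function name and starting rates are hypothetical.

```python
def simulate_aimd(x1: float, x2: float, capacity: float,
                  rounds: int = 1000, increase: float = 1.0) -> tuple[float, float]:
    """Run additive-increase / multiplicative-decrease for two competing sessions."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both sessions halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: both add the same amount, so the gap is unchanged.
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


a, b = simulate_aimd(10.0, 80.0, capacity=100.0)
print(a, b, abs(a - b))  # the two rates end up essentially equal
```

Additive increase moves the operating point parallel to the equal-share line, and each multiplicative decrease halves the distance to it, so the difference between the rates shrinks geometrically.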

Fairness (more)

Fairness and UDP

- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
- Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections

- Nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP connection, gets rate R/10
  - new app asks for 11 TCP connections, gets 11R/20, i.e. about R/2
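
The arithmetic in the example can be made explicit. This sketch assumes each TCP connection receives an equal share of the link, which is the idealised fairness goal rather than a guarantee. The function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing: int, new: int) -> Fraction:
    """Fraction of link rate R obtained by an app that opens `new` connections
    on a link already carrying `existing` connections (equal per-connection shares)."""
    return Fraction(new, existing + new)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```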

TCP Latency Modeling

Notation, assumptions

- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- No retransmissions (no loss, no corruption)

Window size

- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:

- TCP connection establishment
- data transmission delay
- slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object
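
The two cases can be folded into one function. This is a sketch with a hypothetical function name and sample values. It assumes K = ceil(O / (W·S)), the number of fixed-size windows needed to cover the object, which these slides do not state explicitly for the fixed-window case.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for an object of O bits with a fixed congestion window of W segments."""
    # Assumption: K is the number of W-segment windows that cover the object.
    K = math.ceil(O / (W * S))
    # Time the server stalls after each window while waiting for the first ACK.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        # Case 1: the ACK returns before the window is exhausted, so no stalling.
        return 2 * RTT + O / R
    # Case 2: the server stalls after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * stall


# Hypothetical numbers: 80,000-bit object, 8,000-bit segments, 1 Mbps link, 100 ms RTT.
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, RTT=0.1, W=4))   # case 2
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, RTT=0.1, W=20))  # case 1
```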

                                                                                                                                                                                          Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2RTT + O/R

                                                                                                                                                                                          Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server must wait for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

                                                                                                                                                                                          TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R} \]

where P is the number of times TCP idles at the server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server would idle if the object were of infinite size
- and K is the number of windows that cover the object

                                                                                                                                                                                          TCP Latency Modeling Slow Start (2)

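[Figure: client/server timing diagram with "time at client" and "time at server" axes. Events in order: initiate TCP connection (one RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]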

Example:

- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:

- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times.
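
The example's values of K, Q and P can be reproduced by direct counting. This is a sketch with hypothetical function names. It counts a window as causing an idle period only when the idle time S/R + RTT - 2^(k-1)·S/R is strictly positive. It also picks RTT = 2·S/R, which is one assumed choice that yields Q = 2 (the slide does not give RTT or S/R).

```python
def windows_to_cover(segments: int) -> int:
    """K: number of slow-start windows (1, 2, 4, ... segments) needed to cover the object."""
    k, sent = 0, 0
    while sent < segments:
        sent += 2 ** k  # window k+1 carries 2^k segments
        k += 1
    return k


def idles_if_infinite(s_over_r: float, rtt: float) -> int:
    """Q: number of windows after which the server would idle for an infinite object.

    Expects s_over_r > 0, otherwise the loop below would not terminate.
    """
    q = 0
    # Idle time after window k = q + 1 is S/R + RTT - 2^(k-1) * S/R.
    while s_over_r + rtt - (2 ** q) * s_over_r > 0:
        q += 1
    return q


K = windows_to_cover(15)                        # O/S = 15 segments
Q = idles_if_infinite(s_over_r=1.0, rtt=2.0)    # assumed: RTT = 2 * S/R
P = min(K - 1, Q)
print(K, Q, P)  # 4 2 2
```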

                                                                                                                                                                                          TCP Latency Modeling (3)

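\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
\]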

[Figure: client/server timing diagram showing connection setup, the object request, and windows of S/R, 2S/R, 4S/R and 8S/R, ending when transmission is complete and the object is delivered.]
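
The closed form can be checked against the direct sum of idle times. This is a sketch with hypothetical function names and sample values (S/R = 1 s, RTT = 2 s, O/S = 15), so that K = 4 and P = 2 as in the earlier example. Both K and P are passed in rather than derived.

```python
def latency_by_summing_idle_times(O: float, S: float, R: float, RTT: float, K: int) -> float:
    """2 RTT + O/R plus the idle time after each of the first K-1 windows."""
    idle = sum(max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K))
    return 2 * RTT + O / R + idle


def latency_closed_form(O: float, S: float, R: float, RTT: float, P: int) -> float:
    """Closed form: 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical values: 1000-bit segments on a 1000 bps link, 15-segment object.
args = dict(O=15_000.0, S=1_000.0, R=1_000.0, RTT=2.0)
print(latency_by_summing_idle_times(**args, K=4))  # 22.0
print(latency_closed_form(**args, P=2))            # 22.0
```

The two agree because clamping idle times at zero is equivalent to truncating the sum at P terms, and the geometric series over the first P window transmission times collapses to (2^P - 1)·S/R.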

TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

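$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
$$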

Calculation of Q, the number of idles for an infinite-size object, is similar.
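The closed form for K can be checked against its definition with a short sketch. The function names and the 536-byte segment size used in the example are illustrative assumptions, not values taken from this slide; O and S are the object size and segment size in bits, as in the latency model.

```python
import math


def windows_by_definition(object_bits, segment_bits):
    """Smallest k such that windows of 1, 2, 4, ... segments cover the object."""
    k = 0
    covered_bits = 0
    while covered_bits < object_bits:
        covered_bits += (2 ** k) * segment_bits  # the (k+1)-th window holds 2^k segments
        k += 1
    return k


def windows_closed_form(object_bits, segment_bits):
    """K = ceil(log2(O/S + 1)), the last line of the derivation."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


# Hypothetical example: S = 536 bytes, objects of 15 and 16 segments.
S = 536 * 8
for segments in (15, 16):
    O = segments * S
    print(segments, windows_by_definition(O, S), windows_closed_form(O, S))
```

An object of exactly 15 segments fits in 4 windows (1 + 2 + 4 + 8), while one more segment forces a fifth window, which is the step the ceiling captures.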

HTTP Modeling

Assume Web page consists of:
    1 base HTML page (of size O bits)
    M images (each of size O bits)

Non-persistent HTTP:
    M+1 TCP connections in series
    Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
    2 RTT to request and receive base HTML file
    1 RTT to request and receive M images
    Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
    Suppose M/X integer
    1 TCP connection for base file
    M/X sets of parallel connections for images
    Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
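The three response-time formulas can be evaluated side by side. The following is a minimal sketch: the function name is hypothetical, the idle-time term is left out (it has to be added separately from the latency model), and the example assumes 1 Kbyte = 1000 bytes.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds for the three HTTP variants, excluding idle times.

    O: object size in bits, R: link rate in bits/s, RTT: round-trip time in seconds,
    M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the formula assumes M/X is an integer")
    transmission = (M + 1) * O / R  # base page plus M images, each O bits
    return {
        "non-persistent": transmission + (M + 1) * 2 * RTT,
        "persistent": transmission + 3 * RTT,
        "parallel non-persistent": transmission + (M // X + 1) * 2 * RTT,
    }


# Example: RTT = 100 msec, O = 5 Kbytes (taken as 40,000 bits), M = 10, X = 5, R = 1 Mbps.
times = http_response_times(O=5 * 1000 * 8, R=1e6, RTT=0.1, M=10, X=5)
for variant, seconds in times.items():
    print(f"{variant}: {seconds:.2f} s")
```

With these inputs the sketch gives about 2.64 s, 0.74 s and 1.04 s respectively, before any idle time is added.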

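HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP]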

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

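HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP]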

For larger RTT, response time dominated by TCP establishment & slow start delays.

Persistent connections now give important improvement, particularly in high delay•bandwidth networks.
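The two observations (transmission time dominating at low bandwidth, round trips dominating at large RTT) can be illustrated with a rough sketch. It ignores idle times entirely, so its numbers will not match the charts exactly. The function names are hypothetical, and 5 Kbytes is taken as 40,000 bits.

```python
def transmission_share(O, R, RTT, M):
    """Fraction of non-persistent response time spent transmitting (idle times ignored)."""
    transmission = (M + 1) * O / R
    round_trips = (M + 1) * 2 * RTT
    return transmission / (transmission + round_trips)


def persistent_saving(RTT, M, X):
    """Seconds saved by persistent HTTP over parallel non-persistent (idle times ignored)."""
    return (M // X + 1) * 2 * RTT - 3 * RTT


O_BITS = 5 * 1000 * 8  # assumption: 1 Kbyte = 1000 bytes
RATES = {"28 Kbps": 28e3, "100 Kbps": 100e3, "1 Mbps": 1e6, "10 Mbps": 10e6}

shares = {
    rtt: {name: transmission_share(O_BITS, rate, rtt, M=10) for name, rate in RATES.items()}
    for rtt in (0.1, 1.0)
}
for rtt, by_rate in shares.items():
    print(f"RTT = {rtt} s, persistent saves {persistent_saving(rtt, 10, 5):.1f} s over parallel")
    for name, share in by_rate.items():
        print(f"  {name:>8}: {share:6.1%} of non-persistent time is transmission")
```

Under these assumptions the saving from persistence is 3 RTT regardless of link rate, which is small next to the transmission time of a slow link but large once the RTT itself is one second.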

Chapter 3 Summary

principles behind transport layer services:
    multiplexing, demultiplexing
    reliable data transfer
    flow control
    congestion control

instantiation and implementation in the Internet:
    UDP
    TCP

Next:
    leaving the network "edge" (application, transport layers)
    into the network "core"

                                                                                                                                                                                          • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • Transport services and protocols
                                                                                                                                                                                          • Transport vs network layer
                                                                                                                                                                                          • Transport-layer protocols
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • Multiplexingdemultiplexing
                                                                                                                                                                                          • Multiplexingdemultiplexing
                                                                                                                                                                                          • How demultiplexing works
                                                                                                                                                                                          • Connectionless demultiplexing
                                                                                                                                                                                          • Connectionless demux (cont)
                                                                                                                                                                                          • Connection-oriented demux
                                                                                                                                                                                          • Connection-oriented demux (cont)
                                                                                                                                                                                          • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                                          • UDP more
                                                                                                                                                                                          • UDP checksum
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • Principles of Reliable data transfer
                                                                                                                                                                                          • Reliable data transfer getting started
                                                                                                                                                                                          • Reliable data transfer getting started
                                                                                                                                                                                          • Incremental Improvements
                                                                                                                                                                                          • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                                          • Rdt20 channel with bit errors
                                                                                                                                                                                          • rdt20 FSM specification
                                                                                                                                                                                          • rdt20 operation with no errors
                                                                                                                                                                                          • rdt20 error scenario
                                                                                                                                                                                          • rdt20 has a fatal flaw
                                                                                                                                                                                          • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                          • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                          • rdt21 discussion
                                                                                                                                                                                          • rdt22 a NAK-free protocol
                                                                                                                                                                                          • rdt22 sender receiver fragments
                                                                                                                                                                                          • rdt30 channels with errors and loss
                                                                                                                                                                                          • rdt30 sender
                                                                                                                                                                                          • rdt30 in action
                                                                                                                                                                                          • rdt30 in action
                                                                                                                                                                                          • Performance of rdt30
                                                                                                                                                                                          • rdt30 stop-and-wait operation
                                                                                                                                                                                          • Pipelined protocols
                                                                                                                                                                                          • Pipelined protocols
                                                                                                                                                                                          • Pipelining increased utilization
                                                                                                                                                                                          • Go-Back-N
                                                                                                                                                                                          • GBN Sender
                                                                                                                                                                                          • GBN sender extended FSM
                                                                                                                                                                                          • GBN receiver extended FSM
                                                                                                                                                                                          • More on receiver
                                                                                                                                                                                          • GBN inaction
                                                                                                                                                                                          • Selective Repeat
                                                                                                                                                                                          • Selective repeat sender receiver windows
                                                                                                                                                                                          • Selective repeat
                                                                                                                                                                                          • Selective repeat in action
                                                                                                                                                                                          • Selective repeat dilemma
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                          • More TCP Details
                                                                                                                                                                                          • Even More TCP Details
                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                          • TCP seq rsquos and ACKs
                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                          • Example RTT estimation
                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • TCP reliable data transfer
                                                                                                                                                                                          • TCP sender events
                                                                                                                                                                                          • TCP sender(simplified)
                                                                                                                                                                                          • TCP retransmission scenarios
                                                                                                                                                                                          • TCP retransmission scenarios (more)
                                                                                                                                                                                          • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                          • More on Sender Policies
                                                                                                                                                                                          • Fast Retransmit
                                                                                                                                                                                          • Fast retransmit algorithm
                                                                                                                                                                                          • TCP GBN or Selective Repeat
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                          • TCP Flow control how it works
                                                                                                                                                                                          • Technical Issue
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • TCP Connection Management
                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                          • A few special cases
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                          • TCP Congestion Control
                                                                                                                                                                                          • TCP AIMD
                                                                                                                                                                                          • TCP Slow Start
                                                                                                                                                                                          • TCP Slow Start (more)
                                                                                                                                                                                          • Summary TCP Congestion Control
                                                                                                                                                                                          • The Big Picture
                                                                                                                                                                                          • TCP sender congestion control
                                                                                                                                                                                          • TCP throughput
                                                                                                                                                                                          • TCP Futures
                                                                                                                                                                                          • TCP Fairness
                                                                                                                                                                                          • Why is TCP fair
                                                                                                                                                                                          • Fairness (more)
                                                                                                                                                                                          • TCP Latency Modeling
                                                                                                                                                                                          • Fixed Congestion Window (W)
                                                                                                                                                                                          • Fixed congestion window (1)
                                                                                                                                                                                          • Fixed congestion window (2)
                                                                                                                                                                                          • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                          • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                          • TCP Latency Modeling (3)
                                                                                                                                                                                          • TCP Latency Modeling (4)
                                                                                                                                                                                          • HTTP Modeling
                                                                                                                                                                                          • Chapter 3 Summary

Causes/costs of congestion: scenario 1

two senders, two receivers
one router, infinite buffers
no retransmission
send rate: 0 to C/2

large delays when congested
maximum achievable throughput

Causes/costs of congestion: scenario 2

one router, finite buffers
sender retransmission of lost packet

(a), (b) & (c): always $\lambda_{in} = \lambda_{out}$ (goodput)
(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$, where $\lambda'_{in}$ is the offered load (original plus retransmitted data)
(c) retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$

"costs" of congestion:
(b) and (c): more work (retransmissions) for given "goodput"
(c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels, (a), (b), (c)]

Causes/costs of congestion: scenario 3

four senders
multihop paths
timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion: when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted.

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP

Network-assisted congestion control:
routers provide feedback to end systems
• single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
• explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches ("network-assisted")
• NI bit: no increase in rate (mild congestion)
• CI bit: severe congestion indicator
RM cells returned to sender by receiver, with bits intact
• small exception: see next slide

ABR: available bit rate:
"elastic service"
if sender's path "underloaded": sender should use available bandwidth
if sender's path congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell:
congested switch may lower ER value in cell
sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by congested switch
signals congestion
if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
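The handling of the ER field and the EFCI/CI bits can be sketched in a few lines of Python. This is only an illustration of the rules on the slide, not an ATM implementation: the `RMCell` class, the function names, and the numeric rates are all hypothetical, and rates are in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical model of an RM cell; field names follow the slide."""
    er: float          # explicit rate requested by the sender
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def pass_through_switches(cell, supportable_rates):
    """Each switch on the path may lower ER to the rate it can support."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


# Assumed example values: three switches that support 80, 45 and 60 units.
cell = RMCell(er=100.0)
cell = pass_through_switches(cell, [80.0, 45.0, 60.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)
```

When the cell returns to the sender, its ER holds the smallest rate any switch on the path was willing to support, and CI reports the congestion that was signalled through EFCI.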

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin spanning a window of segments]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot MSS}{RTT}$ bytes/sec
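As a quick numeric check of the throughput formula, the sketch below evaluates it for one set of assumed values (a window of 10 segments, MSS = 500 bytes, RTT = 200 msec). The function name and the window size are illustrative choices, not taken from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Assumed example: 10 segments of 500 bytes per 0.2 s round trip.
rate = window_throughput(w_segments=10, mss_bytes=500, rtt_seconds=0.2)
print(f"{rate:.0f} bytes/sec")
```

With these numbers the sender moves 5000 bytes per round trip, which works out to about 25,000 bytes/sec.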

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after loss event

three mechanisms:
AIMD = Additive Increase, Multiplicative Decrease
slow start = CongWin set to 1 and then grows exponentially
conservative after timeout events
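The window constraint above can be read as a check the sender performs before transmitting. The following is a minimal sketch under the slide's assumption that the receive buffer never limits the sender; the function names and byte counts are hypothetical.

```python
def usable_window(last_byte_sent, last_byte_acked, cong_win):
    """Bytes the sender may still put in flight under CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


def can_send(last_byte_sent, last_byte_acked, cong_win, segment_bytes):
    """True if sending segment_bytes keeps LastByteSent - LastByteAcked <= CongWin."""
    return segment_bytes <= usable_window(last_byte_sent, last_byte_acked, cong_win)


# Assumed example: 2000 bytes in flight against a 4000-byte CongWin.
print(usable_window(5000, 3000, 4000))
print(can_send(5000, 3000, 4000, 500))
print(can_send(5000, 3000, 4000, 2500))
```

Shrinking CongWin after a loss event lowers the usable window, which is how the sender's rate is reduced.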

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
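The sawtooth produced by AIMD can be reproduced with a per-RTT sketch. Everything here is a simplification for illustration: the window is counted in whole MSS units, halving uses integer division, and the rounds in which losses occur are chosen arbitrarily.

```python
def aimd_trace(start_mss, rounds, loss_rounds):
    """Return CongWin (in MSS) at the start of each RTT under AIMD."""
    cong_win = start_mss
    trace = []
    for r in range(rounds):
        trace.append(cong_win)
        if r in loss_rounds:
            cong_win = max(1, cong_win // 2)   # multiplicative decrease
        else:
            cong_win += 1                      # additive increase: +1 MSS per RTT
    return trace


# Assumed example: start at 8 MSS, losses detected in rounds 4 and 8.
trace = aimd_trace(start_mss=8, rounds=10, loss_rounds={4, 8})
print(trace)
```

The window climbs by one MSS per round and is cut in half after each loss round, which gives the sawtooth shape of a long-lived connection.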

TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes & RTT = 200 msec
• initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to respectable rate
When connection begins, increase rate exponentially fast until first loss event
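The 20 kbps figure follows from sending one 500-byte segment per 200 msec round trip. The sketch below reproduces that arithmetic and, assuming the window doubles every RTT with no loss, counts how many round trips are needed to reach a target rate. The 1 Mbps target and the function names are illustrative assumptions.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Rate in bits/sec when CongWin = 1 MSS."""
    return mss_bytes * 8 / rtt_seconds


def rtts_to_reach(target_bps, mss_bytes, rtt_seconds):
    """Count window doublings until the window rate reaches target_bps (no loss assumed)."""
    cong_win = 1  # in MSS
    rtts = 0
    while cong_win * mss_bytes * 8 / rtt_seconds < target_bps:
        cong_win *= 2
        rtts += 1
    return rtts


print(f"{initial_rate_bps(500, 0.2):.0f} bps")
print(rtts_to_reach(1_000_000, 500, 0.2))
```

Starting from 20 kbps, six doublings bring the window to 64 MSS, the first window whose rate exceeds the assumed 1 Mbps target.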

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin for every ACK received
Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; one segment is sent in the first RTT, then two segments, then four segments]

So far:
Slow-Start: ramps up exponentially
Followed by AIMD: sawtooth pattern

Reality (TCP Reno):
Introduce new variable, threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:
• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
• Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start

Reason for treating 3 dup ACKs differently than a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
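The difference between Reno and Tahoe in reacting to a loss event can be written as a single function. This is a sketch of the rules as stated on these slides, in whole MSS units; the function name, the string labels for events and variants, and the floor of 1 MSS on the threshold are assumptions made for the example.

```python
def on_loss(event, cong_win, variant="reno"):
    """Return (new_cong_win, new_threshold) in MSS after a loss event."""
    if event not in ("3dupack", "timeout"):
        raise ValueError(f"unknown loss event: {event}")
    threshold = max(cong_win // 2, 1)          # threshold = CongWin/2 (floor of 1 assumed)
    if event == "3dupack" and variant == "reno":
        return threshold, threshold            # fast recovery: skip slow start
    return 1, threshold                        # timeout, or any loss in Tahoe


print(on_loss("3dupack", 16, "reno"))
print(on_loss("timeout", 16, "reno"))
print(on_loss("3dupack", 16, "tahoe"))
```

With a 16-MSS window, Reno resumes at 8 MSS after 3 dup ACKs, while a timeout (or any loss under Tahoe) restarts from 1 MSS with the threshold at 8.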

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
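The four rules in this summary can be combined into a per-RTT trace of the window. The sketch below is a simplification under stated assumptions: the window is in whole MSS units, slow-start growth is capped at Threshold, CongWin equal to Threshold is treated as congestion avoidance, and the rounds with loss events are picked arbitrarily. The function name and event labels are hypothetical.

```python
def reno_trace(rounds, threshold, events):
    """Return CongWin (in MSS) at the start of each RTT for a Reno-style sender.

    events maps a round number to "3dupack" or "timeout".
    """
    cong_win = 1
    trace = []
    for r in range(rounds):
        trace.append(cong_win)
        event = events.get(r)
        if event == "timeout":
            threshold = max(cong_win // 2, 1)
            cong_win = 1                                  # back to slow start
        elif event == "3dupack":
            threshold = max(cong_win // 2, 1)
            cong_win = threshold                          # Reno: continue in AIMD
        elif cong_win < threshold:
            cong_win = min(cong_win * 2, threshold)       # slow start: double per RTT
        else:
            cong_win += 1                                 # congestion avoidance: +1 MSS
    return trace


# Assumed example: Threshold starts at 8 MSS, losses in rounds 6 and 9.
trace = reno_trace(rounds=12, threshold=8, events={6: "3dupack", 9: "timeout"})
print(trace)
```

The trace shows exponential growth up to the threshold, linear growth afterwards, a halving after the triple duplicate ACK, and a restart from 1 MSS after the timeout.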


                                                                                                                                                                                            The Big Picture

                                                                                                                                                                                            3 Transport Layer 112Comp 361 Spring 2005

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
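The event table above can be sketched as a small state machine. This is only an illustration of the table's rules, not a real TCP implementation: the class name `RenoSender`, the value of `MSS`, and the initial `threshold` are assumptions chosen for the example, and the third duplicate ACK is taken to be the "triple duplicate ACK" loss event.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; illustrative value, not taken from the slides


@dataclass
class RenoSender:
    """Toy model of the sender-side event table (hypothetical helper class)."""
    cong_win: float = MSS          # CongWin, in bytes
    threshold: float = 64 * 1024   # Threshold, in bytes (assumed initial value)
    state: str = "SS"              # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS                 # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)  # about 1 MSS per RTT

    def on_duplicate_ack(self) -> None:
        """Count the duplicate; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self) -> None:
        """Timeout: halve Threshold and restart from 1 MSS in slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Feeding the object a sequence of `on_new_ack`, `on_duplicate_ack`, and `on_timeout` calls reproduces the sawtooth behaviour described in the summary.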


                                                                                                                                                                                            TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after loss, the window drops to W/2 and throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT.
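The 0.75 factor can be checked numerically. The sketch below assumes the idealised sawtooth: the window climbs by one segment per RTT from W/2 to W and then loss occurs. The function name and the parameter values in the usage line are illustrative, not from the slides.

```python
def sawtooth_average_throughput(w_segments: int, rtt_s: float, mss_bits: int) -> float:
    """Average rate in bits/s over one loss cycle of the idealised sawtooth."""
    if w_segments % 2 != 0:
        raise ValueError("use an even W so that W/2 is a whole number of segments")
    # One window value per RTT: W/2, W/2 + 1, ..., W
    windows = range(w_segments // 2, w_segments + 1)
    mean_window = sum(windows) / len(windows)   # equals 0.75 * W
    return mean_window * mss_bits / rtt_s


# Example with assumed values: W = 20 segments, RTT = 100 ms, 1500-byte segments
rate = sawtooth_average_throughput(20, 0.1, 1500 * 8)
```

Because the window values are evenly spaced between W/2 and W, their mean is exactly the midpoint, 0.75 W.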


                                                                                                                                                                                            TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed networks are needed.
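The two numbers in this example follow directly from the formulas. A short sketch, with function names chosen only for illustration: the window is rate times RTT divided by the segment size, and the loss rate comes from solving the throughput formula for L.

```python
def required_window_segments(rate_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """In-flight segments needed to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(rate_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


w = required_window_segments(10e9, 0.1, 1500)   # about 83,333 segments
loss = required_loss_rate(10e9, 0.1, 1500)      # about 2e-10
```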


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 passing through a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
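The convergence argument can be tried out with a toy simulation of two sessions. This is a sketch under simplifying assumptions that are not in the slides: both sessions see loss at the same moment (whenever their combined rate exceeds the capacity), both add one unit per round, and both halve on loss. The function name is hypothetical.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int) -> tuple[float, float]:
    """Run synchronised AIMD for two flows sharing one bottleneck."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves both rates,
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase keeps the gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: flow 1 at 1 unit, flow 2 at 80 units, R = 100
x1, x2 = aimd_two_flows(1.0, 80.0, 100.0, 1000)
```

Since additive increase preserves the difference between the two rates and each loss halves it, the pair drifts toward the equal-share line.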


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections.
A new app that asks for 1 TCP connection gets rate R/10.
A new app that asks for 11 TCP connections gets about R/2.
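The arithmetic behind this example, as a minimal sketch. It assumes every connection on the link gets an equal share of R, which is the fairness goal stated earlier; the function name is illustrative.

```python
from fractions import Fraction


def app_share(existing_connections: int, new_connections: int) -> Fraction:
    """Fraction of the link rate R obtained by an app opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


one = app_share(9, 1)      # 1 of 10 connections
eleven = app_share(9, 11)  # 11 of 20 connections, slightly more than half
```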


TCP Latency Modeling

Notation and assumptions:
- Assume one link between client and server, of rate R
- S: MSS (bits)
- O: object size (bits)
- No retransmissions (no loss, no corruption)

Window size:
- First assume a fixed congestion window of W segments
- Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data is sent.
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data is sent.
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data is sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
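Both fixed-window cases fit in one small function. This is a sketch: the function and parameter names are illustrative, and K is assumed to be the number of windows covering the object, computed here as the ceiling of O/(W S).

```python
import math


def fixed_window_latency(o_bits: int, s_bits: int, w_segments: int,
                         rate_bps: float, rtt_s: float) -> float:
    """Latency of one object with a fixed congestion window of W segments."""
    base = 2 * rtt_s + o_bits / rate_bps
    # Time the server stalls after each window: S/R + RTT - W*S/R
    stall = s_bits / rate_bps + rtt_s - w_segments * s_bits / rate_bps
    if stall <= 0:
        # Case 1: the first ACK arrives before the window is used up.
        return base
    # Case 2: the server stalls after every window except the last.
    k = math.ceil(o_bits / (w_segments * s_bits))  # assumed definition of K
    return base + (k - 1) * stall


# Assumed values: S/R = 1 s, RTT = 2 s, object of 16 segments
fast = fixed_window_latency(16000, 1000, 4, 1000, 2.0)  # case 1
slow = fixed_window_latency(16000, 1000, 2, 1000, 2.0)  # case 2
```

With these numbers the window of 4 segments keeps the link busy, while the window of 2 adds one second of stall after each of the first seven windows.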


                                                                                                                                                                                            TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size
- and K is the number of windows that cover the object


                                                                                                                                                                                            TCP Latency Modeling Slow Start (2)

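[Figure: slow-start timing diagram with "time at client" and "time at server" axes and an RTT interval marked. The client initiates the TCP connection and requests the object; the server sends a first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; transmission completes and the object is delivered.]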

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit the object
- time the server idles due to slow start

Server idles P = min{K-1, Q} times.


                                                                                                                                                                                            TCP Latency Modeling (3)

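$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the k-th window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the k-th window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$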

                                                                                                                                                                                            RTT

                                                                                                                                                                                            initiate TCPconnection

                                                                                                                                                                                            requestobject

                                                                                                                                                                                            first window= SR

                                                                                                                                                                                            second window= 2SR

                                                                                                                                                                                            third window= 4SR

                                                                                                                                                                                            fourth window= 8SR

                                                                                                                                                                                            completetransmissionobject

                                                                                                                                                                                            delivered

                                                                                                                                                                                            time atclient

                                                                                                                                                                                            time atserver
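The two per-window quantities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample values of S, R and RTT are hypothetical and do not come from the slides.

```python
def window_tx_time(k, seg_bits, rate_bps):
    """Time to transmit the k-th slow-start window: 2^(k-1) * S/R."""
    return (2 ** (k - 1)) * seg_bits / rate_bps


def idle_after_window(k, seg_bits, rate_bps, rtt_s):
    """Idle time after the k-th window: [S/R + RTT - 2^(k-1) * S/R]^+.

    The [x]^+ notation means max(x, 0): once a window takes longer to
    send than S/R + RTT, the server no longer stalls.
    """
    stall = seg_bits / rate_bps + rtt_s - window_tx_time(k, seg_bits, rate_bps)
    return max(0.0, stall)


if __name__ == "__main__":
    # Assumed sample values: S = 1000 bits, R = 10 kbps, RTT = 250 ms.
    S, R, RTT = 1000, 10_000, 0.25
    for k in range(1, 5):
        print(k, window_tx_time(k, S, R), idle_after_window(k, S, R, RTT))
```

With these assumed values the window transmission times double (S/R, 2S/R, 4S/R, 8S/R, as in the figure) and the idle time shrinks until it reaches zero.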


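TCP Latency Modeling (4)
Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2(O/S + 1) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.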
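As a quick check of the closed form for K, the sketch below compares it with a direct count of slow-start windows. The function names are hypothetical, and the example segment size (536 bytes) is an assumption; the slides do not fix S.

```python
import math


def windows_closed_form(obj_bits, seg_bits):
    """K = ceil(log2(O/S + 1)), the closed form derived above."""
    return math.ceil(math.log2(obj_bits / seg_bits + 1))


def windows_by_counting(obj_bits, seg_bits):
    """Smallest k with 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k, sent = 0, 0
    while sent < obj_bits:
        sent += (2 ** k) * seg_bits  # the next window is 2^k segments
        k += 1
    return k


if __name__ == "__main__":
    O = 5 * 1000 * 8   # 5 Kbytes, taken as 5000 bytes
    S = 536 * 8        # assumed segment size of 536 bytes
    print(windows_closed_form(O, S), windows_by_counting(O, S))
```

The counting version uses only integer arithmetic, so it is the safer choice when O/S + 1 lands very close to a power of two.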


HTTP Modeling
Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
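To compare the three formulas numerically, here is a minimal sketch that evaluates only the transmission and RTT terms. It deliberately leaves out the "sum of idle times" term, which depends on the slow-start behaviour of each scheme, so the results are lower bounds on the modeled response time. The function name and dictionary keys are hypothetical.

```python
def http_response_times(obj_bits, rate_bps, rtt_s, m_images, x_parallel):
    """Response times from the three formulas, excluding idle times."""
    if m_images % x_parallel != 0:
        raise ValueError("the model assumes M/X is an integer")
    # (M+1) objects of O bits each are sent in every scheme.
    transmit = (m_images + 1) * obj_bits / rate_bps
    return {
        "non_persistent": transmit + (m_images + 1) * 2 * rtt_s,
        "persistent": transmit + 3 * rtt_s,
        "parallel_non_persistent":
            transmit + (m_images // x_parallel + 1) * 2 * rtt_s,
    }


if __name__ == "__main__":
    # O = 5 Kbytes (taken as 40000 bits), RTT = 100 msec, M = 10, X = 5
    for rate in (28_000, 100_000, 1_000_000, 10_000_000):
        print(rate, http_response_times(40_000, rate, 0.1, 10, 5))
```

At low rates the shared (M+1)O/R term dominates all three results, while at high rates the RTT terms are what separate them.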


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
                                                                                                                                                                                            • TCP segment structure
                                                                                                                                                                                            • TCP Flow control how it works
                                                                                                                                                                                            • Technical Issue
                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                            • TCP Connection Management
                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                            • A few special cases
                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                            • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                                                                                                            • Approaches towards congestion control
                                                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                            • TCP Congestion Control
                                                                                                                                                                                            • TCP AIMD
                                                                                                                                                                                            • TCP Slow Start
                                                                                                                                                                                            • TCP Slow Start (more)
                                                                                                                                                                                            • Summary TCP Congestion Control
                                                                                                                                                                                            • The Big Picture
                                                                                                                                                                                            • TCP sender congestion control
                                                                                                                                                                                            • TCP throughput
                                                                                                                                                                                            • TCP Futures
                                                                                                                                                                                            • TCP Fairness
                                                                                                                                                                                            • Why is TCP fair
                                                                                                                                                                                            • Fairness (more)
                                                                                                                                                                                            • TCP Latency Modeling
                                                                                                                                                                                            • Fixed Congestion Window (W)
                                                                                                                                                                                            • Fixed congestion window (1)
                                                                                                                                                                                            • Fixed congestion window (2)
                                                                                                                                                                                            • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                            • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                            • TCP Latency Modeling (3)
                                                                                                                                                                                            • TCP Latency Modeling (4)
                                                                                                                                                                                            • HTTP Modeling
                                                                                                                                                                                            • Chapter 3 Summary

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

(a), (b) & (c) always: λin = λout (goodput)

(a) "magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'in > λout
(c) retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"Costs" of congestion:
• (b) and (c): more work (retransmissions) for given "goodput"
• (c): unneeded retransmissions; link carries multiple copies of a packet

[Figure: three panels (a), (b), (c)]

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λin and λ'in increase?

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at

Case study: ATM ABR congestion control

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next page

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate is thus the lowest rate supportable by any switch on the path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
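The ER and EFCI rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ATM specification: the `RMCell` class, the function names, and the rates (taken to be in Mbps) are hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical model of an RM cell; field names follow the slide."""
    er: float          # explicit rate (assumed unit: Mbps)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def forward_through_switches(cell, supportable_rates):
    """Each switch may lower ER to the rate it can support, never raise it."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


# Sender asks for 100 Mbps; three switches can support 80, 35 and 60 Mbps.
cell = forward_through_switches(RMCell(er=100.0), [80.0, 35.0, 60.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)  # 35.0 True
```

The returned ER is the minimum over the path, which is why the sender ends up at the rate of the most constrained switch.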

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, Congwin, over segments
• Congwin dynamically modified to reflect perceived congestion

[Figure: a window of size Congwin laid over a sequence of segments]

• w segments, each with MSS bytes, sent in one RTT:

  throughput = (w × MSS) / RTT bytes/sec
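As a quick check of the throughput formula, here is a small Python helper. The function name and the sample values (w = 8, MSS = 1000 bytes, RTT = 100 msec) are assumptions picked for the arithmetic, not numbers from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_ms):
    """Throughput in bytes/sec when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes * 1000 / rtt_ms


# 8 segments x 1000 bytes every 100 msec
print(window_throughput(8, 1000, 100))  # 80000.0 bytes/sec
```

Doubling w doubles the throughput, and doubling the RTT halves it, which is all the formula says.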

(To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.)

• Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

• How does sender perceive congestion?
  • loss event = timeout or 3 duplicate ACKs
  • TCP sender reduces rate (CongWin) after loss event

• three mechanisms:
  • AIMD = Additive Increase, Multiplicative Decrease
  • slow start = CongWin set to 1 and then grows exponentially
  • conservative after timeout events
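The sender-side limit LastByteSent - LastByteAcked ≤ CongWin can be read as a budget on unacknowledged bytes. A minimal sketch, assuming (as the slide does) that the receive buffer never constrains the sender; the function name and the byte counts are hypothetical.

```python
def bytes_sender_may_send(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes fit under CongWin, given what is still unacknowledged."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


# 2000 bytes in flight, window of 4000 bytes: 2000 more bytes may be sent
print(bytes_sender_may_send(5000, 3000, 4000))  # 2000
# 4000 bytes in flight, window of 4000 bytes: sender must wait for ACKs
print(bytes_sender_may_send(5000, 1000, 4000))  # 0
```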

TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing (also known as congestion avoidance)

[Figure: congestion window (axis ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
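The AIMD rule can be traced with a short simulation. This is a sketch under simplifying assumptions: the window is counted in whole MSS units, time advances one RTT per step, and the round in which a loss occurs is supplied by hand. The function name and the sample inputs are hypothetical.

```python
def aimd_trace(initial_cwnd, loss_rounds, rounds):
    """Return CongWin (in MSS) at the start of each RTT round."""
    cwnd = initial_cwnd
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            cwnd = max(1, cwnd // 2)   # multiplicative decrease: halve the window
        else:
            cwnd += 1                  # additive increase: +1 MSS per RTT
    return trace


# Start at 8 MSS, one loss event during round 4
print(aimd_trace(8, {4}, 8))  # [8, 9, 10, 11, 12, 6, 7, 8]
```

Repeating the loss at regular intervals produces the rise-and-halve shape of a long-lived connection.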

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
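The 20 kbps figure follows from one MSS per RTT: 500 bytes × 8 bits / 0.2 sec = 20,000 bits/sec. The sketch below reproduces that and counts how many loss-free doublings are needed to reach a target rate. The 1 Mbps target and the function names are assumptions for illustration.

```python
def initial_rate_bps(mss_bytes, rtt_ms):
    """Rate in bits/sec when exactly one MSS is sent per RTT."""
    return mss_bytes * 8 * 1000 / rtt_ms


def rtts_until_rate(target_bps, mss_bytes, rtt_ms):
    """Number of RTTs of doubling (no loss) before the rate reaches the target."""
    rate = initial_rate_bps(mss_bytes, rtt_ms)
    rtts = 0
    while rate < target_bps:
        rate *= 2
        rtts += 1
    return rtts


print(initial_rate_bps(500, 200))            # 20000.0 bits/sec = 20 kbps
print(rtts_until_rate(1_000_000, 500, 200))  # 6 RTTs to exceed 1 Mbps
```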

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments, one batch per RTT]
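Why does adding 1 MSS per ACK double the window every RTT? Each round sends CongWin segments and, with no loss, gets CongWin ACKs back. A minimal sketch, assuming every segment is acknowledged individually and nothing is lost; the function name is hypothetical.

```python
def slow_start_rounds(rounds, initial_cwnd=1):
    """Return the number of segments sent in each RTT round (window in MSS)."""
    cwnd = initial_cwnd
    sent_per_round = []
    for _round in range(rounds):
        sent_per_round.append(cwnd)
        acks_received = cwnd           # assumption: one ACK per segment, no loss
        for _ack in range(acks_received):
            cwnd += 1                  # increment CongWin by 1 MSS per ACK
    return sent_per_round


print(slow_start_rounds(4))  # [1, 2, 4, 8]
```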

So Far
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno)
Introduce new variable: threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1 and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                                              The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
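The event/action table above maps naturally onto a small state machine. The following is a rough sketch only, not a real TCP implementation: the class name `RenoSender`, the choice to measure windows in units of one MSS, the finite initial threshold, and resetting the duplicate-ACK counter after a loss event are all assumptions made for illustration.

```python
MSS = 1.0  # assumption: measure CongWin and Threshold in units of one MSS


class RenoSender:
    """Toy model of the TCP Reno sender actions listed in the table."""

    def __init__(self, threshold=64.0):
        self.congwin = MSS
        self.threshold = threshold  # slides say "initially very large"; 64 is arbitrary
        self.state = "SS"           # "SS" = Slow Start, "CA" = Congestion Avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.congwin += MSS * (MSS / self.congwin)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0  # assumption: restart counting after the loss event

    def on_timeout(self):
        """Timeout: collapse the window and re-enter slow start."""
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(threshold=4.0)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.congwin)  # CA 5.0
```

A TCP Tahoe variant would, per the summary slide, simply call the timeout action on the third duplicate ACK as well.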


                                                                                                                                                                                              TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
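The 0.75 factor follows from a one-line calculation, sketched here under the assumption (implied by AIMD but not stated on the slide) that between two loss events the window grows linearly from W/2 to W, so the average rate is the midpoint of the two extremes:

```latex
\[
\text{average throughput}
  = \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT}
  = 0.75\,\frac{W}{RTT}
\]
```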


                                                                                                                                                                                              TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

\[ \text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

This gives L = 2 \cdot 10^{-10}. Wow.
New versions of TCP for high-speed networks are needed.
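The two numbers in this example can be checked with a short calculation. The sketch below assumes the throughput relation 1.22 MSS / (RTT sqrt(L)) given on the slide and simply inverts it for L; the function and constant names are illustrative.

```python
MSS_BITS = 1500 * 8      # 1500-byte segments, in bits
RTT_S = 0.100            # 100 ms round-trip time
TARGET_BPS = 10e9        # desired throughput: 10 Gbps


def window_segments(rate_bps, rtt_s, mss_bits):
    """Segments that must be in flight to sustain rate_bps: W = rate * RTT / MSS."""
    return rate_bps * rtt_s / mss_bits


def required_loss_rate(rate_bps, rtt_s, mss_bits):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


W = window_segments(TARGET_BPS, RTT_S, MSS_BITS)
L = required_loss_rate(TARGET_BPS, RTT_S, MSS_BITS)
print(f"W = {W:.0f} segments, L = {L:.1e}")
```

The result agrees with the slide: a window of roughly 83,333 segments and a tolerable loss rate of about 2 in 10 billion segments.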


TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]


Why is TCP fair?
Two competing sessions:

Additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running from 0 to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal-share line.]
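The convergence argument can be illustrated with a toy simulation. This is a sketch under strong simplifying assumptions that are not in the slides: both flows have the same RTT, both increase by one unit per step, and both detect loss in the same step whenever their combined rate exceeds the capacity. The function name `aimd_two_flows` is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Simulate two AIMD flows sharing one bottleneck link.

    Each step: if the combined rate exceeds capacity, both flows halve
    (multiplicative decrease); otherwise both add 1 (additive increase).
    """
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2
        else:
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: flow 1 has 80, flow 2 has 10.
x1, x2 = aimd_two_flows(80.0, 10.0, capacity=100.0, steps=1000)
print(abs(x1 - x2) < 1e-3)  # True
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the gap shrinks toward zero and the rates approach the equal-share line.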


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
- do not want rate throttled by congestion control
Instead use UDP:
- pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
- new app asks for 1 TCP, gets rate R/10
- new app asks for 11 TCPs, gets about R/2
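The arithmetic behind the parallel-connection example is a simple proportion. The sketch below assumes every TCP connection on the link ends up with an equal share of R, which is the idealized fairness goal rather than a guaranteed outcome; the helper name `app_share` is illustrative.

```python
from fractions import Fraction


def app_share(existing, new_connections):
    """Fraction of link rate R obtained by an app that opens new_connections,
    assuming all connections on the link receive equal shares."""
    return Fraction(new_connections, existing + new_connections)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 connections out of 20 the new app gets 11/20 of R, which the slide rounds to R/2.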


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
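Both cases can be folded into one small function. This is a sketch, assuming K is the number of windows needed to cover the object, taken here as ceil(O / (W S)); the slides define K only later, so that formula is an assumption, as are the function name and the sample values.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    K = math.ceil(O / (W * S))          # assumption: windows that cover the object
    base = 2 * RTT + O / R              # connection setup + request + transmission
    stall = S / R + RTT - W * S / R     # idle time after each window (case 2)
    if stall <= 0:
        return base                     # case 1: ACKs return before window is exhausted
    return base + (K - 1) * stall       # case 2: server stalls K-1 times


# Illustrative values: 1000-bit segments, 1 Mbps link, 100 ms RTT, 16000-bit object.
small_window = fixed_window_latency(O=16000, S=1000, R=1e6, RTT=0.1, W=4)
large_window = fixed_window_latency(O=16000, S=1000, R=1e6, RTT=0.1, W=200)
print(small_window, large_window)
```

With W = 4 the server stalls after each of the first K-1 = 3 windows; with W = 200 the window never empties and only the base delay remains.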


Fixed congestion window (1)

First case: WS/R > RTT + S/R
ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R
server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q,\; K - 1\} \]

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. One RTT to initiate TCP connection, then request object; server sends first window (S/R), second window (2S/R), third window (4S/R), fourth window (8S/R); complete transmission, object delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
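The example can be reproduced numerically. The sketch below computes K and Q by direct counting from their verbal definitions (K: windows of 1, 2, 4, ... segments needed to cover the object; Q: windows after which the server would still be waiting for an ACK if the object were infinite) rather than from a closed form, and it assumes RTT > 0. The concrete values S = 1000 bits, R = 1000 bits/s and RTT = 2 s are assumptions chosen only so that Q = 2 as in the slide.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for an object of O bits sent under slow start.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds (> 0).
    """
    segments = math.ceil(O / S)

    # K: smallest k such that 1 + 2 + ... + 2^(k-1) = 2^k - 1 segments cover the object.
    K = 1
    while 2 ** K - 1 < segments:
        K += 1

    # Q: count windows k = 1, 2, ... whose transmission time 2^(k-1) * S/R is
    # shorter than S/R + RTT, i.e. the server idles after sending them.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0))  # (4, 2, 2, 22.0)
```

The 22 seconds break down as 2 RTT = 4 for setup and request, O/R = 15 for transmission, and idle times of 2 and 1 after the first and second windows.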


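TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
             &= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
             &= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - \left(2^{P} - 1\right)\frac{S}{R}
\end{aligned}
$$

[Figure: client/server timeline. Events in order: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. Axes: time at client, time at server.]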
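The closed form above can be checked numerically against the explicit sum of idle times. The following is a minimal Python sketch, not part of the original slides: the function names are hypothetical, P is assumed to be supplied by the caller, and all quantities are in bits, bits per second, and seconds.

```python
def idle_time(k, S, R, RTT):
    """Idle time after the k-th window (k starts at 1), clamped at zero."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def delay_by_sum(O, S, R, RTT, P):
    """delay = O/R + 2*RTT + sum of the first P idle times."""
    idle_total = sum(idle_time(k, S, R, RTT) for k in range(1, P + 1))
    return O / R + 2 * RTT + idle_total


def delay_closed_form(O, S, R, RTT, P):
    """delay = O/R + 2*RTT + P*(RTT + S/R) - (2^P - 1)*S/R."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    # Illustrative values only: S/R = 1 s, RTT = 2.5 s, O = 15 segments, P = 2.
    print(delay_by_sum(15000, 1000, 1000, 2.5, 2))
    print(delay_closed_form(15000, 1000, 1000, 2.5, 2))
```

The two functions agree only when every one of the first P idle times is positive, which is the condition under which the slide drops the clamp from the sum.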


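TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.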
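The derivation of K can be cross-checked by summing window sizes directly. The sketch below is an illustration under assumptions, not the slides' own code: the function names are hypothetical, O and S are positive sizes in the same unit, and the closed form relies on floating-point `math.log2`, so results very close to an integer boundary could round differently.

```python
import math


def windows_by_search(O, S):
    """Smallest k such that S*(2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    k = 0
    covered = 0
    window = S  # first window carries one segment
    while covered < O:
        covered += window
        window *= 2  # slow start doubles the window each round
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


if __name__ == "__main__":
    for segments in (1, 15, 16):
        O = segments * 1000
        print(segments, windows_by_search(O, 1000), windows_closed_form(O, 1000))
```

An object of 15 segments fits in windows of 1, 2, 4, and 8 segments, while 16 segments needs a fifth window.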


HTTP Modeling

Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive the base HTML file
  1 RTT to request and receive the M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for the base file
  M/X sets of parallel connections for the images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
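The three response-time formulas can be compared side by side. The following Python sketch is only an illustration: the function name and the `idle` parameter are hypothetical, and the sum of idle times is passed in by the caller (zero by default) rather than derived from slow start, so the same value is applied to all three schemes as a simplification.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times in seconds for the three HTTP variants.

    O: object size in bits, R: link rate in bits/s, RTT: seconds,
    M: number of images, X: number of parallel connections,
    idle: assumed sum of idle times, supplied by the caller.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }


if __name__ == "__main__":
    # Illustrative values: O = 40,000 bits, R = 1 Mbps, RTT = 100 ms, M = 10, X = 5.
    for name, seconds in http_response_times(40_000, 1_000_000, 0.1, 10, 5).items():
        print(f"{name}: {seconds:.2f} s")
```

With the idle term ignored, the only difference between the variants is the number of round trips spent on connection setup and requests.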


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give an important improvement, particularly in high delay-bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                                              • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • Transport services and protocols
                                                                                                                                                                                              • Transport vs network layer
                                                                                                                                                                                              • Transport-layer protocols
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • Multiplexingdemultiplexing
                                                                                                                                                                                              • Multiplexingdemultiplexing
                                                                                                                                                                                              • How demultiplexing works
                                                                                                                                                                                              • Connectionless demultiplexing
                                                                                                                                                                                              • Connectionless demux (cont)
                                                                                                                                                                                              • Connection-oriented demux
                                                                                                                                                                                              • Connection-oriented demux (cont)
                                                                                                                                                                                              • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                                              • UDP more
                                                                                                                                                                                              • UDP checksum
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • Principles of Reliable data transfer
                                                                                                                                                                                              • Reliable data transfer getting started
                                                                                                                                                                                              • Reliable data transfer getting started
                                                                                                                                                                                              • Incremental Improvements
                                                                                                                                                                                              • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                                              • Rdt20 channel with bit errors
                                                                                                                                                                                              • rdt20 FSM specification
                                                                                                                                                                                              • rdt20 operation with no errors
                                                                                                                                                                                              • rdt20 error scenario
                                                                                                                                                                                              • rdt20 has a fatal flaw
                                                                                                                                                                                              • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                              • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                              • rdt21 discussion
                                                                                                                                                                                              • rdt22 a NAK-free protocol
                                                                                                                                                                                              • rdt22 sender receiver fragments
                                                                                                                                                                                              • rdt30 channels with errors and loss
                                                                                                                                                                                              • rdt30 sender
                                                                                                                                                                                              • rdt30 in action
                                                                                                                                                                                              • rdt30 in action
                                                                                                                                                                                              • Performance of rdt30
                                                                                                                                                                                              • rdt30 stop-and-wait operation
                                                                                                                                                                                              • Pipelined protocols
                                                                                                                                                                                              • Pipelined protocols
                                                                                                                                                                                              • Pipelining increased utilization
                                                                                                                                                                                              • Go-Back-N
                                                                                                                                                                                              • GBN Sender
                                                                                                                                                                                              • GBN sender extended FSM
                                                                                                                                                                                              • GBN receiver extended FSM
                                                                                                                                                                                              • More on receiver
                                                                                                                                                                                               • GBN in action
                                                                                                                                                                                              • Selective Repeat
                                                                                                                                                                                               • Selective repeat: sender, receiver windows
                                                                                                                                                                                              • Selective repeat
                                                                                                                                                                                              • Selective repeat in action
                                                                                                                                                                                              • Selective repeat dilemma
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                               • TCP Overview: RFCs 793, 1122, 1323, 2018, 2581
                                                                                                                                                                                              • More TCP Details
                                                                                                                                                                                              • Even More TCP Details
                                                                                                                                                                                              • TCP segment structure
                                                                                                                                                                                               • TCP seq. #'s and ACKs
                                                                                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                                                                                              • Example RTT estimation
                                                                                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • TCP reliable data transfer
                                                                                                                                                                                              • TCP sender events
                                                                                                                                                                                              • TCP sender(simplified)
                                                                                                                                                                                              • TCP retransmission scenarios
                                                                                                                                                                                              • TCP retransmission scenarios (more)
                                                                                                                                                                                              • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                              • More on Sender Policies
                                                                                                                                                                                              • Fast Retransmit
                                                                                                                                                                                              • Fast retransmit algorithm
                                                                                                                                                                                              • TCP GBN or Selective Repeat
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • TCP Flow Control
                                                                                                                                                                                              • TCP Flow Control
                                                                                                                                                                                              • TCP segment structure
                                                                                                                                                                                              • TCP Flow control how it works
                                                                                                                                                                                              • Technical Issue
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • TCP Connection Management
                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                              • A few special cases
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • Principles of Congestion Control
                                                                                                                                                                                               • Causes/costs of congestion: scenario 1
                                                                                                                                                                                               • Causes/costs of congestion: scenario 2
                                                                                                                                                                                               • Causes/costs of congestion: scenario 3
                                                                                                                                                                                               • Causes/costs of congestion: scenario 3
                                                                                                                                                                                               • Approaches towards congestion control
                                                                                                                                                                                               • Case study: ATM ABR congestion control
                                                                                                                                                                                               • Case study: ATM ABR congestion control
                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                              • TCP Congestion Control
                                                                                                                                                                                              • TCP AIMD
                                                                                                                                                                                              • TCP Slow Start
                                                                                                                                                                                              • TCP Slow Start (more)
                                                                                                                                                                                              • Summary TCP Congestion Control
                                                                                                                                                                                              • The Big Picture
                                                                                                                                                                                              • TCP sender congestion control
                                                                                                                                                                                              • TCP throughput
                                                                                                                                                                                              • TCP Futures
                                                                                                                                                                                              • TCP Fairness
                                                                                                                                                                                              • Why is TCP fair
                                                                                                                                                                                              • Fairness (more)
                                                                                                                                                                                              • TCP Latency Modeling
                                                                                                                                                                                              • Fixed Congestion Window (W)
                                                                                                                                                                                              • Fixed congestion window (1)
                                                                                                                                                                                              • Fixed congestion window (2)
                                                                                                                                                                                              • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                              • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                              • TCP Latency Modeling (3)
                                                                                                                                                                                              • TCP Latency Modeling (4)
                                                                                                                                                                                              • HTTP Modeling
                                                                                                                                                                                              • Chapter 3 Summary


Here λ_in is the original data rate and λ'_in is the offered load (original data plus retransmissions).

(a), (b) & (c): always λ_in = λ_out (goodput)
(a) "Magic" transmission: only send when there's space in buffer
(b) "perfect" retransmission, only when loss: λ'_in > λ_out
(c) retransmission of delayed (not lost) packet makes λ'_in larger (than perfect case) for same λ_out

"costs" of congestion:
  (b) and (c): more work (retrans) for given "goodput"
  (c): unneeded retransmissions: link carries multiple copies of pkt

[Figure: three plots, panels (a), (b), (c)]


Causes/costs of congestion: scenario 3
  four senders
  multihop paths
  timeout/retransmit

Q: what happens as λ_in and λ'_in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
  when packet dropped, any "upstream" transmission capacity used for that packet was wasted!


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
  sent by sender, interspersed with data cells
  bits in RM cell set by switches ("network-assisted")
    NI bit: no increase in rate (mild congestion)
    CI bit: severe congestion indicator
  RM cells returned to sender by receiver, with bits intact
    small exception: see next page

ABR: available bit rate
  "elastic service"
  if sender's path "underloaded": sender should use available bandwidth
  if sender's path congested: sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate thus minimum supportable rate on path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
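The ER and EFCI rules above can be sketched in a few lines. This is only an illustrative model, not ATM code: the `RMCell` class, the function names, and the rate units are assumptions made for the example; only the ER, CI, NI, and EFCI semantics come from the slide.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical model of an RM cell; field names follow the slide."""
    er: float          # explicit rate (arbitrary units in this sketch)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


def switch_forward(cell: RMCell, supportable_rate: float) -> None:
    """A switch may lower the ER value in the cell, but never raises it."""
    cell.er = min(cell.er, supportable_rate)


def destination_return(cell: RMCell, preceding_data_efci: bool) -> RMCell:
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_efci:
        cell.ci = True
    return cell


# Three switches on the path, each with its own supportable rate (assumed values).
cell = RMCell(er=100.0)
for rate in (80.0, 50.0, 120.0):
    switch_forward(cell, rate)

returned = destination_return(cell, preceding_data_efci=True)
print(returned.er, returned.ci)  # prints: 50.0 True
```

The sender ends up with the minimum supportable rate on the path (50.0 here), which is the point of the ER field.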


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP Congestion Control
  end-end control (no network assistance)
  transmission rate limited by congestion window size, CongWin, over segments
  CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

  throughput = (w × MSS) / RTT bytes/sec
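As a quick check of the throughput formula, here is a minimal sketch. The function name and the sample numbers are assumptions chosen for illustration; they do not come from the slides.

```python
def window_throughput(w: int, mss_bytes: int, rtt_s: float) -> float:
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    if rtt_s <= 0:
        raise ValueError("RTT must be positive")
    return w * mss_bytes / rtt_s


# Assumed example: a window of 10 segments, MSS = 1000 bytes, RTT = 0.5 s.
rate = window_throughput(w=10, mss_bytes=1000, rtt_s=0.5)
print(rate)  # prints: 20000.0
```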


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

  LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
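The window constraint and the loss-event definition can be written as two small predicates. This is a sketch under stated assumptions: the function names are hypothetical, all quantities are in bytes, and the receive window is ignored, as the slide assumes.

```python
def bytes_in_flight(last_byte_sent: int, last_byte_acked: int) -> int:
    """Data sent but not yet acknowledged."""
    return last_byte_sent - last_byte_acked


def may_send(last_byte_sent: int, last_byte_acked: int,
             cong_win: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps LastByteSent - LastByteAcked <= CongWin."""
    return bytes_in_flight(last_byte_sent, last_byte_acked) + nbytes <= cong_win


def is_loss_event(timed_out: bool, dup_acks: int) -> bool:
    """Loss event = timeout or 3 duplicate ACKs."""
    return timed_out or dup_acks >= 3


# 2000 bytes are in flight; a further 1000 fits in a 4000-byte window.
print(may_send(5000, 3000, cong_win=4000, nbytes=1000))  # prints: True
print(is_loss_event(timed_out=False, dup_acks=3))         # prints: True
```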


TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (axis ticks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
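The sawtooth can be reproduced with a minimal per-RTT sketch. The names `aimd_trace` and `capacity_mss` are hypothetical, and the loss model (a loss whenever the window reaches a fixed capacity) is an assumption made only for illustration; windows are counted in MSS.

```python
def aimd_trace(num_rtts, capacity_mss, start_mss=1):
    """Return CongWin (in MSS) at the start of each RTT under idealized AIMD."""
    cong_win = start_mss
    trace = []
    for _ in range(num_rtts):
        trace.append(cong_win)
        if cong_win >= capacity_mss:
            # Assumed loss signal: window hit the (hypothetical) capacity.
            cong_win = max(1, cong_win // 2)  # multiplicative decrease
        else:
            cong_win += 1                     # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(num_rtts=12, capacity_mss=8, start_mss=4))
# prints [4, 5, 6, 7, 8, 4, 5, 6, 7, 8, 4, 5]
```

The window climbs by 1 MSS per RTT and is halved at each loss, which gives the sawtooth shape.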


                                                                                                                                                                                                TCP Slow Start

When connection begins, CongWin = 1 MSS

Example: MSS = 500 bytes and RTT = 200 msec, so initial rate = MSS/RTT = 20 kbps

Available bandwidth may be >> MSS/RTT

Desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event
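The 20 kbps figure is just one MSS per RTT. A small check of the arithmetic (the function name `initial_rate_bps` is a hypothetical helper, not part of the slides):

```python
def initial_rate_bps(mss_bytes, rtt_ms):
    """Rate in bits per second when exactly one MSS is sent per RTT."""
    return mss_bytes * 8 * 1000 / rtt_ms


print(initial_rate_bps(500, 200) / 1000, "kbps")  # prints 20.0 kbps
```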


                                                                                                                                                                                                TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timeline with one RTT marked; A sends one segment, then two segments, then four segments]
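Why a per-ACK increment doubles the window every RTT can be seen in a short sketch. It assumes every segment in a window is ACKed within one RTT and that there is no loss; `slow_start_windows` is a hypothetical name and windows are counted in MSS.

```python
def slow_start_windows(num_rtts, mss=1):
    """Return CongWin at the start of each RTT during loss-free slow start."""
    cong_win = mss
    windows = []
    for _ in range(num_rtts):
        windows.append(cong_win)
        acks_this_rtt = cong_win // mss  # one ACK per segment sent (assumption)
        for _ in range(acks_this_rtt):
            cong_win += mss              # CongWin grows by 1 MSS per ACK
    return windows


print(slow_start_windows(4))  # prints [1, 2, 4, 8]
```

A window of n segments produces n ACKs, each adding 1 MSS, so the next window is 2n.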


So Far
Slow-Start ramps up exponentially
Followed by AIMD sawtooth pattern

Reality (TCP Reno)
Introduce new variable: threshold
threshold initially very large
Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start
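The two Reno loss reactions can be written as two small handlers. This is a sketch under simplifying assumptions: windows are in MSS, and `RenoState`, `on_triple_dup_ack` and `on_timeout` are hypothetical names, not a real TCP stack API.

```python
from dataclasses import dataclass


@dataclass
class RenoState:
    cong_win: float   # congestion window, in MSS
    threshold: float  # slow-start threshold, in MSS
    phase: str        # "slow_start" or "congestion_avoidance"


def on_triple_dup_ack(state):
    """3 dup ACKs: halve CongWin, set threshold = CongWin, stay in AIMD."""
    state.cong_win = state.cong_win / 2
    state.threshold = state.cong_win
    state.phase = "congestion_avoidance"


def on_timeout(state):
    """Timeout: threshold = CongWin/2, CongWin = 1 MSS, back to slow start."""
    state.threshold = state.cong_win / 2
    state.cong_win = 1.0
    state.phase = "slow_start"


s = RenoState(cong_win=12.0, threshold=float("inf"), phase="congestion_avoidance")
on_triple_dup_ack(s)
print(s)
```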


Reason for treating 3 dup ACKs differently from a timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming"

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno)

When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well)
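The four rules can be combined into a per-RTT trace that contrasts Tahoe and Reno. This is a rough sketch, not either protocol's real implementation: windows are in MSS, `window_trace` and its parameters are hypothetical, loss events are supplied by the caller, and doubling is capped at Threshold (a modeling assumption so the switch to linear growth happens exactly at Threshold).

```python
def window_trace(variant, loss_events, num_rtts, initial_threshold=8.0):
    """Return CongWin at the start of each RTT.

    variant: "tahoe" or "reno".
    loss_events: dict mapping RTT index -> "3dup" or "timeout".
    """
    cong_win, threshold = 1.0, initial_threshold
    trace = []
    for rtt in range(num_rtts):
        trace.append(cong_win)
        event = loss_events.get(rtt)
        if event == "timeout" or (event == "3dup" and variant == "tahoe"):
            threshold = cong_win / 2
            cong_win = 1.0                    # restart from slow start
        elif event == "3dup":                 # Reno only
            threshold = cong_win / 2
            cong_win = threshold              # skip slow start
        elif cong_win < threshold:
            cong_win = min(2 * cong_win, threshold)  # slow start (capped)
        else:
            cong_win += 1.0                   # congestion avoidance
    return trace


print(window_trace("reno", {7: "3dup"}, 12))
print(window_trace("tahoe", {7: "3dup"}, 12))
```

With the same triple-duplicate-ACK loss at a window of 12, the Reno trace resumes linear growth from 6, while the Tahoe trace restarts from 1 and grows exponentially up to the new Threshold of 6.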


                                                                                                                                                                                                The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
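The two ACK rows can be checked numerically: one window's worth of ACKs doubles CongWin in slow start and adds roughly 1 MSS in congestion avoidance. The sketch below assumes a hypothetical MSS of 1000 bytes and one ACK per segment; `on_new_ack` and `one_rtt_of_acks` are illustrative names.

```python
MSS = 1000  # bytes; hypothetical value chosen for the example


def on_new_ack(cong_win, threshold, state):
    """Apply the per-ACK update for state "SS" or "CA"; return (cong_win, state)."""
    if state == "SS":
        cong_win += MSS
        if cong_win > threshold:
            state = "CA"
    else:
        cong_win += MSS * (MSS / cong_win)  # about 1 MSS per window of ACKs
    return cong_win, state


def one_rtt_of_acks(cong_win, threshold, state):
    """Feed one ACK per segment of the current window (assumption)."""
    for _ in range(int(cong_win // MSS)):
        cong_win, state = on_new_ack(cong_win, threshold, state)
    return cong_win, state


print(one_rtt_of_acks(4 * MSS, float("inf"), "SS"))  # window doubles to 8 MSS
print(one_rtt_of_acks(10 * MSS, 5 * MSS, "CA"))      # window grows by just under 1 MSS
```

The congestion-avoidance increment falls slightly short of a full MSS per RTT because CongWin in the denominator grows with each ACK.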


                                                                                                                                                                                                TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/RTT
Just after loss, window drops to W/2, throughput to W/(2 RTT)
Average throughput: 0.75 W/RTT
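The 0.75 factor follows if the window is assumed to climb linearly from W/2 back to W between losses (additive increase) with a constant RTT, so that the time average is the midpoint of the two extremes:

```latex
\begin{align*}
\text{throughput just after a loss} &= \frac{W}{2\,RTT} \\
\text{throughput just before the next loss} &= \frac{W}{RTT} \\
\text{average throughput} &= \frac{1}{2}\left(\frac{W}{2\,RTT} + \frac{W}{RTT}\right)
  = \frac{3}{4}\cdot\frac{W}{RTT} = 0.75\,\frac{W}{RTT}
\end{align*}
```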


                                                                                                                                                                                                TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

Requires L = 2·10^-10. Wow!
New versions of TCP for high-speed needed
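Both numbers on this slide can be recomputed. The sketch assumes the throughput relation 1.22·MSS/(RTT·sqrt(L)) with MSS expressed in bits, and solves it for L; the function names are hypothetical.

```python
def required_window_segments(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain throughput_bps: W = rate * RTT / MSS."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


w = required_window_segments(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(int(w))          # prints 83333
print(f"{loss:.1e}")   # about 2e-10
```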


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1, as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
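The convergence argument can be checked with a toy simulation of two AIMD flows. It rests on idealized assumptions: both flows have the same RTT, both see every loss at the same time, and a loss occurs whenever their combined rate exceeds the capacity. `two_flow_aimd` is a hypothetical name.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve two flow rates under synchronized AIMD; return the final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both add 1 unit, which keeps the gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


x1, x2 = two_flow_aimd(1.0, 41.0, capacity=50.0, rounds=500)
print(abs(x1 - x2))  # gap shrinks toward 0
```

Additive increase moves the pair parallel to the equal-share line, and each halving cuts the distance to that line in half, so the flows drift toward equal shares.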


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
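The parallel-connection example is a per-connection equal split. A small check, assuming every connection gets an equal share of R (`app_share` is a hypothetical helper):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app that opens new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # prints 1/10
print(app_share(9, 11))  # prints 11/20
```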


TCP Latency Modeling

Notation, assumptions:
    Assume one link between client and server of rate R
    S: MSS (bits)
    O: object size (bits)
    no retransmissions (no loss, no corruption)

Window size:
    First assume: fixed congestion window, W segments
    Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
    TCP connection establishment
    data transmission delay
    slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                                                                                                                Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                                                                                                                                                Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
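Both fixed-window cases fit in one function. The sketch takes K to be the number of windows needed to cover the object, ceil(O / (W·S)); that reading is an assumption, since these slides do not spell K out for the fixed-window case. The function and parameter names are hypothetical; O and S are in bits, R in bits per second, RTT in seconds.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency of fetching an O-bit object with a fixed window of W segments."""
    K = math.ceil(O / (W * S))       # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R  # idle time per window while waiting for an ACK
    latency = 2 * RTT + O / R        # connection setup + request + transmission
    if stall > 0:                    # second case: server idles after each window
        latency += (K - 1) * stall
    return latency


# S = 1000 bits, R = 1 Mbps, RTT = 100 ms, O = 10 segments
print(fixed_window_latency(10_000, 1_000, 1_000_000, 0.1, W=200))  # first case
print(fixed_window_latency(10_000, 1_000, 1_000_000, 0.1, W=2))    # second case
```

With W = 200 the ACK returns before the window is exhausted, so only 2RTT + O/R remains; with W = 2 the server stalls after each of the first K-1 = 4 windows.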


                                                                                                                                                                                                TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size

- and K is the number of windows that cover the object
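The closed form can be cross-checked against a direct sum of the per-window stall times. The sketch rests on assumptions not stated on this slide: the k-th window holds 2^(k-1) segments, K is the smallest k with (2^k - 1)·S >= O, and the server stalls after window k for S/R + RTT - 2^(k-1)·S/R whenever that is non-negative. All function names are hypothetical.

```python
def windows_to_cover(O, S):
    """K: smallest k such that windows 1..k (sizes 1, 2, 4, ...) hold O bits."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def idle_count_infinite_object(S, R, RTT):
    """Q: number of windows after which the server would stall, for an endless object."""
    q = 0
    while S / R + RTT - (2 ** q) * S / R >= 0:
        q += 1
    return q


def slow_start_latency(O, S, R, RTT):
    """Closed form: 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R."""
    P = min(idle_count_infinite_object(S, R, RTT), windows_to_cover(O, S) - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


def slow_start_latency_summed(O, S, R, RTT):
    """Same quantity, adding up each stall explicitly."""
    K = windows_to_cover(O, S)
    stalls = (S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, K))
    return 2 * RTT + O / R + sum(max(0.0, stall) for stall in stalls)


# S = 1000 bits, R = 100 kbps, RTT = 100 ms, O = 15 segments: K = 4, Q = 4, P = 3
print(slow_start_latency(15_000, 1_000, 100_000, 0.1))
print(slow_start_latency_summed(15_000, 1_000, 100_000, 0.1))
```

Both functions give about 0.61 s for this example: 0.35 s for 2RTT + O/R plus stalls of 0.10, 0.09 and 0.07 s after the first three windows.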


                                                                                                                                                                                                TCP Latency Modeling Slow Start (2)

[Figure: client-server timing diagram for slow start. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
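
The example above can be reproduced with a short sketch. The slide gives only O/S = 15, K = 4, and Q = 2; the concrete values of S, R, and RTT below are assumptions chosen so that S/R = 1 s and Q comes out as 2. The function names are illustrative, and Q is counted directly from the idle-time condition S/R + RTT - 2^(k-1) S/R > 0 rather than from a closed form.

```python
def num_windows(O, S):
    """K: smallest k such that S * (2^0 + ... + 2^(k-1)) >= O."""
    k, sent = 0, 0
    while sent < O:
        sent += S * 2 ** k   # the next window doubles in size
        k += 1
    return k


def num_idles_infinite(S, R, RTT):
    """Q: how many windows would be followed by an idle period if the
    object were infinitely large (idle time after window k is
    S/R + RTT - 2^(k-1) * S/R, counted while it is positive)."""
    q = 0
    while S / R + RTT - 2 ** q * S / R > 0:
        q += 1
    return q


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start latency model."""
    K = num_windows(O, S)
    Q = num_idles_infinite(S, R, RTT)
    P = min(K - 1, Q)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# Assumed values (not from the slides): S = 1000 bits, R = 1000 bit/s,
# RTT = 2 s, and O = 15 segments = 15000 bits.
K, Q, P, latency = slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0)
print(K, Q, P, latency)
```

Under these assumed values the sketch yields K = 4, Q = 2, and P = 2, matching the example, and a latency of 22 seconds: 4 s for the two RTTs, 15 s of transmission, and 3 s of idle time.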

                                                                                                                                                                                                TCP Latency Modeling (3)

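\[
\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}
\]

\[
2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}
\]

\[
\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}
\]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]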

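[Figure: slow-start timing diagram, repeated from the previous slide. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]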
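
As a sanity check on the derivation on this slide, the sketch below compares the delay obtained by summing the per-window idle times with the closed-form expression. The function names and the numeric values are hypothetical, chosen only for illustration; P is counted directly as the number of windows followed by a positive idle time.

```python
def idle_time(k, S, R, RTT):
    """Idle time after the k-th window: [S/R + RTT - 2^(k-1) * S/R]^+."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def delay_by_summation(O, S, R, RTT, K):
    """2 RTT + O/R + idle times after windows 1..K-1.
    The server does not idle after the last (K-th) window."""
    return 2 * RTT + O / R + sum(idle_time(k, S, R, RTT) for k in range(1, K))


def delay_closed_form(O, S, R, RTT, K):
    """Same delay written as 2 RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R."""
    P = sum(1 for k in range(1, K) if idle_time(k, S, R, RTT) > 0)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical values: S/R = 1 s, O = 15 segments, K = 4 windows.
S, R, O, K = 1000, 1000, 15000, 4
for RTT in (0.5, 2.0, 10.0):
    by_sum = delay_by_summation(O, S, R, RTT, K)
    closed = delay_closed_form(O, S, R, RTT, K)
    print(RTT, by_sum, closed)
```

The two columns agree because the idle times shrink as k grows, so the positive ones are exactly those for k = 1..P, and the geometric sum of 2^(k-1) over that range is 2^P - 1.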

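TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]

Calculation of Q, the number of idles for an infinite-size object, is similar.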
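
A small sketch can confirm that the closed form for K agrees with its definition as a minimum. The function names are illustrative, and the object size is expressed in whole segments (S = 1) purely as a simplifying assumption.

```python
import math


def k_by_definition(O, S):
    """Smallest k with S * (2^0 + ... + 2^(k-1)) = S * (2^k - 1) >= O."""
    k = 1
    while S * (2 ** k - 1) < O:
        k += 1
    return k


def k_closed_form(O, S):
    """K = ceil(log2(O/S + 1)).
    Uses floating point, so extremely large O/S could round incorrectly."""
    return math.ceil(math.log2(O / S + 1))


# Compare both forms for objects of 1..200 segments (S = 1 is an assumption).
for segments in range(1, 201):
    assert k_by_definition(segments, 1) == k_closed_form(segments, 1)

print(k_by_definition(15, 1), k_closed_form(15, 1))
```

For the 15-segment object used earlier, both forms give K = 4, since the first four windows carry 1 + 2 + 4 + 8 = 15 segments.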

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
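
The three response-time formulas can be compared side by side with a minimal sketch. The slides do not say how the "sum of idle times" is computed for each mode, so it is passed in as a parameter and set to zero here; that is an assumption, and the results therefore understate the modeled response times. The function names, the 1 Mbps link rate, and the reading of 5 Kbytes as 40,000 bits are likewise illustrative choices.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M+1) 2 RTT + sum of idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """(M+1) O/R + 3 RTT + sum of idle times."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """(M+1) O/R + (M/X + 1) 2 RTT + sum of idle times; requires M/X integer."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


# Assumed inputs: O = 5 Kbytes taken as 40,000 bits, R = 1 Mbps,
# RTT = 100 msec, M = 10 images, X = 5 parallel connections, idle = 0.
O, R, RTT, M, X = 40_000, 1_000_000, 0.1, 10, 5
print("non-persistent:         ", non_persistent(M, O, R, RTT))
print("persistent:             ", persistent(M, O, R, RTT))
print("parallel non-persistent:", parallel_non_persistent(M, X, O, R, RTT))
```

With idle times ignored, the transmission term (M+1)O/R is identical in all three cases, so the modes differ only in how many round trips they pay for: 2(M+1), 3, and 2(M/X + 1) respectively.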

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.

Persistent connections now give important improvement, particularly in high delay-bandwidth networks.

Chapter 3 Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3 Transport Layer last revised 160305
• Chapter 3 outline
• Transport services and protocols
• Transport vs network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
                                                                                                                                                                                                • rdt20 error scenario
                                                                                                                                                                                                • rdt20 has a fatal flaw
                                                                                                                                                                                                • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                                • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                                • rdt21 discussion
                                                                                                                                                                                                • rdt22 a NAK-free protocol
                                                                                                                                                                                                • rdt22 sender receiver fragments
                                                                                                                                                                                                • rdt30 channels with errors and loss
                                                                                                                                                                                                • rdt30 sender
                                                                                                                                                                                                • rdt30 in action
                                                                                                                                                                                                • rdt30 in action
                                                                                                                                                                                                • Performance of rdt30
                                                                                                                                                                                                • rdt30 stop-and-wait operation
                                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                                • Pipelining increased utilization
                                                                                                                                                                                                • Go-Back-N
                                                                                                                                                                                                • GBN Sender
                                                                                                                                                                                                • GBN sender extended FSM
                                                                                                                                                                                                • GBN receiver extended FSM
                                                                                                                                                                                                • More on receiver
                                                                                                                                                                                                • GBN inaction
                                                                                                                                                                                                • Selective Repeat
                                                                                                                                                                                                • Selective repeat sender receiver windows
                                                                                                                                                                                                • Selective repeat
                                                                                                                                                                                                • Selective repeat in action
                                                                                                                                                                                                • Selective repeat dilemma
                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                                • More TCP Details
                                                                                                                                                                                                • Even More TCP Details
                                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                                • TCP seq rsquos and ACKs
                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                • Example RTT estimation
                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                • TCP reliable data transfer
                                                                                                                                                                                                • TCP sender events
                                                                                                                                                                                                • TCP sender(simplified)
                                                                                                                                                                                                • TCP retransmission scenarios
                                                                                                                                                                                                • TCP retransmission scenarios (more)
                                                                                                                                                                                                • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                • More on Sender Policies
                                                                                                                                                                                                • Fast Retransmit
                                                                                                                                                                                                • Fast retransmit algorithm
                                                                                                                                                                                                • TCP GBN or Selective Repeat
                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                                • TCP Flow control how it works
                                                                                                                                                                                                • Technical Issue
                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                • TCP Connection Management
                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                • A few special cases
                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                • Principles of Congestion Control
                                                                                                                                                                                                • Causescosts of congestion scenario 1
                                                                                                                                                                                                • Causescosts of congestion scenario 2
                                                                                                                                                                                                • Causescosts of congestion scenario 3
                                                                                                                                                                                                • Causescosts of congestion scenario 3
                                                                                                                                                                                                • Approaches towards congestion control
                                                                                                                                                                                                • Case study ATM ABR congestion control
                                                                                                                                                                                                • Case study ATM ABR congestion control
                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                • TCP Congestion Control
                                                                                                                                                                                                • TCP AIMD
                                                                                                                                                                                                • TCP Slow Start
                                                                                                                                                                                                • TCP Slow Start (more)
                                                                                                                                                                                                • Summary TCP Congestion Control
                                                                                                                                                                                                • The Big Picture
                                                                                                                                                                                                • TCP sender congestion control
                                                                                                                                                                                                • TCP throughput
                                                                                                                                                                                                • TCP Futures
                                                                                                                                                                                                • TCP Fairness
                                                                                                                                                                                                • Why is TCP fair
                                                                                                                                                                                                • Fairness (more)
                                                                                                                                                                                                • TCP Latency Modeling
                                                                                                                                                                                                • Fixed Congestion Window (W)
                                                                                                                                                                                                • Fixed congestion window (1)
                                                                                                                                                                                                • Fixed congestion window (2)
                                                                                                                                                                                                • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                • TCP Latency Modeling (3)
                                                                                                                                                                                                • TCP Latency Modeling (4)
                                                                                                                                                                                                • HTTP Modeling
                                                                                                                                                                                                • Chapter 3 Summary


Causes/costs of congestion: scenario 3
• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λin and λ'in increase?


Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  • single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  • explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact (small exception: see next page)


Case study: ATM ABR congestion control

Two-byte ER (explicit rate) field in RM cell:
• congested switch may lower ER value in cell
• sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells:
• set to 1 by a congested switch to signal congestion
• if the data cell preceding an RM cell has EFCI = 1, the destination sets CI bit = 1 before returning the RM cell to the source
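The ER and EFCI/CI feedback described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not an implementation of the ATM specification: the names `RMCell`, `Switch`, `forward_rm_cell` and `return_rm_cell` are hypothetical, rates are plain numbers in arbitrary units, and every switch is modeled as lowering ER to its own supportable rate when that rate is smaller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RMCell:
    """Hypothetical model of a resource-management cell."""
    er: float          # explicit rate field (arbitrary rate units)
    ci: bool = False   # congestion indication bit
    ni: bool = False   # no-increase bit


@dataclass
class Switch:
    """Hypothetical switch that knows the rate it can currently support."""
    supportable_rate: float


def forward_rm_cell(cell: RMCell, path: List[Switch]) -> RMCell:
    """Carry the RM cell along the path; each switch may lower ER.

    Assumption: a switch lowers ER to its supportable rate whenever that
    rate is below the current ER, so ER ends up as the path minimum.
    """
    for switch in path:
        cell.er = min(cell.er, switch.supportable_rate)
    return cell


def return_rm_cell(cell: RMCell, preceding_data_cell_efci: bool) -> RMCell:
    """Destination side: set CI if the preceding data cell had EFCI = 1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


if __name__ == "__main__":
    path = [Switch(80.0), Switch(40.0), Switch(60.0)]
    cell = forward_rm_cell(RMCell(er=100.0), path)
    cell = return_rm_cell(cell, preceding_data_cell_efci=True)
    print(cell.er, cell.ci)  # 40.0 True
```

The sender would then read ER and the CI/NI bits from the returned cell to pick its next rate; the exact rate-adjustment rule is not given in the slides and is left out here.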


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control


                                                                                                                                                                                                  TCP Congestion Controlend-end control (no network assistance)transmission rate limited by congestion window size Congwin over segments Congwin dynamically modified to reflect perceived congestion

[Figure: congestion window of size Congwin]
w segments, each with MSS bytes, sent in one RTT:
$\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
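As a small illustration of the throughput relation above, the following Python sketch evaluates w * MSS / RTT. The function name and the sample numbers (10 segments, 1460-byte MSS, 100 ms RTT) are hypothetical and chosen only for the example.

```python
def window_throughput_bytes_per_sec(w_segments, mss_bytes, rtt_sec):
    """Approximate throughput when w segments of MSS bytes are sent per RTT."""
    return w_segments * mss_bytes / rtt_sec


# Hypothetical numbers: 10 segments of 1460 bytes each, RTT of 100 ms.
rate = window_throughput_bytes_per_sec(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec")
```

Doubling the window doubles the estimate, while doubling the RTT halves it, which is why the window is the knob the sender adjusts.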


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are “similar” to flow control: sender limits transmission using
$\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
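The sender-side limit stated above (LastByteSent - LastByteAcked must not exceed CongWin) can be sketched as a one-line check. This is a minimal illustration; the function name is hypothetical, and the receive window is ignored because the slides assume RcvBuffer never overflows.

```python
def bytes_sender_may_send(last_byte_sent, last_byte_acked, cong_win):
    """Return how many more bytes may be sent without violating
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked  # unacknowledged bytes
    return max(0, cong_win - in_flight)


# 2000 bytes are in flight and CongWin is 4000 bytes, so 2000 more may go out.
print(bytes_sender_may_send(last_byte_sent=3000, last_byte_acked=1000, cong_win=4000))
```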


TCP AIMD
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance
multiplicative decrease: cut CongWin in half after a loss event

[Figure: congestion window (axis ticks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
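The AIMD behaviour described above can be reproduced with a short simulation. This is only a sketch under a simplifying assumption that is not in the slides: a loss event is assumed to occur exactly when the window reaches a fixed size (`loss_at_mss`). The window is measured in units of MSS and all names are hypothetical.

```python
def aimd_trace(start_mss, loss_at_mss, rtts):
    """Return CongWin (in MSS) at the start of each RTT for an idealized AIMD sender.

    Assumption: a loss event occurs whenever the window reaches loss_at_mss.
    """
    cong_win = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            cong_win = max(1, cong_win // 2)  # multiplicative decrease: halve
        else:
            cong_win += 1                     # additive increase: +1 MSS per RTT
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rtts=12))
```

The returned list climbs by one each RTT and drops to half at each assumed loss.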


                                                                                                                                                                                                  TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = MSS/RTT = 20 kbps
Available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to a respectable rate
When connection begins, increase rate exponentially fast until first loss event
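The 20 kbps figure in the example follows from sending one MSS per RTT. A quick check in Python (the function name is hypothetical):

```python
def initial_rate_bps(mss_bytes, rtt_sec):
    """Rate in bits/sec when exactly one MSS is sent per RTT (CongWin = 1 MSS)."""
    return mss_bytes * 8 / rtt_sec


# MSS = 500 bytes, RTT = 200 msec, as in the slide's example.
print(initial_rate_bps(500, 0.2) / 1000, "kbps")
```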


                                                                                                                                                                                                  TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin for every ACK received
Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A to Host B timeline over successive RTTs: one segment, then two segments, then four segments]
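To see why incrementing CongWin once per ACK doubles it every RTT, the sketch below counts ACKs per round. It assumes, for illustration only, that every segment sent in an RTT is acknowledged individually within that RTT and that no loss occurs; the window is in units of MSS and the function name is hypothetical.

```python
def slow_start_windows(rtts, initial_mss=1):
    """CongWin (in MSS) at the start of each RTT during loss-free slow start."""
    cong_win = initial_mss
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        acks = cong_win           # assumed: one ACK per segment sent this RTT
        for _ in range(acks):
            cong_win += 1         # +1 MSS for every ACK received
    return windows


print(slow_start_windows(4))
```

Each ACK adds one MSS, and a window of n segments produces n ACKs, so the window grows from n to 2n in one RTT.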


So Far
  Slow-Start ramps up exponentially
  Followed by AIMD sawtooth pattern

Reality (TCP Reno)
  Introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
  Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is “more alarming”.

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno’s skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
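The difference between Reno and Tahoe described above comes down to how each reacts to a triple-duplicate-ACK loss event. The sketch below is a simplified illustration, not either protocol's actual implementation: windows are in whole MSS units, the halved value is floored and clamped to at least 1 MSS, and the function and argument names are hypothetical.

```python
def on_loss(cong_win, event, variant="reno"):
    """Return (new_cong_win, new_threshold), both in MSS units.

    event:   "3dupack" or "timeout"
    variant: "reno" or "tahoe"
    """
    if event not in ("3dupack", "timeout"):
        raise ValueError(f"unknown loss event: {event}")
    if variant not in ("reno", "tahoe"):
        raise ValueError(f"unknown variant: {variant}")
    threshold = max(cong_win // 2, 1)      # threshold = CongWin / 2
    if event == "3dupack" and variant == "reno":
        return threshold, threshold        # Reno: halve and stay in AIMD
    return 1, threshold                    # otherwise: back to slow start


print(on_loss(16, "3dupack", "reno"))
print(on_loss(16, "timeout", "reno"))
print(on_loss(16, "3dupack", "tahoe"))
```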


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                                                  The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to “Congestion Avoidance”
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to “Congestion Avoidance”
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to “Slow Start”
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
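The event/state/action table above maps naturally onto a small state machine. The following Python sketch is one possible rendering under stated assumptions: windows are floats in units of one MSS, the "very large" initial threshold is replaced by an arbitrary stand-in of 64 MSS, and the class and method names are hypothetical. It models only the window arithmetic, not retransmission or timers.

```python
from dataclasses import dataclass

MSS = 1.0  # work in units of one MSS (assumption, for readability)


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64 * MSS  # stand-in for "initially very large"
    state: str = "SS"            # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, MSS)  # never below 1 MSS
            self.cong_win = self.threshold
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        """Timeout: halve the threshold and restart from slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```

Keeping each event as its own method mirrors the rows of the table, so each action can be checked against its row.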


                                                                                                                                                                                                  TCP throughput

What’s the average throughput of TCP as a function of window size and RTT?
  Ignore slow start
  Let W be the window size when loss occurs
  When window is W, throughput is W/RTT
  Just after loss, window drops to W/2, throughput to W/(2·RTT)
  Average throughput: 0.75 W/RTT (the window grows linearly from W/2 to W, so the average is the midpoint)
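The 0.75 W/RTT figure can be checked numerically by averaging the window over one sawtooth cycle. The sketch assumes an idealized cycle in which the window rises linearly from W/2 to W; the function name and sampling resolution are arbitrary choices for the example.

```python
def average_window(w_max, steps=1000):
    """Mean window over one idealized sawtooth cycle rising linearly from W/2 to W."""
    samples = [w_max / 2 + (w_max / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


w = 100.0
print(average_window(w) / w)  # fraction of W; close to 0.75
```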


                                                                                                                                                                                                  TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:
$\text{throughput} = \dfrac{1.22 \cdot \text{MSS}}{\text{RTT}\,\sqrt{L}}$
→ $L = 2 \cdot 10^{-10}$ (Wow!)
New versions of TCP for high-speed needed!
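Both numbers in this example can be recomputed directly. The sketch below uses the relation throughput = 1.22 * MSS / (RTT * sqrt(L)) and solves it for L; the function names are hypothetical, and MSS is converted to bits so that the units match a rate given in bits per second.

```python
def window_segments(rate_bps, rtt_sec, mss_bytes):
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_sec / (mss_bytes * 8)


def required_loss_rate(rate_bps, rtt_sec, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_sec * rate_bps)) ** 2


print(round(window_segments(10e9, 0.1, 1500)))
print(f"{required_loss_rate(10e9, 0.1, 1500):.2e}")
```

The loss rate comes out at roughly 2 in 10 billion segments, which is the point of the slide: standard TCP needs an unrealistically clean path to fill such a link.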


TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?
Two competing sessions:
  Additive increase gives slope of 1 as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis running to R, with the equal bandwidth share line; the trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by factor of 2)]
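The fairness argument can be illustrated with a toy simulation of two AIMD flows. This is a sketch under strong assumptions that go beyond the slides: both flows have the same RTT, both detect loss in the same round whenever their combined rate exceeds the link capacity, and rates are in arbitrary units. The function name is hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return their final rates.

    Assumption: both flows see a loss in the same round whenever x1 + x2 > capacity.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # multiplicative decrease for both
        else:
            x1, x2 = x1 + 1, x2 + 1   # additive increase for both
    return x1, x2


a, b = two_flow_aimd(1.0, 50.0, capacity=60.0, rounds=500)
print(f"{a:.3f} {b:.3f}")
```

Additive increase leaves the gap between the two flows unchanged, while each halving cuts the gap in half, so the rates drift toward the equal-share line.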


Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections, i.e. 11R/20)
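The parallel-connection example is simple arithmetic once each TCP connection is assumed to receive an equal share of the link. A minimal sketch (hypothetical function name) using exact fractions:

```python
from fractions import Fraction


def app_share(existing_conns, new_conns):
    """Fraction of link rate R obtained by an app that opens new_conns connections,
    assuming every TCP connection on the link gets an equal share."""
    return Fraction(new_conns, existing_conns + new_conns)


print(app_share(9, 1))    # one new connection among ten
print(app_share(9, 11))   # eleven new connections among twenty
```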


TCP Latency Modeling

Notation, assumptions
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)
Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window’s worth of data sent
   $\text{Latency} = 2\,\text{RTT} + O/R$

2. WS/R < RTT + S/R: ACK for first segment in window returns after window’s worth of data sent
   $\text{Latency} = 2\,\text{RTT} + O/R + (K-1)\left[S/R + \text{RTT} - WS/R\right]$
   where K is the number of windows that cover the object


Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window’s worth of data sent
$\text{latency} = 2\,\text{RTT} + O/R$


                                                                                                                                                                                                  Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Wait for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
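The two fixed-window cases can be checked numerically. The sketch below is illustrative only: the function name and the sample values are made up, and it assumes K, the number of windows that cover the object, is O/(WS) rounded up.

```python
import math

def fixed_window_latency(O, S, R, W, RTT):
    """Latency (seconds) to fetch an object of O bits with a fixed window.

    O: object size in bits, S: segment size in bits, R: link rate in bits/s,
    W: window size in segments, RTT: round-trip time in seconds.
    """
    # Assumption: K = number of windows that cover the object = ceil(O / (W*S)).
    K = math.ceil(O / (W * S))
    base = 2 * RTT + O / R  # connection setup + request, plus transmission time
    if W * S / R >= RTT + S / R:
        # First case: the ACK for the first segment returns before the
        # window is exhausted, so the server never stalls.
        return base
    # Second case: the server stalls after each of the first K-1 windows.
    stall = S / R + RTT - W * S / R
    return base + (K - 1) * stall

# Hypothetical values: 8000-bit object, 1000-bit segments, 1000 bit/s link, RTT = 3 s.
print(fixed_window_latency(8000, 1000, 1000, W=2, RTT=3))  # second case
print(fixed_window_latency(8000, 1000, 1000, W=8, RTT=3))  # first case
```

With W = 2 the window drains in 2 s while the ACK takes 4 s to arrive, so the sender stalls for 2 s after each of the first K-1 = 3 windows.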


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$
\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
$$

where P is the number of times TCP idles at server:

$$
P = \min\{Q,\ K-1\}
$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client-server timing diagram. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered; time at client; time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.
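The example can be reproduced by stepping through the windows one at a time and adding up the idle periods. This is a sketch under stated assumptions, not part of the original slides: the function name is hypothetical, the object size is positive, and the values S = 1000 bits, R = 1000 bit/s, RTT = 2 s are chosen only so that O/S = 15 and Q = 2 as in the example.

```python
def simulate_slow_start(O, S, R, RTT):
    """Step through slow-start windows (no threshold, no loss).

    Returns (K, P, latency): the number of windows sent, the number of
    times the server idles, and the total latency in seconds.
    """
    sent = 0           # bits sent so far
    k = 0              # windows sent so far
    idle_count = 0     # P: number of idle periods
    idle_total = 0.0   # total idle time in seconds
    while sent < O:
        k += 1
        window_bits = (2 ** (k - 1)) * S   # k-th window holds 2^(k-1) segments
        sent += window_bits
        if sent < O:
            # More data remains: the server idles if the ACK for the first
            # segment of this window arrives after the window has drained.
            idle = S / R + RTT - window_bits / R
            if idle > 0:
                idle_count += 1
                idle_total += idle
    latency = 2 * RTT + O / R + idle_total
    return k, idle_count, latency

K, P, latency = simulate_slow_start(O=15000, S=1000, R=1000, RTT=2)
print(K, P, latency)
```

For these values the walk-through gives K = 4 and P = 2, and the latency agrees with the closed form 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R.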


                                                                                                                                                                                                  TCP Latency Modeling (3)

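$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: client-server timing diagram, as on the previous slide. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered; time at client; time at server.]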
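The last step of the delay derivation splits the sum and applies the finite geometric series. The positive-part operator can be dropped because the first P ≤ Q idle times are all positive. Written out as a sketch, using only the symbols already defined:

```latex
\begin{aligned}
\sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]
  &= P\left[RTT + \frac{S}{R}\right] - \frac{S}{R}\sum_{k=1}^{P} 2^{k-1} \\
  &= P\left[RTT + \frac{S}{R}\right] - \left(2^{P} - 1\right)\frac{S}{R}
\end{aligned}
```

As an illustrative check with assumed values S/R = 1 s, RTT = 2 s and P = 2: the idle times are 2 s and 1 s, which sum to 3 s, and the closed form gives 2·3 - 3·1 = 3 s.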


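TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.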
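The closed form for K can be compared against the definition it was derived from. The sketch below uses hypothetical function names and assumes O > 0; for very large O/S the floating-point logarithm could round near a window boundary, so the summation version is the safer reference.

```python
import math

def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), the last line of the derivation."""
    return math.ceil(math.log2(O / S + 1))

def windows_by_definition(O, S):
    """Smallest k with 2^0 S + 2^1 S + ... + 2^(k-1) S >= O, the first line."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S   # add the next window of 2^k segments
        k += 1
    return k

# Compare the two for object sizes of 1 to 40 segments (S = 1000 bits, illustrative).
for segments in range(1, 41):
    assert windows_closed_form(segments * 1000, 1000) == windows_by_definition(segments * 1000, 1000)

print(windows_closed_form(15000, 1000))
```

An object of 15 segments is covered exactly by windows of 1, 2, 4 and 8 segments, while a 16th segment would require a fifth window.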


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
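The three response-time formulas differ only in the RTT term, which a small calculation makes visible. In this sketch the function names are hypothetical and the slow-start idle time is a caller-supplied parameter that defaults to zero, since the formulas above leave it as "sum of idle times". The sample values (O = 40,000 bits, R = 1 Mbps, RTT = 100 msec, M = 10, X = 5) are illustrative.

```python
def non_persistent(O, R, RTT, M, idle=0.0):
    """M+1 TCP connections in series; each costs 2 RTT of setup and request."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle

def persistent(O, R, RTT, M, idle=0.0):
    """2 RTT for the base file plus 1 RTT to request and receive the M images."""
    return (M + 1) * O / R + 3 * RTT + idle

def parallel_non_persistent(O, R, RTT, M, X, idle=0.0):
    """1 connection for the base file, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle

# Illustrative values; idle times are left at zero, so these are lower bounds.
O, R, RTT, M, X = 40_000, 1_000_000, 0.1, 10, 5
print("non-persistent:         ", non_persistent(O, R, RTT, M))
print("persistent:             ", persistent(O, R, RTT, M))
print("parallel non-persistent:", parallel_non_persistent(O, R, RTT, M, X))
```

All three share the transmission term (M+1)O/R, so the gap between them is set by the number of round trips: 2(M+1), 3, and 2(M/X + 1) respectively.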


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

• principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
• instantiation and implementation in the Internet:
  - UDP
  - TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"



Causes/costs of congestion: scenario 3

Another "cost" of congestion:
    when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted


Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
    no explicit feedback from network
    congestion inferred from end-system observed loss, delay
    approach taken by TCP

Network-assisted congestion control:
    routers provide feedback to end systems
        single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
        explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells:
    sent by sender, interspersed with data cells
    bits in RM cell set by switches ("network-assisted")
        NI bit: no increase in rate (mild congestion)
        CI bit: severe congestion indicator
    RM cells returned to sender by receiver, with bits intact
        small exception: see next page

ABR: available bit rate:
    "elastic service"
    if sender's path "underloaded":
        sender should use available bandwidth
    if sender's path congested:
        sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

two-byte ER (explicit rate) field in RM cell
  congested switch may lower ER value in cell
  sender's send rate is thus the minimum supportable rate on the path

EFCI bit in data cells: set to 1 by congested switch
  signals congestion
  if the data cell preceding an RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
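
The two ABR feedback paths above can be sketched in Python. This is a simplified illustration, not the ATM specification: it assumes every switch simply caps ER at the rate it can currently support, and the names RMCell, forward_through_switches and destination_return, as well as the rate values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate carried in the cell (illustrative units)
    ci: bool = False   # congestion indication bit


def forward_through_switches(cell, supportable_rates):
    """Each switch may lower ER to what it can support; ER is never raised."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


# Hypothetical path with three switches; the middle one is the bottleneck.
cell = forward_through_switches(RMCell(er=1000.0), [800.0, 300.0, 500.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)
```

The ER that comes back to the sender is the minimum along the path, which is why the sender's rate ends up at the minimum supportable rate.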

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

end-end control (no network assistance)
transmission rate limited by congestion window size, CongWin, over segments
CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window, labeled CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \frac{w \cdot MSS}{RTT}$ bytes/sec
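
As a quick numeric check of the throughput relation, here is a small Python sketch. The function name and the sample values (10 segments, 1000-byte MSS, 100 ms RTT) are assumptions chosen only for illustration.

```python
def window_throughput(w_segments, mss_bytes, rtt_sec):
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    return w_segments * mss_bytes / rtt_sec


# Hypothetical values: 10 segments of 1000 bytes each, RTT of 100 ms.
print(window_throughput(10, 1000, 0.1))
```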

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
  loss event = timeout or 3 duplicate ACKs
  TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  AIMD = Additive Increase, Multiplicative Decrease
  slow start = CongWin set to 1 MSS and then grows exponentially
  conservative after timeout events
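
The sender-side limit LastByteSent - LastByteAcked ≤ CongWin can be sketched as a simple check in Python. The function names and the byte counts are hypothetical; the sketch assumes, as the slide does, that the receive buffer never constrains the sender.

```python
def bytes_allowed(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes may be put in flight without exceeding CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


def can_send(last_byte_sent, last_byte_acked, cong_win, segment_bytes):
    """True if sending one more segment keeps unacked data within CongWin."""
    return (last_byte_sent + segment_bytes) - last_byte_acked <= cong_win


# Hypothetical state: 2000 bytes unacked, CongWin of 4000 bytes.
print(bytes_allowed(3000, 1000, 4000))
print(can_send(3000, 1000, 4000, 1000))
```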

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window of a long-lived TCP connection versus time; vertical axis ticks at 8, 16 and 24 Kbytes]
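
The sawtooth of a long-lived connection can be reproduced with a few lines of Python. This sketch measures CongWin in units of MSS and assumes a deliberately crude loss model (a loss event happens whenever the window reaches a fixed size, here called loss_at_mss); that model and the function name are illustrative assumptions, not part of TCP.

```python
def aimd_trace(start_mss, loss_at_mss, rounds):
    """Return CongWin (in MSS) at the start of each RTT under AIMD."""
    cong_win = start_mss
    trace = []
    for _ in range(rounds):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            # Loss event: multiplicative decrease (never below 1 MSS).
            cong_win = max(1, cong_win // 2)
        else:
            # No loss: additive increase of 1 MSS per RTT.
            cong_win += 1
    return trace


print(aimd_trace(4, 8, 12))
```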

TCP Slow Start

When connection begins, CongWin = 1 MSS
  Example: MSS = 500 bytes and RTT = 200 msec
  initial rate = 20 kbps

available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
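
The 20 kbps figure follows from sending one MSS per RTT: 500 bytes × 8 bits / 0.2 s = 20,000 bits/sec. A short Python check (the function name is an assumption):

```python
def initial_rate_bps(mss_bytes, rtt_sec):
    """Rate in bits/sec when exactly one MSS is sent per RTT."""
    return mss_bytes * 8 / rtt_sec


# Values from the example: MSS = 500 bytes, RTT = 200 msec.
print(initial_rate_bps(500, 0.2) / 1000, "kbps")
```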

TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
  double CongWin every RTT
  done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments, one batch per RTT]
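
Why a per-ACK increment doubles the window each RTT can be seen in a small Python sketch. It counts CongWin in MSS and assumes one ACK per segment with no loss and no delayed ACKs; the function name is hypothetical.

```python
def slow_start_rounds(rtts):
    """Segments sent in each RTT when CongWin grows by 1 MSS per ACK."""
    cong_win = 1  # in MSS
    sent_per_rtt = []
    for _ in range(rtts):
        sent_per_rtt.append(cong_win)
        acks = cong_win  # assumption: every segment sent this RTT is ACKed
        for _ in range(acks):
            cong_win += 1  # one MSS per ACK received
    return sent_per_rtt


print(slow_start_rounds(4))
```

Each RTT the sender receives as many ACKs as it sent segments, so adding 1 MSS per ACK doubles the window.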

So far:
  Slow-Start ramps up exponentially
  followed by AIMD sawtooth pattern

Reality (TCP Reno):
  introduce new variable threshold
  threshold initially very large
  Slow-Start exponential growth stops when CongWin reaches threshold, then switches to AIMD
  two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
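
The difference between Tahoe and Reno comes down to one branch, sketched below in Python. Windows are in units of MSS; the function name and the string labels for variants, events and states are hypothetical conventions chosen for this illustration.

```python
def on_loss(variant, event, cong_win):
    """Return (new CongWin, new threshold, new state) after a loss event.

    variant: "tahoe" or "reno"; event: "timeout" or "3dupack".
    """
    if variant not in ("tahoe", "reno"):
        raise ValueError("unknown variant")
    if event not in ("timeout", "3dupack"):
        raise ValueError("unknown event")
    threshold = max(cong_win / 2, 1)  # window never drops below 1 MSS
    if event == "timeout" or variant == "tahoe":
        # Timeout (both variants) and any loss in Tahoe: back to slow start.
        return 1, threshold, "slow start"
    # Reno on 3 dup ACKs: halve the window and skip slow start.
    return threshold, threshold, "congestion avoidance"


print(on_loss("reno", "3dupack", 16))
print(on_loss("tahoe", "3dupack", 16))
```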

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                                                    The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
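
The event/state/action table can be turned into a small state machine. The Python sketch below follows the table row by row, with CongWin and Threshold in bytes. The class name RenoSender, the default MSS and Threshold values, and the choice to reset the duplicate-ACK counter on a new ACK or timeout are assumptions made for illustration; a real TCP stack has many more details.

```python
class RenoSender:
    """Simplified congestion-control state machine following the table."""

    def __init__(self, mss=1000, threshold=64000):
        self.mss = mss
        self.cong_win = float(mss)       # start with 1 MSS
        self.threshold = float(threshold)
        self.state = "SS"                # "SS" or "CA"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0                # assumption: counter resets here
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Roughly 1 MSS per RTT, spread over the ACKs of one window.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: only the counter changes until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = float(self.mss)
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(mss=1000, threshold=4000)
for _ in range(4):
    s.on_new_ack()
print(s.state, s.cong_win)
```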

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
  Ignore slow start

Let W be the window size when loss occurs.
  When window is W, throughput is $W/RTT$
  Just after loss, window drops to $W/2$, throughput to $W/(2 \cdot RTT)$
  Average throughput: $0.75 \cdot W/RTT$
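
The 0.75 factor is just the midpoint of a window that climbs linearly from W/2 to W. A short Python check, assuming the window takes every integer value from W/2 to W once per cycle (W even) and measuring throughput in segments per second; the function name and sample values are illustrative.

```python
def average_sawtooth_throughput(w, rtt_sec):
    """Mean of window/RTT as the window grows linearly from w/2 to w."""
    windows = list(range(w // 2, w + 1))
    return sum(windows) / len(windows) / rtt_sec


# Hypothetical values: W = 100 segments, RTT = 100 ms.
avg = average_sawtooth_throughput(100, 0.1)
print(avg, 0.75 * 100 / 0.1)
```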

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$

This requires $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
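
Both numbers in this example can be reproduced from the relation throughput = 1.22 · MSS / (RTT · sqrt(L)). The Python sketch below inverts it for L; the function names are assumptions, and MSS is converted to bits so the units match a target given in bits/sec.

```python
def required_window_segments(target_bps, rtt_sec, mss_bytes):
    """Segments that must be in flight to sustain target_bps."""
    return target_bps * rtt_sec / (mss_bytes * 8)


def required_loss_rate(target_bps, rtt_sec, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_sec * target_bps)) ** 2


# Values from the example: 1500-byte segments, 100 ms RTT, 10 Gbps.
print(required_window_segments(10e9, 0.1, 1500))
print(required_loss_rate(10e9, 0.1, 1500))
```

The window comes out at about 83,333 segments and the tolerable loss rate at about 2 · 10^-10.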

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
  additive increase gives slope of 1, as throughput increases
  multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis running up to R, with the "equal bandwidth share" line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
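
The convergence toward an equal share can be simulated directly. This Python sketch makes strong simplifying assumptions: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units. The function name and the starting values are hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss seen by both flows: each halves its rate.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: each adds one unit (slope of 1 in the diagram).
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from fair: 10 versus 80 on a link of capacity 100.
print(two_flow_aimd(10, 80, 100, 1000))
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in half, so the gap shrinks toward zero.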

Fairness (more)

Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
  nothing prevents an app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
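
The parallel-connection example is simple arithmetic if the link is assumed to be split equally per connection. A Python sketch using exact fractions (the function name is an assumption):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R the new app gets with equal per-connection shares."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # one new connection among 10
print(app_share(9, 11))   # eleven new connections among 20
```

With 11 connections the app holds 11 of 20 equal shares, slightly more than half of R.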

TCP Latency Modeling

Notation, assumptions:
  Assume one link between client and server of rate R
  S: MSS (bits)
  O: object size (bits)
  no retransmissions (no loss, no corruption)

Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Here K is the number of windows that cover the object; for a fixed window, K = O/(WS), rounded up to an integer.
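The two cases can be folded into one function, as in the minimal sketch below. It assumes K = ceil(O / (W·S)) for the number of windows covering the object. The name `fixed_window_latency` and the numeric values are illustrative only.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds for an object of O bits, segments of S bits,
    link rate R bits/s, and a fixed congestion window of W segments."""
    K = math.ceil(O / (W * S))        # windows that cover the object (assumed definition)
    stall = S / R + RTT - W * S / R   # server idle time after each window, if positive
    latency = 2 * RTT + O / R         # connection setup + request, plus transmission
    if stall > 0:                     # case 2: ACK returns after the window is sent
        latency += (K - 1) * stall
    return latency


# Illustrative numbers: S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))  # case 1
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))  # case 2
```

With W = 4 the window takes longer to send than the ACK takes to return, so the server never stalls. With W = 2 the server stalls once after each of the first K-1 windows.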


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
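The closed-form expression can be evaluated directly, as in the sketch below. Two things in it are assumptions rather than statements from this slide: K is taken as ceil(log2(O/S + 1)), and Q is taken as floor(log2(1 + RTT/(S/R))) + 1. The function name and the numbers are illustrative.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Closed-form latency under slow start with no threshold and no loss."""
    K = math.ceil(math.log2(O / S + 1))               # windows that cover the object
    Q = math.floor(math.log2(1 + RTT / (S / R))) + 1  # idles for an infinite object (assumed form)
    P = min(Q, K - 1)                                 # times the server actually idles
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Illustrative numbers: O/S = 15 segments, S/R = 1 s, RTT = 2 s.
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2))  # 22.0
```

With these numbers K = 4, Q = 2 and P = 2, so the latency is 4 + 15 + 2·3 - 3 = 22 seconds.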


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline (time at client, time at server). One RTT to initiate the TCP connection, then the object request; the server sends a first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R), after which transmission is complete and the object is delivered.]

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times


                                                                                                                                                                                                    TCP Latency Modeling (3)

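$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: client/server timeline (time at client, time at server) showing TCP connection initiation, the object request, and the first four windows (S/R, 2S/R, 4S/R, 8S/R) up to complete transmission and object delivery.]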
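The sum of idle times can also be checked by walking through the windows one at a time, without using any closed form. The sketch below does this under the same assumptions as the model: window sizes 1, 2, 4, ... segments, no loss, and a server that idles only while more data remains to be sent. The function name and the numbers are illustrative.

```python
import math


def slow_start_by_windows(O, S, R, RTT):
    """Return (latency, number_of_windows, number_of_idles) by summing the
    positive idle times after each slow-start window."""
    segments_left = math.ceil(O / S)
    latency = 2 * RTT + O / R          # setup + request, plus total transmission time
    k = 0
    idle_count = 0
    while segments_left > 0:
        k += 1
        window = 2 ** (k - 1)          # kth window holds 2^(k-1) segments
        segments_left -= window
        if segments_left > 0:          # more data to send, so the server may stall
            idle = S / R + RTT - window * S / R
            if idle > 0:
                latency += idle
                idle_count += 1
    return latency, k, idle_count


# Illustrative numbers: O/S = 15 segments, S/R = 1 s, RTT = 2 s.
print(slow_start_by_windows(O=15000, S=1000, R=1000, RTT=2))  # (22.0, 4, 2)
```

For these values the walk uses 4 windows and idles twice (2 s, then 1 s), giving 4 + 15 + 3 = 22 seconds. This agrees with O/R + 2RTT + P[RTT + S/R] - (2^P - 1)S/R for P = 2.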


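TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, the number of idles for an infinite-size object, is similar.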
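A hedged sketch of the "similar" calculation for Q follows. It assumes Q is the largest k for which the idle time after the kth window, S/R + RTT - 2^{k-1} S/R, is still nonnegative. That definition is consistent with the idle-time expression used in this model, but the slide does not spell it out.

```latex
\begin{aligned}
Q &= \max\{k : RTT + S/R - 2^{k-1}\, S/R \ge 0\} \\
  &= \max\{k : 2^{k-1} \le 1 + RTT/(S/R)\} \\
  &= \max\{k : k \le \log_2(1 + RTT/(S/R)) + 1\} \\
  &= \left\lfloor \log_2\left(1 + \frac{RTT}{S/R}\right) \right\rfloor + 1
\end{aligned}
```

As a check, O/S = 15 gives K = ⌈log2 16⌉ = 4. If we additionally assume RTT = 2·S/R (an illustrative value), then Q = ⌊log2 3⌋ + 1 = 2, so P = min{Q, K-1} = 2.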


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
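The three response-time formulas can be compared side by side with the sketch below. The slow-start idle time is passed in as a parameter (`idle`, default 0) because it differs per scheme and the formulas above leave it as "sum of idle times". The function names, the default, and the reading of 5 Kbytes as 40,000 bits are all assumptions made for illustration.

```python
def transmit_time(O, R, M):
    """Time to push the base page plus M images (each O bits) onto a link of rate R."""
    return (M + 1) * O / R


def non_persistent(O, R, RTT, M, idle=0.0):
    # M+1 connections in series, each costing 2 RTT (setup + request)
    return transmit_time(O, R, M) + (M + 1) * 2 * RTT + idle


def persistent(O, R, RTT, M, idle=0.0):
    # 2 RTT for the base file, 1 RTT for all M images
    return transmit_time(O, R, M) + 3 * RTT + idle


def parallel_non_persistent(O, R, RTT, M, X, idle=0.0):
    if M % X != 0:
        raise ValueError("the formula assumes M/X is an integer")
    # 1 connection for the base file, then M/X rounds of X parallel connections
    return transmit_time(O, R, M) + (M // X + 1) * 2 * RTT + idle


# Illustrative comparison, ignoring idle time: O = 40,000 bits, RTT = 0.1 s, M = 10, X = 5.
for rate in (28_000, 100_000, 1_000_000, 10_000_000):
    print(rate,
          non_persistent(40_000, rate, 0.1, 10),
          persistent(40_000, rate, 0.1, 10),
          parallel_non_persistent(40_000, rate, 0.1, 10, 5))
```

Since the (M+1)O/R term is common to all three, the schemes differ only in the RTT multiplier and in their idle times.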


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 20 seconds) for non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


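HTTP Response time (in seconds)

[Figure: HTTP response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent connections.]

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5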

For larger RTT, response time is dominated by TCP connection establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay•bandwidth product.
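The trend in the two captions can be reproduced with a rough calculation. The sketch below is a simplification, not the full model behind the plots: it assumes the usual response-time expressions for the three HTTP variants and leaves out the slow-start idle-time term, so absolute values will differ from the figures. The function name `response_times` and the choice O = 5,000 bytes = 40,000 bits are assumptions made for illustration.

```python
def response_times(rate_bps, rtt_s, object_bits, m, x):
    """Simplified HTTP response times (seconds), ignoring slow-start idle time.

    rate_bps    -- link rate R in bits per second
    rtt_s       -- round-trip time in seconds
    object_bits -- size O of each object in bits (base page and images alike)
    m           -- number of images referenced by the base page
    x           -- number of parallel connections (assumes m/x is an integer)
    """
    # Time to push the base page plus m images onto the link.
    transmit = (m + 1) * object_bits / rate_bps
    return {
        # One TCP connection per object, in series: 2 RTT each.
        "non-persistent": transmit + (m + 1) * 2 * rtt_s,
        # 2 RTT for the base page, then 1 RTT for all the images.
        "persistent": transmit + 3 * rtt_s,
        # One connection for the base page, then m/x rounds of x connections.
        "parallel non-persistent": transmit + (m / x + 1) * 2 * rtt_s,
    }


if __name__ == "__main__":
    # Parameters from the slide: RTT = 1 sec, O = 5 Kbytes, M = 10, X = 5.
    for rate in (28e3, 100e3, 1e6, 10e6):
        times = response_times(rate, 1.0, 40_000, 10, 5)
        print(rate, {name: round(t, 2) for name, t in times.items()})
```

Under these assumptions, at 28 Kbps the three variants come out at roughly 37.7 s, 18.7 s and 21.7 s, while at 10 Mbps they are roughly 22.0 s, 3.0 s and 6.0 s. Once transmission time becomes negligible, only the RTT terms remain, which is why the persistent connection pulls ahead as the delay•bandwidth product grows.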


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next: leaving the network "edge" (application, transport layers) and moving into the network "core"



Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control:
  routers provide feedback to end systems
    single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    explicit rate sender should send at


Case study: ATM ABR congestion control

RM (resource management) cells
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next page

ABR: available bit rate
• "elastic service"
• if sender's path "underloaded": sender should use available bandwidth
• if sender's path congested: sender throttled to minimum guaranteed rate


Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate thus minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
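The ER and EFCI rules above can be illustrated with a small sketch. The `RMCell` class, the function names, and the idea of modelling each switch by a single "supportable rate" number are assumptions made for illustration only; they are not part of the ATM specification or of the slides.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    """Hypothetical, simplified RM cell: only the fields discussed above."""
    er: float          # explicit rate carried in the cell
    ci: bool = False   # congestion indication bit


def forward_through_switches(cell, supportable_rates):
    """Each switch may lower ER to the rate it can support (assumed model)."""
    for rate in supportable_rates:
        cell.er = min(cell.er, rate)
    return cell


def destination_return(cell, preceding_data_cell_efci):
    """Destination sets CI=1 if the data cell before the RM cell had EFCI=1."""
    if preceding_data_cell_efci:
        cell.ci = True
    return cell


cell = forward_through_switches(RMCell(er=100.0), [80.0, 120.0, 50.0])
cell = destination_return(cell, preceding_data_cell_efci=True)
print(cell.er, cell.ci)
```

With these inputs the returned cell carries the smallest rate seen on the path, which is the rate the sender would then adopt.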


Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: a window of CongWin segments]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot MSS}{RTT}$ bytes/sec
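As a quick check of the relation above, here is a minimal sketch; the function name and the sample numbers are chosen for illustration and do not come from the slides.

```python
def window_throughput(w_segments, mss_bytes, rtt_seconds):
    """Bytes per second when w segments of MSS bytes are sent every RTT."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return w_segments * mss_bytes / rtt_seconds


# Illustrative values: 10 segments of 1000 bytes per 0.5 s round trip.
print(window_throughput(10, 1000, 0.5))
```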


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events
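The sender-side limit LastByteSent - LastByteAcked ≤ CongWin could be checked as in the sketch below. The function names are hypothetical, and checking the limit *after* adding the next segment is one possible reading of the rule, assumed here for illustration.

```python
def bytes_in_flight(last_byte_sent, last_byte_acked):
    """Unacknowledged bytes currently outstanding."""
    return last_byte_sent - last_byte_acked


def may_send(last_byte_sent, last_byte_acked, cong_win, segment_len):
    """True if sending segment_len more bytes keeps in-flight data within CongWin."""
    return bytes_in_flight(last_byte_sent + segment_len, last_byte_acked) <= cong_win


print(may_send(1000, 0, 2000, 500))   # 1500 bytes in flight vs. window of 2000
print(may_send(1000, 0, 1200, 500))   # 1500 bytes in flight vs. window of 1200
```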


TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing (also known as congestion avoidance)

[Figure: congestion window (axis marks at 8, 16, 24 Kbytes) versus time for a long-lived TCP connection]
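A rough sketch of the AIMD sawtooth follows. It measures the window in whole MSS units, takes the RTTs at which losses occur as an input, and uses integer halving; all three are simplifying assumptions for illustration, not details from the slides.

```python
def aimd_trace(initial_win, loss_rtts, num_rtts):
    """Return CongWin (in MSS) at the start of each RTT under AIMD."""
    cong_win = initial_win
    trace = []
    for rtt in range(num_rtts):
        trace.append(cong_win)
        if rtt in loss_rtts:
            # multiplicative decrease: halve, but never below 1 MSS
            cong_win = max(1, cong_win // 2)
        else:
            # additive increase: one MSS per loss-free RTT
            cong_win += 1
    return trace


print(aimd_trace(initial_win=8, loss_rtts={4}, num_rtts=8))
# -> [8, 9, 10, 11, 12, 6, 7, 8]
```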


TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
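The 20 kbps figure in the example follows from sending one MSS per RTT; written out as a worked equation (taking 1 byte = 8 bits):

```latex
\[
\text{initial rate}
  = \frac{\text{MSS}}{\text{RTT}}
  = \frac{500 \times 8\ \text{bits}}{0.2\ \text{s}}
  = 20\,000\ \text{bits/s}
  = 20\ \text{kbps}
\]
```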


TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B: one segment, then two segments, then four segments sent in successive RTTs]
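To see why incrementing CongWin on every ACK doubles it every RTT, consider this small sketch. It assumes no loss and that each segment in the window is acknowledged individually within the same RTT; the function name is made up for illustration.

```python
def slow_start_windows(num_rtts):
    """Return CongWin (in MSS) at the start of each RTT during slow start."""
    cong_win = 1
    windows = []
    for _ in range(num_rtts):
        windows.append(cong_win)
        acks_this_rtt = cong_win      # one ACK per segment sent (assumption)
        for _ in range(acks_this_rtt):
            cong_win += 1             # +1 MSS for every ACK received
    return windows


print(slow_start_windows(4))
# -> [1, 2, 4, 8]
```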


So Far
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
• Two different types of loss events
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


Reason for treating 3 dup ACKs differently than timeout: 3 dup ACKs indicate that the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
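The difference between Tahoe and Reno described above can be sketched as a single function. The function name, the string labels, and the use of integer MSS units with a floor of 1 MSS are illustrative assumptions.

```python
def react_to_loss(variant, event, cong_win):
    """Return (threshold, cong_win) in MSS after a loss event.

    variant: "tahoe" or "reno"; event: "timeout" or "triple_dup_ack".
    """
    if variant not in ("tahoe", "reno"):
        raise ValueError("unknown variant")
    if event not in ("timeout", "triple_dup_ack"):
        raise ValueError("unknown event")
    threshold = max(cong_win // 2, 1)
    if variant == "reno" and event == "triple_dup_ack":
        # Reno skips slow start: continue from the halved window
        return threshold, threshold
    # Timeout (either variant), or any loss event in Tahoe: restart slow start
    return threshold, 1


for variant in ("tahoe", "reno"):
    for event in ("timeout", "triple_dup_ack"):
        print(variant, event, react_to_loss(variant, event, 16))
```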


Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                                                      The Big Picture


TCP sender congestion control

Event | State | TCP Sender Action | Commentary
ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start
Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
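The event/state table can be turned into a small event-driven sketch. The class and method names are hypothetical, the default threshold is arbitrary, and treating the third duplicate ACK as the "triple duplicate ACK" loss event is an assumption about how the rows connect; retransmission itself is not modelled.

```python
class RenoSender:
    """Simplified model of the sender actions in the table (not a real TCP stack)."""

    def __init__(self, mss=1.0, threshold=64.0):
        self.mss = mss
        self.cong_win = mss          # start in slow start with 1 MSS
        self.threshold = threshold   # assumed initial value, "very large"
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # roughly +1 MSS per RTT once a full window has been ACKed
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicate ACKs; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # not below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and re-enter slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(mss=1.0, threshold=4.0)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```

Keeping the window as a float in MSS units makes the congestion-avoidance increment MSS × (MSS/CongWin) easy to express; a byte-based version would follow the same structure.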


                                                                                                                                                                                                      TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
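The 0.75 W/RTT figure is the mean of a window that ramps linearly from W/2 up to W between losses. A small numerical check, under that linear-ramp assumption (the function name and sample count are illustrative):

```python
def average_sawtooth_throughput(w, rtt, steps=1000):
    """Average of W(t)/RTT while the window grows linearly from W/2 to W."""
    samples = [(w / 2 + (w / 2) * i / steps) / rtt for i in range(steps + 1)]
    return sum(samples) / len(samples)


w, rtt = 100.0, 0.1
print(average_sawtooth_throughput(w, rtt), 0.75 * w / rtt)
```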


                                                                                                                                                                                                      TCP Futures

• Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$\text{throughput} = \dfrac{1.22 \cdot MSS}{RTT \sqrt{L}}$

• Hence L = 2·10^-10. Wow!
• New versions of TCP for high-speed needed
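The two numbers in this example can be reproduced with the sketch below, which computes the window from throughput × RTT and inverts the loss-rate formula throughput = 1.22·MSS / (RTT·√L). The function names are made up for illustration, and 1 byte is taken as 8 bits.

```python
def required_window_segments(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


print(int(required_window_segments(10e9, 0.1, 1500)))   # about 83,333 segments
print(required_loss_rate(10e9, 0.1, 1500))              # about 2e-10
```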


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes running to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts. Web browsers do this.
Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets R/2 (more exactly, 11R/20)
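The parallel-connection example can be checked with a few lines of Python. This is a minimal sketch under the idealized assumption used on the slide, namely that the bottleneck link of rate R is split equally among all TCP connections. The function name `app_share` is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def app_share(existing: int, opened: int) -> Fraction:
    """Fraction of the link rate R obtained by an app that opens `opened`
    connections on a link already carrying `existing` connections.
    Assumes every TCP connection gets an equal share (idealized model)."""
    return Fraction(opened, existing + opened)

print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

With 11 parallel connections the new app holds 11 of the 20 connections on the link, which is where the roughly R/2 figure comes from.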


TCP Latency Modeling

Notation, assumptions
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows that cover the object, K = ⌈O/(WS)⌉


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
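The two fixed-window cases can be folded into one small function. The sketch below is illustrative only: the function name is hypothetical, K is taken to be the number of windows that cover the object, ⌈O/(WS)⌉ (an assumption, since these slides do not spell out K for the fixed-window case), and the numeric values are made up so that S/R = 1 s.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window of
    W segments of S bits each over a link of R bits/s, assuming no loss."""
    K = math.ceil(O / (W * S))           # windows needed to cover the object (assumed definition)
    stall = S / R + RTT - W * S / R      # idle time after each window in case 2
    if stall <= 0:                       # case 1: ACKs return in time, server never idles
        return 2 * RTT + O / R
    return 2 * RTT + O / R + (K - 1) * stall   # case 2: idle after all but the last window

# Illustrative values: S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=4))  # 12.0
print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=2, W=2))  # 15.0
```

With W = 4 the window takes longer to send than the ACK takes to return, so only the 2RTT + O/R terms remain; with W = 2 the server stalls for 1 s after each of the first three windows.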


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at server:

\[ P = \min\{Q, K-1\} \]

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server: initiate TCP connection, request object (RTT each), first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
O/S = 15 segments
K = 4 windows
Q = 2
P = min{K-1, Q} = 2
Server idles P = 2 times

Delay components:
2 RTT for connection establishment and request
O/R to transmit object
time server idles due to slow start

Server idles P = min{K-1, Q} times
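The example can be reproduced numerically. The sketch below is not the slides' own code: the function name is hypothetical, K and Q are found by direct counting rather than by a closed form, and the values of S, R and RTT are assumptions picked so that O/S = 15 yields K = 4 and Q = 2 as in the example.

```python
def slow_start_latency(O, S, R, RTT):
    """Latency under the slide's model: slow start, no threshold, no loss."""
    # K: smallest number of windows (1, 2, 4, ... segments) that cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would idle for an
    # infinite object, i.e. windows k with S/R + RTT - 2^(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)                    # the server never idles after the last window
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# Assumed values: S/R = 1 s, RTT = 2 s, object of 15 segments.
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2))  # (4, 2, 2, 22.0)
```

Of the 22 s, 4 s is the 2 RTT setup, 15 s is O/R, and the remaining 3 s is time the server spends idle waiting for ACKs.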


TCP Latency Modeling (3)

\[ \frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement} \]

\[ 2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window} \]

\[ \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window} \]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: same slow-start timing diagram as on the previous slide.]
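The last step of the derivation replaces the sum of per-window idle times by a closed form. A short numeric check can make that step concrete. The helper name and the values of S, R, RTT and P below are assumptions chosen for illustration; the two expressions agree only when every one of the first P idle times is positive, which holds whenever P does not exceed Q.

```python
def idle_times(S, R, RTT, P):
    """Idle time after each of the first P windows: [S/R + RTT - 2^(k-1) * S/R]^+."""
    return [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, P + 1)]

S, R, RTT, P = 1000, 1000, 2, 2          # illustrative values only
stalls = idle_times(S, R, RTT, P)
closed_form = P * (RTT + S / R) - (2 ** P - 1) * S / R
print(stalls, sum(stalls), closed_form)  # [2.0, 1.0] 3.0 3.0
```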


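TCP Latency Modeling (4)

Recall K = number of windows that cover object.

How do we calculate K?

\[
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2(O/S + 1) \right\rceil
\end{aligned}
\]

Calculation of Q, the number of idles for an infinite-size object, is similar.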
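As a sanity check on the derivation of K, the definition (smallest k whose windows cover the object) can be compared with the logarithmic closed form. This is a sketch with hypothetical function names; the segment size of 1000 bits is an arbitrary assumption.

```python
import math

def windows_by_search(O, S):
    """Smallest k such that (2^k - 1) * S >= O, found by direct search."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k

def windows_by_formula(O, S):
    """Closed form K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))

# Compare both methods for objects of 1, 3, 4, 15 and 16 segments.
for segments in (1, 3, 4, 15, 16):
    O, S = segments * 1000, 1000
    print(segments, windows_by_search(O, S), windows_by_formula(O, S))
```

The search version uses only integer arithmetic, so it may be the safer choice when O/S + 1 is very large and floating-point rounding in the logarithm becomes a concern.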


HTTP Modeling

Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X is an integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
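The three response-time formulas are easy to compare side by side. The sketch below uses hypothetical function names and leaves the slow-start idle time as a caller-supplied parameter `idle` (default 0), so with the default the results are lower bounds rather than full predictions. The sample values RTT = 100 msec, O = 5 Kbytes (taken here as 5 x 8000 bits, an assumption about the unit), M = 10 and X = 5 match the chart that follows; the 1 Mbps link rate is one of its four data points.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """M+1 TCP connections in series, each costing 2 RTT."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle

def persistent(M, O, R, RTT, idle=0.0):
    """2 RTT for the base file, 1 RTT for all M images."""
    return (M + 1) * O / R + 3 * RTT + idle

def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """1 connection for the base file, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("the formula assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle

RTT, O, M, X = 0.1, 5 * 8000, 10, 5      # seconds, bits, images, parallel connections
R = 1_000_000                            # 1 Mbps link
print(round(non_persistent(M, O, R, RTT), 2))              # 2.64
print(round(persistent(M, O, R, RTT), 2))                  # 0.74
print(round(parallel_non_persistent(M, X, O, R, RTT), 2))  # 1.04
```

At this link rate the RTT terms dominate the transmission term (M+1)O/R = 0.44 s, which is why persistent and parallel connections help so much.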

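HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (y-axis, 0 to 20) against link bandwidth (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]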

For low bandwidth, connection & response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (y-axis, 0 to 70) against link bandwidth (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.
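A rough check of these two observations is to compute the fraction of the parallel non-persistent response time that persistent HTTP saves. The sketch below uses the response-time formulas from the HTTP Modeling slide, with the transmission term read as (M+1)O/R. The function name `persistent_gain` is hypothetical, the idle-time terms are ignored, and 1 Kbyte = 1000 bytes is an assumption, so the percentages are illustrative and will not match the plotted bars exactly.

```python
def persistent_gain(O, R, RTT, M, X):
    """Fraction of the parallel non-persistent time saved by persistent HTTP.

    Idle times are ignored; M is assumed to be a multiple of X.
    """
    transmit = (M + 1) * O / R                    # same for both schemes
    parallel = transmit + (M // X + 1) * 2 * RTT  # X parallel connections
    persistent = transmit + 3 * RTT               # one persistent connection
    return (parallel - persistent) / parallel


if __name__ == "__main__":
    O = 5 * 1000 * 8  # 5 Kbytes in bits, assuming 1 Kbyte = 1000 bytes
    for RTT in (0.1, 1.0):
        for R in (28e3, 10e6):
            gain = persistent_gain(O, R, RTT, M=10, X=5)
            print(f"RTT = {RTT} s, R = {R:.0f} bps: saving = {gain:.1%}")
```

At 28 Kbps with RTT = 100 msec the saving is about 2%, because the 15.7 s of transmission time dominates. At 10 Mbps with RTT = 1 sec it is about 50%, because almost all of the time is spent in round trips.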

Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN: Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
                                                                                                                                                                                                      • TCP Latency Modeling (3)
                                                                                                                                                                                                      • TCP Latency Modeling (4)
                                                                                                                                                                                                      • HTTP Modeling
                                                                                                                                                                                                      • Chapter 3 Summary

                                                                                                                                                                                                        Case study ATM ABR congestion control

RM (resource management) cells:
• sent by sender, interspersed with data cells
• bits in RM cell set by switches ("network-assisted")
  • NI bit: no increase in rate (mild congestion)
  • CI bit: severe congestion indicator
• RM cells returned to sender by receiver, with bits intact
  • small exception: see next page

ABR: available bit rate
• "elastic service"
• if sender's path is "underloaded": sender should use available bandwidth
• if sender's path is congested: sender throttled to minimum guaranteed rate

                                                                                                                                                                                                        Case study ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
  • congested switch may lower ER value in cell
  • sender's send rate is thus the minimum supportable rate on path
• EFCI bit in data cells: set to 1 by congested switch
  • signals congestion
  • if data cell preceding RM cell has EFCI=1, destination sets CI bit=1 before returning RM cell to source
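The RM-cell handling described on these two slides can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `RMCell`, `Switch`, `supportable_rate`, `mild`, `severe`, and `round_trip` are hypothetical and do not come from the ATM specification, and every switch is modeled as capping ER at the rate it can sustain.

```python
from dataclasses import dataclass


@dataclass
class RMCell:
    er: float          # explicit rate carried in the cell
    ni: bool = False   # no-increase bit (mild congestion)
    ci: bool = False   # congestion-indication bit (severe congestion)


@dataclass
class Switch:
    supportable_rate: float   # hypothetical: rate this switch can sustain
    mild: bool = False        # hypothetical flag: switch is mildly congested
    severe: bool = False      # hypothetical flag: switch is severely congested

    def process(self, cell: RMCell) -> None:
        # A switch may lower ER but never raises it.
        cell.er = min(cell.er, self.supportable_rate)
        cell.ni = cell.ni or self.mild
        cell.ci = cell.ci or self.severe


def round_trip(cell: RMCell, path: list, preceding_data_efci: bool) -> RMCell:
    """Pass the RM cell through every switch, then let the destination return it."""
    for switch in path:
        switch.process(cell)
    # Destination sets CI if the data cell preceding the RM cell had EFCI=1.
    if preceding_data_efci:
        cell.ci = True
    return cell
```

After the round trip, `cell.er` holds the minimum supportable rate along the path, which is the rate the sender would adopt in this simplified model.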

                                                                                                                                                                                                        Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control
• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: the congestion window, CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec
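As a quick numeric illustration of the throughput relation above, here is a minimal Python sketch. The function name `window_throughput` and the sample values are illustrative assumptions, not taken from the slides.

```python
def window_throughput(w_segments: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Bytes per second when w segments of MSS bytes each are sent every RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical values: 10 segments of 1000 bytes per 250 ms round trip.
rate = window_throughput(10, 1000, 0.25)
print(f"{rate:.0f} bytes/sec")
```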

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin
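The sender-side limit can be expressed as a small check. The sketch below is an assumption-laden illustration: the function name `usable_window` is hypothetical, and the receive window is ignored, following the slide's assumption that RcvBuffer never overflows.

```python
def usable_window(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """Bytes the sender may still transmit without violating
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)
```

A result of 0 means the sender must wait for an ACK before sending more data.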

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative after timeout events

TCP AIMD
• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window versus time for a long-lived TCP connection, showing a sawtooth pattern; vertical axis marked at 8, 16, and 24 Kbytes]
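The sawtooth shape follows directly from the two AIMD rules. The following Python sketch traces the window per RTT under simplifying assumptions: the window is counted in whole MSS units, halving uses integer division, and the loss times are supplied by hand. The name `aimd_trace` and its parameters are hypothetical.

```python
def aimd_trace(rtts: int, loss_rtts: set, start: int = 8) -> list:
    """Return CongWin (in MSS units) at the start of each RTT."""
    cong_win = start
    trace = []
    for t in range(rtts):
        trace.append(cong_win)
        if t in loss_rtts:
            # Multiplicative decrease: cut the window in half (at least 1 MSS).
            cong_win = max(1, cong_win // 2)
        else:
            # Additive increase: one more MSS per RTT without loss.
            cong_win += 1
    return trace


print(aimd_trace(8, {3}))
```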

                                                                                                                                                                                                        TCP Slow Start

When connection begins, CongWin = 1 MSS
• Example: MSS = 500 bytes and RTT = 200 msec
• initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event

                                                                                                                                                                                                        TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:
• double CongWin every RTT
• done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment in the first RTT, two segments in the second, and four segments in the third]
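To see why a per-ACK increment doubles the window every RTT, consider the short Python sketch below. It assumes no loss and one ACK per segment (no delayed ACKs), and it counts the window in MSS units; the function names are hypothetical. The second helper reproduces the 20 kbps initial rate from the slow-start example (MSS = 500 bytes, RTT = 200 msec).

```python
def slow_start_rounds(rounds: int) -> list:
    """CongWin (in MSS units) at the start of each RTT during slow start."""
    cong_win = 1
    sizes = []
    for _ in range(rounds):
        sizes.append(cong_win)
        acks = cong_win            # one ACK per segment sent in this RTT
        for _ in range(acks):
            cong_win += 1          # +1 MSS for every ACK received
    return sizes


def initial_rate_bps(mss_bytes: int, rtt_ms: int) -> float:
    """Sending rate in bits per second when CongWin is a single MSS."""
    return mss_bytes * 8 * 1000 / rtt_ms


print(slow_start_rounds(4))
print(initial_rate_bps(500, 200))
```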

So far
• Slow-Start ramps up exponentially
• followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

                                                                                                                                                                                                        Summary TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                                                        The Big Picture

TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
| --- | --- | --- | --- |
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS · (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
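The event table above maps naturally onto a small state machine. The Python sketch below is one possible reading of it, not a real TCP implementation: the class name `RenoSender`, the MSS value, the initial threshold of 64 MSS, and the choice to reset the duplicate-ACK counter on a new ACK are all assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1000  # bytes; illustrative value only


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64 * MSS   # stands in for "initially very large"
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0          # assumption: counter restarts on a new ACK
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self) -> None:
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            # Multiplicative decrease, never dropping below 1 MSS.
            self.threshold = max(self.cong_win / 2, MSS)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self) -> None:
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Feeding the object a sequence of events (new ACKs, duplicate ACKs, timeouts) and printing `cong_win` after each one gives a trace that can be compared against the table row by row.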

                                                                                                                                                                                                        TCP throughput

What's the average throughput of TCP as a function of window size and RTT?
• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2·RTT)
• Average throughput: 0.75 W/RTT
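The 0.75 factor comes from averaging a window that climbs linearly from W/2 back up to W. A minimal numeric check in Python, assuming the window grows by one segment per RTT and W is even (the function name `average_window` is hypothetical):

```python
def average_window(w: int) -> float:
    """Mean window over one sawtooth cycle climbing from w/2 to w, one step per RTT."""
    samples = range(w // 2, w + 1)
    return sum(samples) / len(samples)


# For W = 100 segments the mean window is 0.75 * W.
print(average_window(100))
```

Dividing the mean window by RTT gives the average throughput of 0.75 W/RTT stated above.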

                                                                                                                                                                                                        TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$\text{throughput} = \dfrac{1.22 \cdot \text{MSS}}{\text{RTT} \sqrt{L}}$

• so $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed
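The two numbers in this example can be reproduced with a few lines of Python. This is a sketch that simply rearranges the relations on the slide; the function names are hypothetical, and MSS is converted from bytes to bits so that it matches a throughput given in bits per second.

```python
def segments_in_flight(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Window size W (in segments) needed to sustain the given throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = segments_in_flight(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(f"W = {w:.0f} segments, L = {loss:.1e}")
```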

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
- Additive increase gives a slope of 1 as throughput increases
- Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the equal bandwidth share line. Annotations on the trajectory: congestion avoidance: additive increase; loss: decrease window by factor of 2.]
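The convergence toward the equal-share line can be checked numerically. The following is a minimal sketch, assuming two synchronized flows that both see a loss whenever their combined rate exceeds the capacity R. The function name `aimd_two_flows`, the step size, and the starting rates are illustrative choices, not part of the slides.

```python
def aimd_two_flows(x1, x2, capacity, rounds=200, increase=1.0, decrease=0.5):
    """Simulate two synchronized AIMD flows sharing one bottleneck.

    Assumption (not from the slides): both flows detect loss in the same
    round, namely whenever their combined rate exceeds the capacity.
    """
    history = [(x1, x2)]
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves both rates,
            # which also halves the gap between them.
            x1 *= decrease
            x2 *= decrease
        else:
            # Congestion avoidance: additive increase raises both rates
            # by the same amount (slope 1 in the throughput plane).
            x1 += increase
            x2 += increase
        history.append((x1, x2))
    return history


if __name__ == "__main__":
    trace = aimd_two_flows(x1=80.0, x2=10.0, capacity=100.0)
    start_gap = abs(trace[0][0] - trace[0][1])
    end_gap = abs(trace[-1][0] - trace[-1][1])
    print(f"gap at start: {start_gap:.2f}, gap at end: {end_gap:.4f}")
```

Additive increase leaves the difference between the two rates unchanged, while each multiplicative decrease halves it, so the pair of rates drifts toward the equal bandwidth share.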


Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
- Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
- Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
- new app asks for 1 TCP, gets rate R/10
- new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
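The parallel-connection arithmetic can be made explicit. This is a small sketch, assuming every TCP connection on the link receives an equal share of R. The helper name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    new_connections, assuming all connections share the link equally."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


if __name__ == "__main__":
    print(app_share(9, 1))   # 1/10
    print(app_share(9, 11))  # 11/20
```

With 11 parallel connections the new app holds 11 of the 20 connections, slightly more than half of the link.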


TCP Latency Modeling

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start


Fixed Congestion Window (W)

Two cases:

1. $WS/R > \mathrm{RTT} + S/R$: the ACK for the first segment in the window returns before a window's worth of data has been sent.
   Latency $= 2\,\mathrm{RTT} + O/R$

2. $WS/R < \mathrm{RTT} + S/R$: the ACK for the first segment in the window returns after a window's worth of data has been sent.
   Latency $= 2\,\mathrm{RTT} + O/R + (K-1)\,[S/R + \mathrm{RTT} - WS/R]$
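The two cases can be folded into one function. The sketch below assumes that K is the number of windows that cover the object, computed as ceil(O / (W·S)); this slide does not define K, so that formula is an assumption. The function name and the sample numbers are illustrative.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency of fetching an O-bit object with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    Assumption (not stated on this slide): K = ceil(O / (W * S)) is the
    number of windows that cover the object.
    """
    base = 2 * RTT + O / R
    stall = S / R + RTT - W * S / R  # idle time after each window
    if stall <= 0:
        # Case 1: the first ACK returns before the window is exhausted,
        # so the server never stops sending.
        return base
    # Case 2: the server idles after each of the first K-1 windows.
    K = math.ceil(O / (W * S))
    return base + (K - 1) * stall


if __name__ == "__main__":
    # Illustrative values: 10 segments of 8000 bits over a 1 Mbit/s link.
    print(fixed_window_latency(O=80000, S=8000, R=1e6, RTT=0.1, W=20))
    print(fixed_window_latency(O=80000, S=8000, R=1e6, RTT=0.1, W=2))
```

With W = 20 the window is large enough that the server streams continuously (case 1). With W = 2 the server stalls after each of the first four windows (case 2).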


Fixed congestion window (1)

First case: $WS/R > \mathrm{RTT} + S/R$. The ACK for the first segment in the window returns before a window's worth of data has been sent.

Latency $= 2\,\mathrm{RTT} + O/R$


Fixed congestion window (2)

Second case: $WS/R < \mathrm{RTT} + S/R$. The server must wait for an ACK after sending a window's worth of data.

Latency $= 2\,\mathrm{RTT} + O/R + (K-1)\,[S/R + \mathrm{RTT} - WS/R]$
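The stall term in the second case comes from comparing two instants, both measured from the moment the server starts sending a window. A short derivation, assuming ACK transmission time is negligible and that $K$ denotes the number of windows that cover the object (the subscripted names are illustrative):

```latex
\begin{align*}
t_{\text{window sent}} &= \frac{WS}{R} \\
t_{\text{first ACK}}   &= \frac{S}{R} + \mathrm{RTT} \\
\text{stall per window} &= t_{\text{first ACK}} - t_{\text{window sent}}
  = \frac{S}{R} + \mathrm{RTT} - \frac{WS}{R} \\
\text{Latency} &= 2\,\mathrm{RTT} + \frac{O}{R}
  + (K-1)\left[\frac{S}{R} + \mathrm{RTT} - \frac{WS}{R}\right]
\end{align*}
```

The server stalls after every window except the last one, which is why the stall is counted $K-1$ times.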


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,\mathrm{RTT} + \frac{O}{R} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where $P$ is the number of times TCP idles at the server:

$$P = \min\{Q,\ K-1\}$$

- where $Q$ is the number of times the server idles if the object were of infinite size
- and $K$ is the number of windows that cover the object
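The closed-form latency can be evaluated directly. This is a sketch under stated assumptions: K is taken as ceil(log2(O/S + 1)), and Q uses the analogous closed form floor(log2(1 + RTT/(S/R))) + 1, which the slides do not spell out. The function name and sample values are illustrative.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Closed-form latency under slow start with no threshold and no loss.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT in s.
    Returns (latency, K, Q, P).
    """
    # K: number of windows (1, 2, 4, ... segments) that cover the object.
    K = math.ceil(math.log2(O / S + 1))
    # Q: number of idle periods for an infinitely large object.
    # Assumption: closed form derived the same way as K.
    Q = math.floor(math.log2(1 + RTT / (S / R))) + 1
    # P: number of times the server actually idles.
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


if __name__ == "__main__":
    # Illustrative values: O/S = 15 segments, S/R = 1 s, RTT = 2 s.
    print(slow_start_latency(O=15000.0, S=1000.0, R=1000.0, RTT=2.0))
    # (22.0, 4, 2, 2)
```

The sample values were chosen so that S/R and RTT are small integers, which makes the result easy to check by hand: 4 s of setup, 15 s of transmission, and 3 s of idle time.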


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. Events: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times.


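TCP Latency Modeling (3)

$\frac{S}{R} + \mathrm{RTT}$ = time from when the server starts to send a segment until the server receives an acknowledgment

$2^{k-1}\frac{S}{R}$ = time to transmit the $k$th window

$\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the $k$th window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{k=1}^{P}\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timing diagram with time at client and time at server. Events: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]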
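The sum of idle times can also be accumulated window by window, which gives an independent check on the closed form. The sketch below assumes slow start with no threshold and no loss, with window k carrying 2^(k-1) segments. The function name and the sample values (S/R = 1 s, RTT = 2 s, O/S = 15) are illustrative.

```python
import math


def slow_start_latency_by_windows(O, S, R, RTT):
    """Add up the server's idle time after each window.

    Returns (latency, idle_times), where idle_times[k-1] is the idle time
    after the k-th window, clipped at zero (the [.]^+ in the derivation).
    """
    remaining = math.ceil(O / S)  # segments still to be sent
    k = 1
    idle_times = []
    while True:
        window = 2 ** (k - 1)     # segments in the k-th window
        remaining -= window
        if remaining <= 0:
            break                 # last window: no stall follows it
        idle = max(0.0, S / R + RTT - window * S / R)
        idle_times.append(idle)
        k += 1
    latency = 2 * RTT + O / R + sum(idle_times)
    return latency, idle_times


if __name__ == "__main__":
    latency, idle_times = slow_start_latency_by_windows(
        O=15000.0, S=1000.0, R=1000.0, RTT=2.0)
    print(idle_times)  # [2.0, 1.0, 0.0]
    print(latency)     # 22.0
```

For these values the server idles after the first and second windows only; from the third window on, ACKs return before the window has been fully transmitted, so the clipped idle time is zero.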


                                                                                                                                                                                                        TCP Latency Modeling (4)Recall K = number of windows that cover object

                                                                                                                                                                                                        How do we calculate K

                                                                                                                                                                                                        ⎥⎥⎤

                                                                                                                                                                                                        ⎢⎢⎡ +=

                                                                                                                                                                                                        +ge=

                                                                                                                                                                                                        geminus=

                                                                                                                                                                                                        ge+++=

                                                                                                                                                                                                        ge+++=minus

                                                                                                                                                                                                        minus

                                                                                                                                                                                                        )1(log

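\[
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
\]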

Calculation of Q, the number of idles for an infinite-size object, is similar.
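As a rough illustration of the window count used in this delay model, the sketch below finds K, the smallest number of slow-start windows (of 1, 2, 4, ... segments) whose total covers an object of O bits sent in segments of S bits, i.e. the smallest k with 2^k - 1 >= O/S. The function name and argument names are assumptions chosen for illustration, not part of the slides.

```python
def windows_to_cover(object_bits: int, segment_bits: int) -> int:
    """Return the smallest k such that (2**k - 1) * segment_bits >= object_bits.

    Hypothetical helper: object_bits plays the role of O and segment_bits
    the role of S in the slides' notation.
    """
    if object_bits <= 0 or segment_bits <= 0:
        raise ValueError("sizes must be positive")
    k = 1
    # After k windows, slow start has sent 1 + 2 + ... + 2**(k-1) = 2**k - 1 segments.
    while (2 ** k - 1) * segment_bits < object_bits:
        k += 1
    return k
```

Integer arithmetic is used instead of evaluating the logarithm directly, which avoids floating-point rounding at the boundary cases where O/S + 1 is an exact power of two.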


HTTP Modeling
Assume Web page consists of:
    1 base HTML page (of size O bits)
    M images (each of size O bits)

Non-persistent HTTP:
    M+1 TCP connections in series
    Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
    2 RTT to request and receive base HTML file
    1 RTT to request and receive M images
    Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
    Suppose M/X integer
    1 TCP connection for base file
    M/X sets of parallel connections for images
    Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
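The three response-time formulas can be compared numerically with a short sketch such as the one below. It is only an illustration: the function name, the dictionary keys and the `idle` parameter are assumptions, and the sum of idle times is passed in as a single number (defaulting to zero) rather than derived from the slow-start model.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Evaluate the three HTTP response-time formulas, in seconds.

    O: object size in bits, R: link rate in bits/sec, RTT: round-trip time
    in seconds, M: number of images, X: number of parallel connections.
    idle: assumed sum of idle times, supplied by the caller (hypothetical
    simplification; it is not computed here).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmission = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmission + (M + 1) * 2 * RTT + idle,
        "persistent": transmission + 3 * RTT + idle,
        "parallel non-persistent": transmission + (M // X + 1) * 2 * RTT + idle,
    }


if __name__ == "__main__":
    # Assumes 5 Kbytes = 40,000 bits; idle times are left at zero.
    for name, seconds in http_response_times(40_000, 28_000, 0.1, 10, 5).items():
        print(f"{name}: {seconds:.1f} s")
```

With RTT = 100 msec, O taken as 40,000 bits, M = 10, X = 5 and a 28 Kbps link, the transmission term alone is about 15.7 seconds, so the three variants differ by well under two seconds before idle times are added.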


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3 Summary
principles behind transport layer services:
    multiplexing, demultiplexing
    reliable data transfer
    flow control
    congestion control

instantiation and implementation in the Internet:
    UDP
    TCP

Next:
    leaving the network "edge" (application, transport layers)
    into the network "core"

                                                                                                                                                                                                        • TCP Slow Start
                                                                                                                                                                                                        • TCP Slow Start (more)
                                                                                                                                                                                                        • Summary TCP Congestion Control
                                                                                                                                                                                                        • The Big Picture
                                                                                                                                                                                                        • TCP sender congestion control
                                                                                                                                                                                                        • TCP throughput
                                                                                                                                                                                                        • TCP Futures
                                                                                                                                                                                                        • TCP Fairness
                                                                                                                                                                                                        • Why is TCP fair
                                                                                                                                                                                                        • Fairness (more)
                                                                                                                                                                                                        • TCP Latency Modeling
                                                                                                                                                                                                        • Fixed Congestion Window (W)
                                                                                                                                                                                                        • Fixed congestion window (1)
                                                                                                                                                                                                        • Fixed congestion window (2)
                                                                                                                                                                                                        • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                        • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                        • TCP Latency Modeling (3)
                                                                                                                                                                                                        • TCP Latency Modeling (4)
                                                                                                                                                                                                        • HTTP Modeling
                                                                                                                                                                                                        • Chapter 3 Summary


Case study: ATM ABR congestion control

• two-byte ER (explicit rate) field in RM cell
   • congested switch may lower ER value in cell
   • sender's send rate is thus the minimum supportable rate on the path

• EFCI bit in data cells: set to 1 by congested switch
   • signals congestion
   • if data cell preceding RM cell has EFCI = 1, destination sets CI bit = 1 before returning RM cell to source
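The two signalling rules above can be sketched in a few lines. This is only an illustration, not an ATM implementation: the function names and the idea of passing each switch's supportable rate as a list are assumptions made for the example.

```python
def explicit_rate_at_source(initial_er, switch_rates):
    """Return the ER value the source sees once the RM cell has crossed every switch.

    Assumption: each switch contributes one number, the rate it can support.
    A switch may lower the ER field but never raises it.
    """
    er = initial_er
    for rate in switch_rates:
        er = min(er, rate)  # congested switch lowers ER to what it can support
    return er


def ci_bit_for_returned_rm_cell(efci_of_preceding_data_cell):
    """Destination sets CI = 1 if the data cell just before the RM cell had EFCI = 1."""
    return 1 if efci_of_preceding_data_cell == 1 else 0
```

For example, with a requested rate of 100 and switches that can support 80, 50 and 90, the source ends up with ER = 50, the minimum along the path.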


Chapter 3 outline

• 3.1 Transport-layer services
• 3.2 Multiplexing and demultiplexing
• 3.3 Connectionless transport: UDP
• 3.4 Principles of reliable data transfer
• 3.5 Connection-oriented transport: TCP
   • segment structure
   • reliable data transfer
   • flow control
   • connection management
• 3.6 Principles of congestion control
• 3.7 TCP congestion control


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: CongWin spanning a run of segments]

• w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot MSS}{RTT}$ bytes/sec
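As a quick worked instance of the throughput formula, the sketch below plugs in numbers. The function name and the sample values (10 segments, 1460-byte MSS, 200 ms RTT) are assumptions chosen for illustration only.

```python
def window_throughput_bytes_per_sec(w_segments, mss_bytes, rtt_sec):
    """Throughput when w segments of MSS bytes each are sent once per RTT."""
    return w_segments * mss_bytes / rtt_sec


# Hypothetical numbers: 10 segments of 1460 bytes every 200 ms.
example = window_throughput_bytes_per_sec(10, 1460, 0.2)
print(f"{example:.0f} bytes/sec")
```

Doubling w doubles the throughput, while doubling the RTT halves it, which is why the window size is the knob TCP turns.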


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events
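The window constraint and the definition of a loss event can be written down directly. This is a minimal sketch; the function names are hypothetical, and all quantities are assumed to be in bytes.

```python
def bytes_allowed_to_send(last_byte_sent, last_byte_acked, cong_win):
    """How many more bytes the sender may transmit without violating
    LastByteSent - LastByteAcked <= CongWin."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, cong_win - in_flight)


def is_loss_event(timed_out, duplicate_ack_count):
    """A loss event is a timeout or three duplicate ACKs."""
    return timed_out or duplicate_ack_count >= 3
```

With 2000 bytes in flight and CongWin = 4000, the sender may push another 2000 bytes; if CongWin shrinks below the amount in flight, it must wait for ACKs.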


TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection]
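The sawtooth that AIMD produces can be reproduced with a short simulation. This is a sketch under simplifying assumptions: the window is counted in whole MSS units, one step is one RTT, and the RTTs at which a loss occurs are supplied by hand through the hypothetical `loss_rtts` parameter.

```python
def aimd_trace(start_mss, rtts, loss_rtts):
    """Return CongWin (in MSS units) at the start of each RTT.

    loss_rtts is a set of RTT indices in which a loss event is assumed to occur.
    """
    cong_win = start_mss
    trace = []
    for rtt in range(rtts):
        trace.append(cong_win)
        if rtt in loss_rtts:
            cong_win = max(1, cong_win // 2)  # multiplicative decrease
        else:
            cong_win += 1                     # additive increase: 1 MSS per RTT
    return trace


print(aimd_trace(8, 8, {3}))  # prints [8, 9, 10, 11, 5, 6, 7, 8]
```

The window climbs by one MSS per RTT, is halved at the loss, and then resumes its linear climb.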


TCP Slow Start

• When connection begins, CongWin = 1 MSS
   • Example: MSS = 500 bytes and RTT = 200 msec, so initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
   • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
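The 20 kbps figure in the example follows from sending one MSS per RTT. The small check below reproduces it; the function name is hypothetical, and the RTT is taken in milliseconds so the arithmetic stays exact.

```python
def initial_rate_bps(mss_bytes, rtt_ms):
    """Sending rate in bits per second when one MSS is sent per RTT."""
    bits_per_rtt = mss_bytes * 8
    return bits_per_rtt * 1000 / rtt_ms


print(initial_rate_bps(500, 200))  # prints 20000.0, i.e. 20 kbps
```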


TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
   • double CongWin every RTT
   • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
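Why does adding one MSS per ACK double the window each RTT? The sketch below walks through it, assuming no losses and that every segment sent in a round is acknowledged within that round. The function name is hypothetical and the window is counted in MSS units.

```python
def slow_start_windows(rounds, mss=1):
    """Return CongWin at the start of each RTT during loss-free slow start."""
    cong_win = mss
    windows = []
    for _ in range(rounds):
        windows.append(cong_win)
        segments_in_flight = cong_win // mss
        for _ in range(segments_in_flight):
            cong_win += mss  # one MSS added for every ACK received
    return windows


print(slow_start_windows(4))  # prints [1, 2, 4, 8]
```

A window of n segments produces n ACKs, each adding one MSS, so the window goes from n to 2n in a single RTT.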


So far
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable: threshold
• threshold is initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• Two different types of loss events:
   • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
   • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same way and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
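The difference between Tahoe and Reno comes down to one branch. The following sketch is an illustration only: the function name, the string labels for variants, events and states, and the choice to count the window in MSS units are all assumptions made for the example.

```python
def react_to_loss(variant, event, cong_win, mss=1):
    """Return (threshold, new_cong_win, new_state) after a loss event.

    variant: "tahoe" or "reno"; event: "timeout" or "triple_dup_ack".
    The threshold is never allowed to fall below 1 MSS.
    """
    threshold = max(cong_win // 2, mss)
    if variant == "reno" and event == "triple_dup_ack":
        # Fast recovery: skip slow start and continue from the halved window.
        return threshold, threshold, "congestion_avoidance"
    # Timeouts in both variants, and every loss event in Tahoe.
    return threshold, mss, "slow_start"
```

With CongWin = 16, Reno reacts to three duplicate ACKs by continuing from a window of 8, whereas Tahoe (and Reno after a timeout) restarts from a window of 1.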


Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                                                          The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
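The event table above maps naturally onto a small state machine. The class below is a sketch rather than a real TCP stack: the class and method names are hypothetical, windows are in bytes, and it assumes that the third duplicate ACK in a row is what raises the "triple duplicate ACK" loss event.

```python
from dataclasses import dataclass


@dataclass
class RenoSender:
    mss: int = 1000
    cong_win: float = 1000.0
    threshold: float = 64000.0
    state: str = "SS"          # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Roughly 1 MSS of growth per RTT, spread over the ACKs of one window.
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_duplicate_ack(self):
        """Count the duplicate; CongWin and Threshold change only on the third."""
        self.dup_acks += 1
        if self.dup_acks == 3:  # assumption: third duplicate triggers the loss event
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and re-enter slow start."""
        self.threshold = max(self.cong_win / 2, self.mss)
        self.cong_win = self.mss
        self.dup_acks = 0
        self.state = "SS"
```

A sender started with `RenoSender(threshold=4000.0)` stays in slow start for its first three ACKs, moves to congestion avoidance on the fourth (when CongWin first exceeds Threshold), and drops back to a 1-MSS window on a timeout.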


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2 and throughput to W/(2·RTT)
• Average throughput: 0.75 W/RTT
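The 0.75 factor is the midpoint of a window that climbs linearly from W/2 to W between losses. A numerical check, assuming the window takes each whole-segment value from W/2 up to W exactly once per cycle (the function name is hypothetical):

```python
def average_window_over_sawtooth(w_max):
    """Mean window size over one AIMD cycle from w_max/2 up to w_max."""
    low = w_max // 2
    windows = list(range(low, w_max + 1))  # one value per RTT
    return sum(windows) / len(windows)


print(average_window_over_sawtooth(100))  # prints 75.0
```

Dividing the average window by RTT gives the average throughput of 0.75 W/RTT.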


TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$\text{throughput} = \dfrac{1.22 \cdot MSS}{RTT \sqrt{L}}$

• ⇒ L = 2·10^-10. Wow!
• New versions of TCP for high-speed needed!
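Both numbers in this example can be recomputed from the formulas. The sketch below uses hypothetical function names and assumes the MSS is converted to bits so that it matches a throughput given in bits per second.

```python
def required_window_segments(target_bps, mss_bytes, rtt_sec):
    """Segments that must be in flight to sustain target_bps."""
    return target_bps * rtt_sec / (mss_bytes * 8)


def required_loss_rate(target_bps, mss_bytes, rtt_sec):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_sec * target_bps)) ** 2


w = required_window_segments(10e9, 1500, 0.1)
loss = required_loss_rate(10e9, 1500, 0.1)
print(f"W is about {w:.0f} segments, L is about {loss:.1e}")
```

The result is a window of roughly 83,333 segments and a tolerable loss rate of roughly 2·10^-10, about one lost segment in every five billion.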


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:

- Additive increase gives a slope of 1 as throughput increases.
- Multiplicative decrease reduces throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal-share line.]
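The convergence argument can be illustrated with a toy simulation. This is a minimal sketch under strong assumptions that the slides do not state: both sessions have the same RTT, both add one unit per step, and both detect loss at the same moment whenever their combined rate exceeds the link capacity. The function name `aimd` and the starting rates are made up for illustration.

```python
def aimd(x1, x2, capacity, steps):
    """Two sessions on one link: each adds 1 per step, both halve on overload."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves both rates,
            # which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase leaves the gap unchanged.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


r1, r2 = aimd(10.0, 80.0, capacity=100.0, steps=2000)
print(f"rates: {r1:.2f}, {r2:.2f}  gap: {abs(r1 - r2):.3g}")
```

Additive increase never changes the difference between the two rates, while every loss event halves it, so the rates drift toward the equal-share line no matter how unequal they start.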


Fairness (more)

Fairness and UDP

- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
- Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss.
- Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections

- Nothing prevents an app from opening parallel connections between 2 hosts.
- Web browsers do this.
- Example: link of rate R supporting 9 connections:
  - new app asks for 1 TCP connection, gets rate R/10
  - new app asks for 11 TCP connections, gets slightly more than R/2 (11 of 20 connections, i.e. 11R/20)
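The arithmetic behind the parallel-connection example can be sketched as follows, assuming every TCP connection on the link ends up with an equal share of R (the idealized fairness goal). The helper `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing, opened):
    """Fraction of the link rate R that an app gets with `opened` connections,
    when `existing` other connections share the link equally."""
    return Fraction(opened, existing + opened)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

With 11 connections the new app holds 11 of the 20 connections on the link, so it takes a little over half of R while each of the 9 original connections drops to R/20.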


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:

- TCP connection establishment
- data transmission delay
- slow start

Notation, assumptions:

- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:

- First assume fixed congestion window, W segments
- Then dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases (K is the number of windows that cover the object):

1. $WS/R > RTT + S/R$: ACK for the first segment in the window returns before a window's worth of data has been sent.
   Latency = $2\,RTT + O/R$

2. $WS/R < RTT + S/R$: ACK for the first segment in the window returns after a window's worth of data has been sent.
   Latency = $2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$


Fixed congestion window (1)

First case: $WS/R > RTT + S/R$. ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = $2\,RTT + O/R$


Fixed congestion window (2)

Second case: $WS/R < RTT + S/R$. The server must wait for an ACK after sending a window's worth of data.

latency = $2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$
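The two fixed-window cases can be folded into one function. This is a sketch, not the slides' own code: it assumes K = ⌈O/(WS)⌉ (the number of W-segment windows needed to cover the object, which the slides do not spell out for the fixed-window case) and uses made-up parameter values.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an O-bit object with a fixed congestion window of W segments."""
    K = math.ceil(O / (W * S))          # assumption: windows covering the object
    stall = S / R + RTT - W * S / R     # idle time after each window (case 2)
    latency = 2 * RTT + O / R           # connection setup + request + transmission
    if stall > 0:
        # Case 2: the ACK arrives after the window is exhausted,
        # so the server stalls after each of the first K-1 windows.
        latency += (K - 1) * stall
    return latency


# Hypothetical values: 800 kbit object, 8 kbit segments, 1 Mbps link, 100 ms RTT.
print(fixed_window_latency(O=800_000, S=8_000, R=1_000_000, RTT=0.1, W=4))
print(fixed_window_latency(O=800_000, S=8_000, R=1_000_000, RTT=0.1, W=20))
```

With W = 4 the server stalls after each of the first 24 windows; with W = 20 the window is large enough that ACKs return in time and the latency falls back to 2 RTT + O/R.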


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:

- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times.

Example:

- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

[Figure: timing diagram with time at client and time at server. One RTT to initiate the TCP connection, then the object request, followed by the first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R), ending with complete transmission and object delivered.]


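TCP Latency Modeling (3)

$\dfrac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\dfrac{S}{R}$ = time to transmit the kth window

$\left[\dfrac{S}{R} + RTT - 2^{k-1}\dfrac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timing diagram with time at client and time at server. One RTT to initiate the TCP connection, then the object request, followed by the first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R), ending with complete transmission and object delivered.]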
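The derivation can be checked numerically by summing the per-window idle times and comparing against the closed form. The sketch below takes K as an input and finds Q by counting the windows whose idle time S/R + RTT - 2^(k-1)·S/R is positive, which is one reading of "the number of times the server idles if the object were of infinite size". The parameter values (S = 1000 bits, R = 1000 bps, RTT = 2 s) are assumptions chosen so that Q = 2, matching the O/S = 15, K = 4 example.

```python
def slow_start_delay(O, S, R, RTT, K):
    """Return (P, delay from summed idle times, delay from the closed form)."""
    # Q: number of windows k = 1, 2, ... with positive idle time.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    # Idle time after window k, summed over the P windows where the server stalls.
    idle = sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))
    summed = O / R + 2 * RTT + idle
    closed = O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return P, summed, closed


print(slow_start_delay(O=15_000, S=1_000, R=1_000, RTT=2, K=4))
```

For these values the server idles P = 2 times and both expressions give a delay of 22 seconds: 15 s of transmission, 4 s for the two initial round trips, and 3 s of idle time.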


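TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$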

                                                                                                                                                                                                          k

                                                                                                                                                                                                          L

                                                                                                                                                                                                          L

Calculation of Q, the number of idles for an infinite-size object, is similar.
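The closed form K = ceil(log2(O/S + 1)) can be checked numerically. The sketch below is a minimal Python illustration, assuming K is the smallest number of slow-start windows (of 1, 2, 4, ... segments) whose total covers an object of O bits with S-bit segments. The function names, the 536-byte segment size, and the reading of 5 Kbytes as 40,000 bits are illustrative assumptions, not taken from the slides.

```python
import math


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), the closed form of the derivation."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_search(O, S):
    """Smallest k with 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O, by direct summation."""
    k = 0
    covered = 0  # bits covered by the first k windows
    while covered < O:
        covered += (2 ** k) * S  # the next window carries 2^k segments
        k += 1
    return k


# Assumed example values (hypothetical): 5 Kbyte object, 536-byte segments.
O = 5 * 1000 * 8
S = 536 * 8
print(windows_closed_form(O, S), windows_by_search(O, S))
```

Both functions should agree. The search version uses only integer arithmetic, so it is the safer reference when O/S + 1 lands exactly on a power of two.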

HTTP Modeling

Assume a Web page consists of:
    1 base HTML page (of size O bits)
    M images (each of size O bits)

Non-persistent HTTP:
    M+1 TCP connections in series
    Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
    2 RTT to request and receive base HTML file
    1 RTT to request and receive M images
    Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
    Suppose M/X is an integer
    1 TCP connection for base file
    M/X sets of parallel connections for images
    Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
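The three response-time formulas above can be compared numerically. The following Python sketch is one way to do so, with the idle-time sum left as a caller-supplied parameter that defaults to zero (which understates the true response time). The function names are hypothetical, R is assumed to be the link rate in bits per second, and O = 5 Kbytes is taken as 40,000 bits.

```python
def transmission_time(M, O, R):
    """Time to transmit the base page plus M images, each of O bits, at R bits/sec."""
    return (M + 1) * O / R


def non_persistent(M, O, R, rtt, idle=0.0):
    """M+1 TCP connections in series, each costing 2 RTT."""
    return transmission_time(M, O, R) + (M + 1) * 2 * rtt + idle


def persistent(M, O, R, rtt, idle=0.0):
    """2 RTT for the base file plus 1 RTT for the M images."""
    return transmission_time(M, O, R) + 3 * rtt + idle


def parallel_non_persistent(M, X, O, R, rtt, idle=0.0):
    """1 connection for the base file, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return transmission_time(M, O, R) + (M // X + 1) * 2 * rtt + idle


# Parameter values from the response-time plots: RTT = 100 msec, M = 10, X = 5.
M, X, O, RTT = 10, 5, 5 * 1000 * 8, 0.1
for label, R in [("28 Kbps", 28e3), ("100 Kbps", 100e3), ("1 Mbps", 1e6), ("10 Mbps", 10e6)]:
    print(label,
          round(non_persistent(M, O, R, RTT), 3),
          round(persistent(M, O, R, RTT), 3),
          round(parallel_non_persistent(M, X, O, R, RTT), 3))
```

Because the idle-time term is omitted by default, the printed values are lower bounds on the modeled response times rather than the values shown in the plots.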

HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: HTTP response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: HTTP response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.
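As a rough check on this observation, the RTT-dependent terms of the three response-time formulas can be evaluated for M = 10 and X = 5. This is only a sketch: it leaves out the transmission term (M+1)O/R, which is the same in all three formulas, and the idle-time sums, which are not computed here.

$$
\begin{aligned}
\text{non-persistent:} \quad & (M+1) \cdot 2\,RTT = 22\,RTT \\
\text{persistent:} \quad & 3\,RTT \\
\text{parallel non-persistent:} \quad & (M/X + 1) \cdot 2\,RTT = 6\,RTT
\end{aligned}
$$

With RTT = 100 msec these terms are 2.2, 0.3 and 0.6 seconds. With RTT = 1 sec they grow to 22, 3 and 6 seconds, so the gap between persistent and parallel non-persistent HTTP widens from 0.3 to 3 seconds.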

Chapter 3 Summary

principles behind transport layer services:
    multiplexing, demultiplexing
    reliable data transfer
    flow control
    congestion control

instantiation and implementation in the Internet:
    UDP
    TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

                                                                                                                                                                                                          • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                          • Transport services and protocols
                                                                                                                                                                                                          • Transport vs network layer
                                                                                                                                                                                                          • Transport-layer protocols
                                                                                                                                                                                                          • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                                                                                          • How demultiplexing works
                                                                                                                                                                                                          • Connectionless demultiplexing
                                                                                                                                                                                                          • Connectionless demux (cont)
                                                                                                                                                                                                          • Connection-oriented demux
                                                                                                                                                                                                          • Connection-oriented demux (cont)
                                                                                                                                                                                                          • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                          • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                                                          • UDP more
                                                                                                                                                                                                          • UDP checksum
                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                          • Principles of Reliable data transfer
                                                                                                                                                                                                          • Reliable data transfer getting started
                                                                                                                                                                                                          • Reliable data transfer getting started
                                                                                                                                                                                                          • Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                                                                                                                                          • Pipelined protocols
                                                                                                                                                                                                          • Pipelined protocols
                                                                                                                                                                                                          • Pipelining increased utilization
                                                                                                                                                                                                          • Go-Back-N
                                                                                                                                                                                                          • GBN Sender
                                                                                                                                                                                                          • GBN sender extended FSM
                                                                                                                                                                                                          • GBN receiver extended FSM
                                                                                                                                                                                                          • More on receiver
• GBN in action
                                                                                                                                                                                                          • Selective Repeat
                                                                                                                                                                                                          • Selective repeat sender receiver windows
                                                                                                                                                                                                          • Selective repeat
                                                                                                                                                                                                          • Selective repeat in action
                                                                                                                                                                                                          • Selective repeat dilemma
                                                                                                                                                                                                          • Chapter 3 outline
• TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
                                                                                                                                                                                                          • More TCP Details
                                                                                                                                                                                                          • Even More TCP Details
                                                                                                                                                                                                          • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                          • Example RTT estimation
                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                          • TCP reliable data transfer
                                                                                                                                                                                                          • TCP sender events
• TCP sender (simplified)
                                                                                                                                                                                                          • TCP retransmission scenarios
                                                                                                                                                                                                          • TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                                                          • More on Sender Policies
                                                                                                                                                                                                          • Fast Retransmit
                                                                                                                                                                                                          • Fast retransmit algorithm
                                                                                                                                                                                                          • TCP GBN or Selective Repeat
                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                                          • TCP Flow control how it works
                                                                                                                                                                                                          • Technical Issue
                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                          • TCP Connection Management
                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                          • A few special cases
                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                          • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                          • TCP Congestion Control
                                                                                                                                                                                                          • TCP AIMD
                                                                                                                                                                                                          • TCP Slow Start
                                                                                                                                                                                                          • TCP Slow Start (more)
                                                                                                                                                                                                          • Summary TCP Congestion Control
                                                                                                                                                                                                          • The Big Picture
                                                                                                                                                                                                          • TCP sender congestion control
                                                                                                                                                                                                          • TCP throughput
                                                                                                                                                                                                          • TCP Futures
                                                                                                                                                                                                          • TCP Fairness
                                                                                                                                                                                                          • Why is TCP fair
                                                                                                                                                                                                          • Fairness (more)
                                                                                                                                                                                                          • TCP Latency Modeling
                                                                                                                                                                                                          • Fixed Congestion Window (W)
                                                                                                                                                                                                          • Fixed congestion window (1)
                                                                                                                                                                                                          • Fixed congestion window (2)
                                                                                                                                                                                                          • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                          • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                          • TCP Latency Modeling (3)
                                                                                                                                                                                                          • TCP Latency Modeling (4)
                                                                                                                                                                                                          • HTTP Modeling
                                                                                                                                                                                                          • Chapter 3 Summary

Chapter 3 outline

3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    • segment structure
    • reliable data transfer
    • flow control
    • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: a congestion window of size CongWin]

w segments, each with MSS bytes, sent in one RTT:

throughput = (w × MSS) / RTT bytes/sec
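As a quick illustration of the throughput formula above, the following sketch evaluates it for one set of values. The function name and the sample numbers (10 segments, 1460-byte MSS, 100 ms RTT) are assumptions chosen for the example and do not come from the slides.

```python
def window_throughput(w_segments: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Bytes per second when w segments of MSS bytes each are sent once per RTT."""
    return w_segments * mss_bytes / rtt_seconds


# Hypothetical values: 10 segments of 1460 bytes, RTT of 100 ms.
rate = window_throughput(10, 1460, 0.1)
print(f"{rate:.0f} bytes/sec")
```

Doubling the window or halving the RTT doubles the rate, which is why the sender controls its rate by adjusting the window.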

To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: the sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin
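The window constraint above can be sketched as a check the sender runs before transmitting another segment. The function name and the `segment_len` parameter are hypothetical. As on the slide, the receive buffer is assumed large enough to be ignored.

```python
def can_send(last_byte_sent: int, last_byte_acked: int,
             cong_win: int, segment_len: int) -> bool:
    """True if sending segment_len more bytes keeps unacked data within CongWin."""
    in_flight = last_byte_sent - last_byte_acked   # bytes sent but not yet ACKed
    return in_flight + segment_len <= cong_win


# 2000 bytes are in flight; a 4000-byte window leaves room for 1000 more.
print(can_send(3000, 1000, cong_win=4000, segment_len=1000))
# A 2500-byte window does not.
print(can_send(3000, 1000, cong_win=2500, segment_len=1000))
```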

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative after timeout events

TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis marks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
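The sawtooth that AIMD produces can be reproduced with a small simulation. This is a minimal sketch: the window is counted in whole MSS units, and the starting window and the rounds in which losses occur are made-up inputs, not values from the slides.

```python
def aimd_trace(start_mss: int, loss_rounds: set[int], rounds: int) -> list[int]:
    """Window size (in MSS) at the start of each RTT under AIMD."""
    cong_win = start_mss
    trace = []
    for rtt in range(rounds):
        trace.append(cong_win)
        if rtt in loss_rounds:
            cong_win = max(1, cong_win // 2)   # multiplicative decrease
        else:
            cong_win += 1                      # additive increase: +1 MSS per RTT
    return trace


# Hypothetical schedule: losses detected during rounds 4 and 9.
print(aimd_trace(8, {4, 9}, 12))
# [8, 9, 10, 11, 12, 6, 7, 8, 9, 10, 5, 6]
```

The window climbs by one MSS per round and is halved after each loss, which gives the repeating ramp-and-drop shape.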

                                                                                                                                                                                                            TCP Slow Start

• When connection begins, CongWin = 1 MSS
    • Example: MSS = 500 bytes and RTT = 200 msec, so initial rate = 20 kbps
• Available bandwidth may be >> MSS/RTT
    • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
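The slide's example numbers can be checked directly, and the same arithmetic shows why exponential growth is attractive. In the sketch below the 10 Mbps target rate is an assumption added for illustration, and the calculation assumes no loss occurs while the rate doubles.

```python
def rtts_to_reach(target_bps: float, mss_bytes: int, rtt_s: float) -> int:
    """Number of doublings (one per RTT) before the rate reaches target_bps."""
    rate = mss_bytes * 8 / rtt_s       # initial rate with CongWin = 1 MSS
    rounds = 0
    while rate < target_bps:
        rate *= 2                      # slow start doubles the rate each RTT
        rounds += 1
    return rounds


initial_bps = 500 * 8 / 0.2            # MSS = 500 bytes, RTT = 200 msec
print(initial_bps / 1000, "kbps")      # about 20 kbps, as on the slide
print(rtts_to_reach(10_000_000, 500, 0.2), "RTTs to reach 10 Mbps")
```

Growing by one MSS per RTT instead would take several hundred RTTs to cover the same gap.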

                                                                                                                                                                                                            TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
    • double CongWin every RTT
    • done by incrementing CongWin by 1 MSS for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments in successive RTTs]
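The link between "increment per ACK" and "double per RTT" can be seen in a short simulation. This is a simplified sketch: it assumes every segment sent in a round is ACKed individually within that round, with no loss and no delayed ACKs. The function name is hypothetical.

```python
def slow_start_round(cong_win_mss: int) -> int:
    """Simulate one RTT of slow start; returns the window (in MSS) afterwards."""
    acks = cong_win_mss              # one ACK arrives per segment sent this round
    for _ in range(acks):
        cong_win_mss += 1            # each ACK grows the window by 1 MSS
    return cong_win_mss


win = 1
sizes = [win]
for _ in range(3):
    win = slow_start_round(win)
    sizes.append(win)
print(sizes)   # [1, 2, 4, 8]
```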

So far
• Slow start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable: threshold
• threshold initially very large
• Slow-start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin, the halved value (now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to slow start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
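The four rules above can be compared for Reno and Tahoe with a per-RTT simulation. This is a sketch under stated assumptions: the window is counted in whole MSS units, the initial threshold of 8 and the loss schedule are invented inputs, and slow-start growth is capped at the threshold as a simplification.

```python
def window_trace(variant: str, events: dict[int, str], rounds: int,
                 threshold: int = 8) -> list[int]:
    """CongWin (in MSS) at the start of each RTT.

    variant is "reno" or "tahoe"; events maps a round number to "dupack"
    or "timeout" (a hypothetical loss schedule supplied by the caller).
    """
    cong_win = 1
    trace = []
    for rtt in range(rounds):
        trace.append(cong_win)
        loss = events.get(rtt)
        if loss is not None:
            threshold = max(cong_win // 2, 1)
            if loss == "dupack" and variant == "reno":
                cong_win = threshold                   # fast recovery: skip slow start
            else:
                cong_win = 1                           # timeout, or any loss in Tahoe
        elif cong_win < threshold:
            cong_win = min(cong_win * 2, threshold)    # slow start
        else:
            cong_win += 1                              # congestion avoidance
    return trace


loss_schedule = {8: "dupack"}
print(window_trace("reno", loss_schedule, 12))
# [1, 2, 4, 8, 9, 10, 11, 12, 13, 6, 7, 8]
print(window_trace("tahoe", loss_schedule, 12))
# [1, 2, 4, 8, 9, 10, 11, 12, 13, 1, 2, 4]
```

After the same triple-duplicate-ACK loss, Reno resumes linear growth from the halved window while Tahoe restarts from 1 MSS.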

                                                                                                                                                                                                            The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS × (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
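The event table above maps naturally onto a small event-driven object. The class below is an illustrative sketch, not a real TCP implementation: the class and method names, the 1000-byte MSS, and the initial threshold of 64 MSS are all assumptions, and resetting the duplicate-ACK counter when new data is ACKed is a detail the table does not spell out.

```python
from dataclasses import dataclass

MSS = 1000  # bytes; hypothetical segment size for this sketch


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64 * MSS   # "initially very large" (assumed value)
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0                                  # assumption: counter resets
        if self.state == "SS":
            self.cong_win += MSS                           # doubles CongWin per RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)   # about +1 MSS per RTT

    def on_dup_ack(self) -> None:
        """Duplicate ACK; the third one counts as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)       # never below 1 MSS
            self.state = "CA"

    def on_timeout(self) -> None:
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4 * MSS)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)   # window has passed the threshold, so state is CA
```

Keeping one method per event type mirrors the table's rows, so each rule can be read and checked on its own.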

                                                                                                                                                                                                            TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
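The 0.75 factor follows from the window rising linearly from W/2 to W between losses, so its mean is the midpoint of the two. A small numeric check, assuming an even W and one window sample per RTT (the function name is hypothetical):

```python
def average_window(w_max: int) -> float:
    """Mean window over one sawtooth cycle rising 1 per RTT from w_max/2 to w_max."""
    samples = range(w_max // 2, w_max + 1)
    return sum(samples) / len(samples)


W = 100
print(average_window(W) / W)   # 0.75
```

Dividing the mean window by RTT gives the average throughput of 0.75 W/RTT.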

                                                                                                                                                                                                            TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

➜ L = 2·10^-10 Wow!
New versions of TCP for high-speed needed
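The two numbers in this example can be reproduced from the formulas on the slide. The following is a rough sketch: the function names are hypothetical, and the only inputs are the values stated in the example (1500-byte segments, 100 ms RTT, 10 Gbps).

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput (W = rate * RTT / MSS)."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def tolerable_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.100, 1500)
loss = tolerable_loss_rate(10e9, 0.100, 1500)
print(f"W = {w:.0f} segments, L = {loss:.1e}")
```

Because L scales with the inverse square of the target throughput, a tenfold faster link tolerates a hundredfold smaller loss rate.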


TCP Fairness
Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?
Two competing sessions:

Additive increase gives slope of 1, as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal-share line.]
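The convergence argument can be illustrated with a small simulation. This is a simplified sketch, not a model of real TCP: it assumes both sessions see the same RTT, increase by one unit per round, and both halve whenever their combined rate exceeds the capacity. The function name and starting rates are made up for the illustration.

```python
def simulate_aimd(rate1, rate2, capacity, rounds):
    """Run synchronized AIMD for two sessions and return the rate history."""
    history = [(rate1, rate2)]
    for _ in range(rounds):
        if rate1 + rate2 > capacity:
            # Loss: multiplicative decrease halves both rates, and so halves their gap.
            rate1, rate2 = rate1 / 2, rate2 / 2
        else:
            # Congestion avoidance: additive increase leaves the gap unchanged.
            rate1, rate2 = rate1 + 1, rate2 + 1
        history.append((rate1, rate2))
    return history


final1, final2 = simulate_aimd(10.0, 80.0, capacity=100.0, rounds=1000)[-1]
print(f"gap after 1000 rounds: {abs(final1 - final2):.2e}")
```

Additive increase never changes the difference between the two rates and every loss halves it, so the operating point drifts toward the equal-share line.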


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
do not want rate throttled by congestion control
Instead use UDP:
pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
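The arithmetic behind the parallel-connection example is a per-connection split of the link. A minimal sketch, assuming every connection gets an equal share of R (the idealized fairness goal); `new_app_share` is a hypothetical helper name.

```python
from fractions import Fraction


def new_app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(new_app_share(9, 1))    # one extra connection among ten
print(new_app_share(9, 11))   # eleven extra connections among twenty
```

With 11 connections the new app holds 11/20 of the link, slightly more than the R/2 quoted on the slide.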


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start


Fixed Congestion Window (W)
Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
where K is the number of windows that cover the object
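Both cases can be folded into one small function. This is a sketch under stated assumptions: K is taken to be the object size divided by the window size W·S, rounded up (the slide only says K is the number of windows covering the object), and the function name and sample numbers are hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    K = math.ceil(O / (W * S))              # assumption: windows needed to cover the object
    stall = S / R + RTT - W * S / R         # idle time per window while waiting for an ACK
    # Case 1 (stall <= 0): the ACK arrives before the window is exhausted, no idling.
    # Case 2 (stall > 0): the server idles after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * max(0.0, stall)


# Hypothetical link: R = 1 Mbps, S = 1000 bits, RTT = 10 ms, O = 100,000 bits.
print(fixed_window_latency(100_000, 1000, 1e6, 0.010, W=20))  # case 1
print(fixed_window_latency(100_000, 1000, 1e6, 0.010, W=4))   # case 2
```

Clamping the stall at zero is what lets a single expression cover both cases.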


                                                                                                                                                                                                            Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                                                                                                                                                            Fixed congestion window (2)

Second case:
WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes. After one RTT to initiate the TCP connection, the client requests the object; the server sends the first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R) until transmission is complete and the object is delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
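The values in this example can be recomputed directly from the definitions of K, Q and P. The sketch below makes one assumption the slide does not state: RTT equals twice the segment transmission time S/R, which is one choice that yields Q = 2. Function names are hypothetical.

```python
def windows_to_cover(num_segments):
    """K: number of slow-start windows (1, 2, 4, ... segments) needed to cover the object."""
    k, covered = 0, 0
    while covered < num_segments:
        covered += 2 ** k
        k += 1
    return k


def idles_for_infinite_object(rtt_over_segment_time):
    """Q: windows after which the server would idle if the object never ended.

    The server idles after window k while S/R + RTT > 2^(k-1) * S/R,
    i.e. while 1 + RTT/(S/R) > 2^(k-1).
    """
    q = 0
    while 1 + rtt_over_segment_time > 2 ** q:
        q += 1
    return q


K = windows_to_cover(15)               # O/S = 15 segments
Q = idles_for_infinite_object(2.0)     # assumed: RTT = 2 * S/R
P = min(K - 1, Q)
print(K, Q, P)
```

P is capped at K-1 because the server has nothing left to wait for after the final window.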


                                                                                                                                                                                                            TCP Latency Modeling (3)

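$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: slow-start timing diagram with "time at client" and "time at server" axes: initiate TCP connection (one RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]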
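The closed form can be checked against a direct sum of the per-window idle times. This is a minimal sketch with hypothetical parameter values (R = 1 Mbps, S = 1000 bits, RTT = 2 ms, O = 15,000 bits); the function name is illustrative.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (summed, closed_form, P) latency for an O-bit object under slow start.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    seg_time = S / R

    # K: windows of 1, 2, 4, ... segments needed to cover the object.
    K, covered = 0, 0
    while covered * S < O:
        covered += 2 ** K
        K += 1

    # Idle time after window k, clamped at zero, for the first K-1 windows.
    idle_times = [max(0.0, seg_time + RTT - 2 ** (k - 1) * seg_time)
                  for k in range(1, K)]
    P = sum(1 for t in idle_times if t > 0)     # number of times the server idles

    summed = 2 * RTT + O / R + sum(idle_times)
    closed_form = 2 * RTT + O / R + P * (RTT + seg_time) - (2 ** P - 1) * seg_time
    return summed, closed_form, P


summed, closed_form, P = slow_start_latency(15_000, 1000, 1e6, 0.002)
print(f"P = {P}, summed = {summed * 1e3:.1f} ms, closed form = {closed_form * 1e3:.1f} ms")
```

The two results agree because the idle times shrink as k grows, so the positive ones are exactly the first P terms of the sum.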


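TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \lceil \log_2(O/S + 1) \rceil
\end{aligned}
$$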

Calculation of Q, the number of idles for an infinite-size object, is similar.
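As a quick sanity check on the closed form for K, the sketch below compares it with a direct count of slow-start windows (1, 2, 4, ... segments) until the object is covered. The function names and the example sizes are illustrative assumptions, not part of the slides; O and S only need to be in the same units.

```python
import math


def windows_closed_form(O: float, S: float) -> int:
    """K = ceil(log2(O/S + 1)), the closed form from the derivation."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_counting(O: float, S: float) -> int:
    """Count windows of 2^0, 2^1, ... segments until at least O is covered."""
    k = 0          # number of windows sent so far
    covered = 0    # amount of the object covered so far
    while covered < O:
        covered += (2 ** k) * S   # window k+1 carries 2^k segments of size S
        k += 1
    return k


if __name__ == "__main__":
    # Hypothetical sizes: segments of 1000 units, objects of 15 and 16 segments.
    for O in (15_000, 16_000):
        print(O, windows_closed_form(O, 1000), windows_by_counting(O, 1000))
```

An object of exactly 15 segments fits in four windows (1 + 2 + 4 + 8), while one more segment forces a fifth window, which is the ceiling in the formula at work.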


HTTP Modeling

Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
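The three response-time formulas can be compared side by side with a small sketch. The function name, the mode strings and the `idle` parameter are assumptions made for illustration; the sum of idle times is passed in as a number (default 0) rather than computed, so the result is only the transmission and round-trip part of the model unless a value is supplied.

```python
def http_response_time(mode: str, O: float, R: float, RTT: float,
                       M: int, X: int = 1, idle: float = 0.0) -> float:
    """Response time in seconds for the three HTTP models on the slide.

    O: object size in bits, R: link rate in bits/sec, RTT: seconds,
    M: number of images, X: parallel connections, idle: sum of idle times.
    """
    transmit = (M + 1) * O / R            # (M+1)O/R, common to all three models
    if mode == "non-persistent":
        round_trips = (M + 1) * 2 * RTT   # M+1 connections in series
    elif mode == "persistent":
        round_trips = 3 * RTT             # 2 RTT for base file + 1 RTT for images
    elif mode == "parallel":
        if M % X != 0:
            raise ValueError("the model assumes M/X is an integer")
        round_trips = (M // X + 1) * 2 * RTT
    else:
        raise ValueError(f"unknown mode: {mode}")
    return transmit + round_trips + idle


if __name__ == "__main__":
    # Illustrative values: O = 40,000 bits, R = 100 kbps, RTT = 0.1 s, M = 10, X = 5.
    for mode in ("non-persistent", "persistent", "parallel"):
        print(mode, http_response_time(mode, 40_000, 100_000, 0.1, M=10, X=5))
```

With these values the transmission term (M+1)O/R is 4.4 seconds in every case, so the models differ only in their round-trip terms: 2.2, 0.3 and 0.6 seconds respectively.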


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay × bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                                            • A few special cases
                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                            • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                                                                                                                            • Approaches towards congestion control
                                                                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                            • TCP Congestion Control
                                                                                                                                                                                                            • TCP AIMD
                                                                                                                                                                                                            • TCP Slow Start
                                                                                                                                                                                                            • TCP Slow Start (more)
                                                                                                                                                                                                            • Summary TCP Congestion Control
                                                                                                                                                                                                            • The Big Picture
                                                                                                                                                                                                            • TCP sender congestion control
                                                                                                                                                                                                            • TCP throughput
                                                                                                                                                                                                            • TCP Futures
                                                                                                                                                                                                            • TCP Fairness
                                                                                                                                                                                                            • Why is TCP fair
                                                                                                                                                                                                            • Fairness (more)
                                                                                                                                                                                                            • TCP Latency Modeling
                                                                                                                                                                                                            • Fixed Congestion Window (W)
                                                                                                                                                                                                            • Fixed congestion window (1)
                                                                                                                                                                                                            • Fixed congestion window (2)
                                                                                                                                                                                                            • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                            • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                            • TCP Latency Modeling (3)
                                                                                                                                                                                                            • TCP Latency Modeling (4)
                                                                                                                                                                                                            • HTTP Modeling
                                                                                                                                                                                                            • Chapter 3 Summary


TCP Congestion Control

• end-end control (no network assistance)
• transmission rate limited by congestion window size, CongWin, over segments
• CongWin dynamically modified to reflect perceived congestion

[Figure: congestion window, CongWin]

w segments, each with MSS bytes, sent in one RTT:

$\text{throughput} = \dfrac{w \cdot \text{MSS}}{\text{RTT}}$ bytes/sec


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 and then grows exponentially
• conservative after timeout events
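The sender-side limit above can be sketched in a few lines. This is only an illustration: the function name `usable_window` and the idea of returning the remaining byte budget are assumptions made here, not part of the slides.

```python
def usable_window(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """Bytes the sender may still put in flight without violating
    LastByteSent - LastByteAcked <= CongWin (hypothetical helper)."""
    in_flight = last_byte_sent - last_byte_acked  # unacknowledged bytes
    return max(0, cong_win - in_flight)


if __name__ == "__main__":
    # 2000 bytes are unacknowledged; a 4000-byte CongWin leaves room for more.
    print(usable_window(last_byte_sent=5000, last_byte_acked=3000, cong_win=4000))
```

When the result is 0, the sender has to wait for ACKs (or for CongWin to grow) before transmitting new data.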


TCP AIMD

• multiplicative decrease: cut CongWin in half after loss event
• additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

[Figure: congestion window (axis ticks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
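A minimal sketch of the AIMD rule, assuming the window is updated once per RTT and that the rounds in which a loss is detected are given up front. The function name `aimd` and its parameters are illustrative only.

```python
def aimd(cong_win: int, mss: int, loss_rounds: set, rounds: int) -> list:
    """Return CongWin (in bytes) at the start of each RTT round."""
    history = []
    for r in range(rounds):
        history.append(cong_win)
        if r in loss_rounds:
            # multiplicative decrease: halve, but never below 1 MSS
            cong_win = max(mss, cong_win // 2)
        else:
            # additive increase: one MSS per RTT
            cong_win += mss
    return history


if __name__ == "__main__":
    # Loss detected in round 3; the window is halved for round 4.
    print(aimd(cong_win=8000, mss=1000, loss_rounds={3}, rounds=6))
```

Plotting the returned list over a longer run with periodic losses gives the rising-then-halving shape of the long-lived connection in the figure.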


TCP Slow Start

• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes and RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
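The initial-rate figure in the example follows from sending one MSS per RTT. A small check, with the helper name `initial_rate_bps` chosen here for illustration:

```python
def initial_rate_bps(mss_bytes: int, rtt_seconds: float) -> float:
    """Rate in bits per second when one MSS is sent per RTT (CongWin = 1 MSS)."""
    return mss_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    rate = initial_rate_bps(mss_bytes=500, rtt_seconds=0.2)
    print(f"{rate / 1000:.0f} kbps")
```

With MSS = 500 bytes and RTT = 200 msec this is 4000 bits per 0.2 s, which matches the 20 kbps on the slide.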


TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; A sends one segment, then two segments, then four segments in successive RTTs]
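The link between "increment CongWin for every ACK" and "double CongWin every RTT" can be seen in a short simulation. It assumes no loss and one ACK per segment (no delayed ACKs); the function name `slow_start_rounds` is hypothetical.

```python
def slow_start_rounds(mss: int, rounds: int) -> list:
    """Return the number of segments sent in each RTT during slow start."""
    cong_win = mss  # connection begins with CongWin = 1 MSS
    segments_per_round = []
    for _ in range(rounds):
        segments = cong_win // mss
        segments_per_round.append(segments)
        # Assumption: every segment is ACKed within the RTT, and each ACK
        # increases CongWin by one MSS.
        for _ in range(segments):
            cong_win += mss
    return segments_per_round


if __name__ == "__main__":
    print(slow_start_rounds(mss=500, rounds=4))
```

Each round receives as many ACKs as segments were sent, so the per-ACK increment adds up to a doubling per RTT: one, two, four segments, as in the Host A / Host B figure.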


So far:
• Slow-Start ramps up exponentially
• Followed by AIMD (sawtooth pattern)

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
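The difference between Reno and Tahoe in the last two rules can be sketched as one function. The name `loss_reaction`, the string labels for events and variants, and the use of integer byte counts are assumptions for illustration.

```python
def loss_reaction(cong_win: int, mss: int, event: str, variant: str = "reno"):
    """Return (threshold, cong_win) after a loss event.

    event is "3dupack" or "timeout"; variant is "reno" or "tahoe".
    """
    threshold = cong_win // 2
    if event == "3dupack" and variant == "reno":
        # Reno: skip slow start and continue from the halved window,
        # never going below 1 MSS.
        return threshold, max(threshold, mss)
    # Timeout in either variant, or any loss event in Tahoe: back to 1 MSS.
    return threshold, mss


if __name__ == "__main__":
    for variant in ("reno", "tahoe"):
        for event in ("3dupack", "timeout"):
            print(variant, event, loss_reaction(16000, 1000, event, variant))
```

Only the Reno / triple-duplicate-ACK case keeps the window at the new Threshold; the other three cases restart from 1 MSS.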


                                                                                                                                                                                                              The Big Picture


TCP sender congestion control

| Event | State | TCP Sender Action | Commentary |
|---|---|---|---|
| ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT |
| ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS × (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start |
| Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
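The event table can be read as a small state machine. The sketch below is one possible encoding, not a real TCP implementation: the class name `RenoSender`, its method names, and the choice to raise the triple-duplicate-ACK loss event on the third duplicate ACK are assumptions made here.

```python
class RenoSender:
    """Toy model of the sender states in the table (SS = slow start,
    CA = congestion avoidance). All sizes are in bytes."""

    def __init__(self, mss: int, threshold: float):
        self.mss = mss
        self.cong_win = float(mss)
        self.threshold = float(threshold)
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * (self.mss / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)  # not below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = float(self.mss)
        self.state = "SS"
        self.dup_acks = 0


if __name__ == "__main__":
    s = RenoSender(mss=1000, threshold=4000)
    for _ in range(4):
        s.on_new_ack()
    print(s.state, s.cong_win)
```

In congestion avoidance, a window of CongWin/MSS segments produces that many ACKs per RTT, each adding MSS × (MSS/CongWin), which is where the "1 MSS every RTT" in the commentary column comes from.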


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2·RTT)
• Average throughput: 0.75 W/RTT
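The 0.75 factor is the mean of a window that climbs linearly from W/2 to W between losses. A quick numerical check, assuming the window grows by one segment per RTT and W is even (the function name `average_window` is illustrative):

```python
def average_window(w_max_segments: int) -> float:
    """Mean window size, in segments, over one sawtooth cycle that
    rises from W/2 to W by one segment per RTT."""
    half = w_max_segments // 2
    cycle = range(half, w_max_segments + 1)
    return sum(cycle) / len(cycle)


if __name__ == "__main__":
    w = 100
    print(average_window(w) / w)
```

Dividing the mean window by RTT gives the average throughput of 0.75 W/RTT stated above.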


                                                                                                                                                                                                              TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This requires L = 2·10^-10. Wow!
New versions of TCP for high-speed networks are needed.
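The two numbers in this example can be checked directly. The sketch below assumes W = throughput · RTT / MSS and inverts the loss-rate relation throughput = 1.22 · MSS / (RTT · sqrt(L)); the function and constant names are illustrative.

```python
MSS_BITS = 1500 * 8      # 1500-byte segments
RTT_S = 0.100            # 100 ms
TARGET_BPS = 10e9        # 10 Gbps


def required_window(throughput_bps, rtt_s, mss_bits):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / mss_bits


def required_loss_rate(throughput_bps, rtt_s, mss_bits):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


window = required_window(TARGET_BPS, RTT_S, MSS_BITS)
loss = required_loss_rate(TARGET_BPS, RTT_S, MSS_BITS)
print(f"W = {window:.0f} segments, L = {loss:.2e}")
```

The loss rate comes out near 2·10^-10, roughly one lost segment in five billion, which is why the slide calls for new TCP versions at these speeds.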


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 pass through a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease reduces throughput proportionally.

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
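The convergence argument can be illustrated with a toy simulation. This is a sketch under simplifying assumptions that are not in the slides: both sessions have the same RTT, both increase by one unit per round, and both detect loss in the same round whenever their combined rate exceeds the link capacity. Additive increase leaves the gap between the two rates unchanged; each multiplicative decrease halves it.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Return the rate trajectory of two AIMD flows sharing one link."""
    history = [(x1, x2)]
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # loss: both flows cut their rate by a factor of 2
            x1, x2 = x1 / 2, x2 / 2
        else:
            # congestion avoidance: both flows add one unit
            x1, x2 = x1 + 1, x2 + 1
        history.append((x1, x2))
    return history


trajectory = aimd_two_flows(10.0, 70.0, capacity=100.0, rounds=200)
final_x1, final_x2 = trajectory[-1]
print(f"start gap: 60.000, final gap: {abs(final_x1 - final_x2):.3f}")
```

Starting far from the equal-share line, the gap shrinks toward zero, so the two rates approach each other.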


Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
- Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
- Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts.
- Web browsers do this.
- Example: link of rate R supporting 9 connections:
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
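The arithmetic behind the parallel-connection example, as a short sketch. It assumes ideal per-connection fairness, so every connection on the link gets the same share of R; the function name is illustrative.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R an app gets by opening new_connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```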


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K = ⌈O/(WS)⌉ is the number of windows that cover the object


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: the server waits for an ACK after sending a window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
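Both fixed-window cases fit in one function: the per-window stall S/R + RTT - WS/R is added only when it is positive. This is a sketch under the slide's assumptions (one link of rate R, no loss); it takes K = ceil(O / (W·S)) as the number of windows covering the object, and the parameter values are illustrative.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (s) to fetch an object of O bits with a fixed window of W segments.

    S is the segment size in bits, R the link rate in bits/s.
    """
    K = math.ceil(O / (W * S))            # windows needed to cover the object
    stall = S / R + RTT - W * S / R       # idle time per window (case 2)
    latency = 2 * RTT + O / R             # handshake + request + transmission
    if stall > 0:                         # case 2: ACK returns after window is sent
        latency += (K - 1) * stall
    return latency


# Illustrative values: 80,000-bit object, 8,000-bit segments, 1 Mbps, 100 ms RTT
print(fixed_window_latency(80_000, 8_000, 1e6, 0.1, W=4))    # case 2: server stalls
print(fixed_window_latency(80_000, 8_000, 1e6, 0.1, W=20))   # case 1: no stalls
```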


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

We will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes: initiate TCP connection, request object (each one RTT), first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times
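The example's values can be reproduced by counting. In the sketch below, K is found by doubling the window until the object is covered, and Q by counting windows after which the idle time S/R + RTT - 2^(k-1)·S/R is strictly positive. The strict inequality is an assumption made here, and S, R and RTT are illustrative values chosen so that Q = 2; the slide only states O/S = 15 and Q = 2.

```python
def windows_to_cover(segments):
    """Smallest K with 1 + 2 + ... + 2**(K-1) >= segments."""
    k, sent = 0, 0
    while sent < segments:
        sent += 2 ** k      # the (k+1)-th window carries 2**k segments
        k += 1
    return k


def idle_count_infinite_object(S, R, RTT):
    """Q: windows after which the server would idle, for an endless object."""
    q = 0
    while S / R + RTT - (2 ** q) * S / R > 0:
        q += 1
    return q


S, R, RTT = 8_000, 100_000, 0.1     # assumed: S/R = 80 ms, RTT = 100 ms
K = windows_to_cover(15)            # O/S = 15 segments
Q = idle_count_infinite_object(S, R, RTT)
P = min(K - 1, Q)
print(K, Q, P)
```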


                                                                                                                                                                                                              TCP Latency Modeling (3)

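$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$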

[Figure: slow-start timing diagram with "time at client" and "time at server" axes: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]
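The closed form can be checked numerically against the direct sum of idle times. This sketch uses illustrative values (S/R = 80 ms, RTT = 100 ms, O = 15 segments), for which K = 4 and P = 2; the function names are hypothetical.

```python
def slow_start_latency_sum(O, S, R, RTT, K):
    """Latency as 2 RTT + O/R plus the idle time after windows 1 .. K-1."""
    idle = sum(
        max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)   # [x]^+ : never negative
        for k in range(1, K)
    )
    return 2 * RTT + O / R + idle


def slow_start_latency_closed(O, S, R, RTT, P):
    """Closed form: 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


S, R, RTT = 8_000, 100_000, 0.1
O = 15 * S
by_sum = slow_start_latency_sum(O, S, R, RTT, K=4)
by_formula = slow_start_latency_closed(O, S, R, RTT, P=2)
print(by_sum, by_formula)   # the two should agree up to rounding
```

The two agree because only the first P windows contribute a positive idle time; the sum of 2^(k-1) for k = 1..P is 2^P - 1, which gives the last term of the closed form.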


TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

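\[
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
\]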

Calculation of Q, the number of idle periods for an infinite-size object, is similar.
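As a quick check of the result for K, the following Python sketch finds K in two ways: by adding up window sizes (1, 2, 4, ... segments) until the object is covered, and with the closed form K = ceil(log2(O/S + 1)). The function names and the example values (a 536-byte segment, 1 Kbyte taken as 1000 bytes) are assumptions made for illustration; they do not come from the slides.

```python
import math


def windows_to_cover(object_bits: int, segment_bits: int) -> int:
    """Smallest k such that S*(2^0 + 2^1 + ... + 2^(k-1)) >= O, by direct search."""
    k = 0
    covered_bits = 0
    while covered_bits < object_bits:
        covered_bits += (2 ** k) * segment_bits  # window number k+1 carries 2^k segments
        k += 1
    return k


def windows_closed_form(object_bits: int, segment_bits: int) -> int:
    """Closed form K = ceil(log2(O/S + 1)); uses floating point, so treat as a sketch."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


# Hypothetical example values: O = 5 Kbytes = 40,000 bits, S = 536 bytes = 4,288 bits.
O = 40_000
S = 4_288
print(windows_to_cover(O, S), windows_closed_form(O, S))  # prints: 4 4
```

With these values O/S is about 9.3, so three windows (1 + 2 + 4 = 7 segments) are not enough and a fourth window is needed.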

HTTP Modeling

Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive the base HTML file
  1 RTT to request and receive the M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for the base file
  M/X sets of parallel connections for the images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
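The three response-time expressions can be written down directly as small Python functions, which makes it easy to plug in numbers. This is only a sketch: the function names are made up for illustration, and the "sum of idle times" term is passed in as a parameter defaulting to 0, because computing it needs the slow-start latency model from the earlier slides. The example values assume 1 Kbyte = 1000 bytes.

```python
def nonpersistent(M: int, O: float, R: float, rtt: float, idle: float = 0.0) -> float:
    """(M+1)O/R + (M+1)*2*RTT + idle: M+1 TCP connections in series."""
    return (M + 1) * O / R + (M + 1) * 2 * rtt + idle


def persistent(M: int, O: float, R: float, rtt: float, idle: float = 0.0) -> float:
    """(M+1)O/R + 3*RTT + idle: 2 RTT for the base file, 1 RTT for the M images."""
    return (M + 1) * O / R + 3 * rtt + idle


def parallel_nonpersistent(M: int, X: int, O: float, R: float, rtt: float,
                           idle: float = 0.0) -> float:
    """(M+1)O/R + (M/X + 1)*2*RTT + idle, assuming M/X is an integer."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * rtt + idle


# Example: O = 5 Kbytes = 40,000 bits, M = 10, X = 5, RTT = 100 msec, R = 28 Kbps.
# Idle times are left at 0 here, so these are lower bounds on the modeled values.
O_bits, M, X, rtt, R = 40_000, 10, 5, 0.1, 28_000
print(round(nonpersistent(M, O_bits, R, rtt), 3))
print(round(persistent(M, O_bits, R, rtt), 3))
print(round(parallel_nonpersistent(M, X, O_bits, R, rtt), 3))
```

All three share the transmission term (M+1)O/R; they differ only in how many round trips are paid for connection setup and requests.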

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP connection establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
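To see why persistent connections matter little at RTT = 100 msec on a slow link but a great deal at RTT = 1 sec on a fast one, the sketch below computes the fraction of the parallel non-persistent response time that persistent HTTP saves. It uses the response-time formulas from the HTTP Modeling slide with the idle-time term set to 0, which is a simplifying assumption, so the numbers will not match the plotted values exactly. The function name is hypothetical, and 1 Kbyte is taken as 1000 bytes.

```python
def persistent_saving(M: int, X: int, O: float, R: float, rtt: float) -> float:
    """Fraction of the parallel non-persistent response time saved by persistent HTTP.

    Idle (slow-start stall) times are ignored, which is an assumption of this sketch.
    """
    transmit = (M + 1) * O / R                    # time to push all M+1 objects onto the link
    parallel = transmit + (M / X + 1) * 2 * rtt   # non-persistent with X parallel connections
    persistent = transmit + 3 * rtt               # one persistent connection
    return (parallel - persistent) / parallel


O_bits, M, X = 40_000, 10, 5  # O = 5 Kbytes, M = 10 images, X = 5 parallel connections
for rtt in (0.1, 1.0):
    for R in (28e3, 100e3, 1e6, 10e6):
        saving = persistent_saving(M, X, O_bits, R, rtt)
        print(f"RTT={rtt:>4} s  R={R / 1e3:>7.0f} Kbps  saving={saving:6.1%}")
```

Under these assumptions the saving is under 2% at 28 Kbps with RTT = 100 msec, and close to 50% at 10 Mbps with RTT = 1 sec, where the round-trip terms make up almost all of the response time.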

Chapter 3 Summary

Principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

Instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                              • A few special cases
                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                              • Principles of Congestion Control
                                                                                                                                                                                                              • Causescosts of congestion scenario 1
                                                                                                                                                                                                              • Causescosts of congestion scenario 2
                                                                                                                                                                                                              • Causescosts of congestion scenario 3
                                                                                                                                                                                                              • Causescosts of congestion scenario 3
                                                                                                                                                                                                              • Approaches towards congestion control
                                                                                                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                              • TCP Congestion Control
                                                                                                                                                                                                              • TCP AIMD
                                                                                                                                                                                                              • TCP Slow Start
                                                                                                                                                                                                              • TCP Slow Start (more)
                                                                                                                                                                                                              • Summary TCP Congestion Control
                                                                                                                                                                                                              • The Big Picture
                                                                                                                                                                                                              • TCP sender congestion control
                                                                                                                                                                                                              • TCP throughput
                                                                                                                                                                                                              • TCP Futures
                                                                                                                                                                                                              • TCP Fairness
                                                                                                                                                                                                              • Why is TCP fair
                                                                                                                                                                                                              • Fairness (more)
                                                                                                                                                                                                              • TCP Latency Modeling
                                                                                                                                                                                                              • Fixed Congestion Window (W)
                                                                                                                                                                                                              • Fixed congestion window (1)
                                                                                                                                                                                                              • Fixed congestion window (2)
                                                                                                                                                                                                              • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                              • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                              • TCP Latency Modeling (3)
                                                                                                                                                                                                              • TCP Latency Modeling (4)
                                                                                                                                                                                                              • HTTP Modeling
                                                                                                                                                                                                              • Chapter 3 Summary


To simplify presentation, we assume that RcvBuffer is large enough that it will not overflow.

Tools are "similar" to flow control: sender limits transmission using

LastByteSent - LastByteAcked ≤ CongWin

How does sender perceive congestion?
• loss event = timeout or 3 duplicate ACKs
• TCP sender reduces rate (CongWin) after loss event

Three mechanisms:
• AIMD = Additive Increase, Multiplicative Decrease
• slow start = CongWin set to 1 MSS and then grows exponentially
• conservative after timeout events
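
The sending limit above can be sketched in a few lines of Python. This is only an illustration of the inequality LastByteSent - LastByteAcked ≤ CongWin; the function name `bytes_allowed` and the sample numbers are assumptions, not part of the slides.

```python
def bytes_allowed(last_byte_sent: int, last_byte_acked: int, cong_win: int) -> int:
    """Return how many more bytes the sender may transmit right now.

    Hypothetical helper: it only restates the congestion-window limit,
    ignoring the receive window (RcvBuffer is assumed large enough).
    """
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet acknowledged
    return max(0, cong_win - in_flight)


# 2000 bytes are in flight, so a 4000-byte CongWin leaves room for 2000 more.
print(bytes_allowed(last_byte_sent=5000, last_byte_acked=3000, cong_win=4000))
```

When the amount in flight already meets or exceeds CongWin, the helper returns 0 and the sender must wait for ACKs.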


TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (axis marks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
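
The AIMD behaviour can be illustrated with a small simulation. It is a sketch under simplifying assumptions: the window is counted in whole MSS units, and a loss event is assumed to occur whenever the window reaches `loss_at_mss` (a hypothetical stand-in for the network's capacity).

```python
def aimd_trace(start_mss: int, loss_at_mss: int, rtts: int) -> list[int]:
    """CongWin (in MSS) at the start of each RTT under idealized AIMD."""
    cong_win = start_mss
    trace = []
    for _ in range(rtts):
        trace.append(cong_win)
        if cong_win >= loss_at_mss:
            # Assumed loss event: multiplicative decrease (halve the window).
            cong_win = max(1, cong_win // 2)
        else:
            # No loss: additive increase of 1 MSS per RTT.
            cong_win += 1
    return trace


print(aimd_trace(start_mss=4, loss_at_mss=8, rtts=12))
# [4, 5, 6, 7, 8, 4, 5, 6, 7, 8, 4, 5]
```

The repeated climb and halving is the sawtooth pattern seen in a long-lived connection.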


                                                                                                                                                                                                                TCP Slow Start

                                                                                                                                                                                                                When connection begins CongWin = 1 MSS

Example: MSS = 500 bytes & RTT = 200 msec, so initial rate = 20 kbps

available bandwidth may be >> MSS/RTT

                                                                                                                                                                                                                desirable to quickly ramp up to respectable rate

                                                                                                                                                                                                                When connection begins increase rate exponentially fast until first loss event
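
The 20 kbps figure in the example follows from sending one MSS per RTT. Written out, with the byte-to-bit conversion made explicit:

```latex
\text{initial rate} = \frac{MSS}{RTT}
  = \frac{500\ \text{bytes} \times 8\ \text{bits/byte}}{0.2\ \text{s}}
  = 20{,}000\ \text{bits/s} = 20\ \text{kbps}
```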


                                                                                                                                                                                                                TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:

• double CongWin every RTT
• done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; in successive RTTs Host A sends one segment, then two segments, then four segments]
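
A short sketch shows why adding one MSS per ACK doubles CongWin every RTT. It assumes no losses and that every segment sent in an RTT is acknowledged individually within that RTT (no delayed ACKs); the function name is illustrative only.

```python
def slow_start_windows(rtts: int, mss: int = 1) -> list[int]:
    """CongWin at the start of each RTT, growing by one MSS per ACK."""
    cong_win = mss
    windows = []
    for _ in range(rtts):
        windows.append(cong_win)
        segments_in_flight = cong_win // mss
        # Each segment sent this RTT returns one ACK, and each ACK adds one MSS.
        for _ack in range(segments_in_flight):
            cong_win += mss
    return windows


print(slow_start_windows(4))  # [1, 2, 4, 8]
```

A window of n segments yields n ACKs, so the window grows from n to 2n within one RTT.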


So far:
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• Two different types of loss events:

  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).


                                                                                                                                                                                                                The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
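
The event/action rules above map naturally onto a small state machine. The following is a minimal sketch, not a real TCP implementation: the class name `RenoSender`, measuring the window in units of one MSS, the initial threshold of 64, and resetting the duplicate-ACK counter on each new ACK are all assumptions made for illustration.

```python
from dataclasses import dataclass

MSS = 1  # assumption: the window is measured in units of one MSS


@dataclass
class RenoSender:
    cong_win: float = 1 * MSS
    threshold: float = 64 * MSS  # "initially very large"; 64 is an arbitrary stand-in
    state: str = "SS"            # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: the duplicate count restarts on new data
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # About +1 MSS per RTT when spread over a window's worth of ACKs.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self) -> None:
        """Duplicate ACK: only the counter changes until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self) -> None:
        """Timeout: fall back to slow start from 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=8)
for _ in range(8):
    sender.on_new_ack()
print(sender.state, sender.cong_win)  # CA 9
```

Starting from 1 MSS with a threshold of 8, the eighth ACK pushes CongWin to 9, which exceeds the threshold and moves the sender to congestion avoidance.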


                                                                                                                                                                                                                TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
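
The 0.75 factor can be justified with one extra step, assuming the window grows linearly from W/2 back up to W between consecutive loss events, so that the average window is the midpoint of the two extremes:

```latex
\overline{W} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W
\qquad\Longrightarrow\qquad
\text{average throughput} = \frac{\overline{W}}{RTT} = 0.75\,\frac{W}{RTT}
```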


                                                                                                                                                                                                                TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput

Requires window size W = 83,333 in-flight segments

Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

➜ $L = 2 \cdot 10^{-10}$. Wow

New versions of TCP for high-speed needed!
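
The numbers in this example can be checked with a few lines of arithmetic. The sketch assumes the throughput relation is the commonly cited 1.22 * MSS / (RTT * sqrt(L)), solved for L; the function names are illustrative only.

```python
def window_segments(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = window_segments(10e9, 0.100, 1500)
loss = required_loss_rate(10e9, 0.100, 1500)
print(f"W = {w:.0f} segments, L = {loss:.2e}")
```

The window comes out at about 83,333 segments and the tolerable loss rate at roughly 2 x 10^-10, matching the slide.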


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
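The convergence toward the equal-share line can be illustrated with a toy simulation. This is a sketch under strong simplifying assumptions that the slides do not spell out: both sessions have the same RTT, both see every loss at the same moment, and each adds one rate unit per round. The function name and starting rates are made up for illustration.

```python
def aimd_two_flows(x1, x2, R, rounds=1000):
    """Toy AIMD model for two sessions sharing a link of capacity R.

    Each round: if the combined rate exceeds R, both sessions see a loss
    and halve their rate (multiplicative decrease); otherwise both add
    one unit (additive increase, slope 1 in the throughput plane).
    """
    for _ in range(rounds):
        if x1 + x2 > R:
            x1, x2 = x1 / 2, x2 / 2   # loss: decrease by factor of 2
        else:
            x1, x2 = x1 + 1, x2 + 1   # congestion avoidance
    return x1, x2


# Start far from the fair share: session 2 has eight times session 1's rate.
x1, x2 = aimd_two_flows(10.0, 80.0, R=100.0)
print(x1, x2, abs(x1 - x2))
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts that gap in half, so in this model the sessions drift toward equal rates.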


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
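The arithmetic behind the example, assuming every TCP connection on the link ends up with an equal share of R (the helper name is hypothetical):

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app that opens
    `new_connections` parallel TCP connections, assuming all
    connections on the link receive equal shares."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # one connection among ten
print(app_share(9, 11))   # eleven connections among twenty
```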


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
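Both cases can be folded into one small function. This is a sketch, not the course's reference code: it assumes K is the number of W-segment windows needed to cover the object, taken here as ceil(O / (W·S)), and the numeric values below are invented for illustration.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) for an O-bit object with a fixed window of W segments.

    O, S in bits; R in bits/s; RTT in seconds.
    Assumption: K = ceil(O / (W * S)) windows cover the object.
    """
    K = math.ceil(O / (W * S))
    stall = S / R + RTT - W * S / R   # idle time after each full window
    if stall <= 0:
        # Case 1: ACKs return before the window is exhausted, no stalling.
        return 2 * RTT + O / R
    # Case 2: server idles after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * stall


# Illustrative values: 10 segments of 8000 bits, 1 Mbps link, 100 ms RTT.
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, RTT=0.1, W=20))  # case 1
print(fixed_window_latency(O=80_000, S=8_000, R=1e6, RTT=0.1, W=4))   # case 2
```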


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timing diagram, time at client versus time at server. The client initiates the TCP connection and requests the object (one RTT each); the server sends the first window (S/R), second window (2S/R), third window (4S/R), and fourth window (8S/R); transmission completes and the object is delivered.]


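TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$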

[Figure: slow-start timing diagram, time at client versus time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]
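The slow-start delay can be checked numerically by summing the per-window idle times and comparing the result with the closed form. The sketch below makes two assumptions of its own: K is found as the smallest k with 2^k - 1 >= O/S (windows of 1, 2, 4, ... segments), and P is counted as the number of strictly positive idle periods among the first K-1 windows. The function name and numeric values are illustrative.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (K, P, latency_by_sum, latency_closed_form) for an O-bit object.

    O, S in bits; R in bits/s; RTT in seconds. No loss, no threshold.
    """
    segments = math.ceil(O / S)
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K = 1
    while 2 ** K - 1 < segments:
        K += 1
    # Idle time after the k-th window, k = 1 .. K-1, clipped at zero.
    stalls = [max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)]
    P = sum(1 for s in stalls if s > 0)   # number of times the server idles
    by_sum = 2 * RTT + O / R + sum(stalls)
    closed = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, P, by_sum, closed


# Chosen so that O/S = 15 and the server idles twice, as in the slide's example.
K, P, by_sum, closed = slow_start_latency(O=120_000, S=8_000, R=1e6, RTT=0.016)
print(K, P, by_sum, closed)
```

Because the idle times shrink as the window doubles, the positive ones are always the first P, which is why the direct sum and the closed form agree.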


TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

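$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$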

Calculation of Q, the number of idle periods for an infinite-size object, is similar.
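The calculation of K can be checked numerically. The sketch below is illustrative only: it computes K once by direct search over the definition (windows of 1, 2, 4, ... segments until the object is covered) and once with the closed form ceil(log2(O/S + 1)). The function names and the MSS of 536 bytes are assumptions made here, not values from the slides.

```python
import math


def windows_by_definition(object_bits, mss_bits):
    """Smallest k such that S * (2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    if object_bits <= 0 or mss_bits <= 0:
        raise ValueError("O and S must be positive")
    k = 0
    segments_covered = 0  # segments carried by the first k windows
    window = 1            # slow start: the first window holds one segment
    while segments_covered * mss_bits < object_bits:
        segments_covered += window
        window *= 2       # each window is twice the size of the previous one
        k += 1
    return k


def windows_closed_form(object_bits, mss_bits):
    """K = ceil(log2(O/S + 1)), the closed form of the same minimum."""
    if object_bits <= 0 or mss_bits <= 0:
        raise ValueError("O and S must be positive")
    return math.ceil(math.log2(object_bits / mss_bits + 1))


if __name__ == "__main__":
    S = 536 * 8  # hypothetical MSS of 536 bytes, expressed in bits
    for O in (S, 15 * S, 16 * S, 40_000):
        print(O, windows_by_definition(O, S), windows_closed_form(O, S))
```

The search version uses only integer arithmetic, so it is a reasonable cross-check on the floating-point logarithm in the closed form.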


HTTP Modeling

Assume a Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X is an integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
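The three response-time formulas can be written out directly. This is a minimal sketch, not a full latency model: the function names are made up for illustration, and the sum of idle times is passed in as a plain number (defaulting to zero) rather than derived from slow start. The sample values assume 1 Kbyte = 1000 bytes, so O = 5 Kbytes becomes 40,000 bits, and a 1 Mbps link.

```python
def non_persistent(O, R, RTT, M, idle=0.0):
    """M+1 TCP connections in series: each costs 2 RTT plus O/R."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(O, R, RTT, M, idle=0.0):
    """2 RTT for the base file, 1 RTT for all M images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(O, R, RTT, M, X, idle=0.0):
    """1 connection for the base file, then M/X rounds of X connections."""
    if X <= 0 or M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


if __name__ == "__main__":
    O = 5 * 1000 * 8   # 5 Kbytes in bits (assumes 1 Kbyte = 1000 bytes)
    R = 1_000_000      # 1 Mbps, chosen for illustration
    RTT = 0.1          # 100 msec
    M, X = 10, 5
    print("non-persistent:", non_persistent(O, R, RTT, M))
    print("persistent:", persistent(O, R, RTT, M))
    print("parallel:", parallel_non_persistent(O, R, RTT, M, X))
```

With idle times left at zero, these values are lower bounds on the modeled response time.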


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 20) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give a minor improvement over parallel connections.
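As a rough check of this observation, the terms of the HTTP model can be evaluated at R = 28 Kbps with RTT = 100 msec, M = 10 and X = 5. Two simplifications are assumed here and are not part of the slide: the sum of idle times is ignored, and 1 Kbyte is taken as 1000 bytes, so O = 40,000 bits.

$$
\begin{aligned}
\text{transmission (all schemes): } (M+1)\,O/R &= 11 \times 40{,}000 / 28{,}000 \approx 15.7\ \text{s} \\
\text{non-persistent: } (M+1)\,2\,RTT &= 11 \times 0.2 = 2.2\ \text{s} \\
\text{persistent: } 3\,RTT &= 0.3\ \text{s} \\
\text{parallel non-persistent: } (M/X+1)\,2\,RTT &= 3 \times 0.2 = 0.6\ \text{s}
\end{aligned}
$$

The transmission term is common to all three schemes and is far larger than any RTT term, so under these assumptions persistent and parallel connections differ by only about 0.3 s out of roughly 16 s.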


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 70) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow-start delays.
Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
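The same model terms can be evaluated for RTT = 1 sec at R = 10 Mbps, with M = 10 and X = 5. As assumptions made here rather than taken from the slide, the sum of idle times is ignored (slow-start stalls would add to every total) and O = 5 Kbytes is taken as 40,000 bits.

$$
\begin{aligned}
\text{transmission (all schemes): } (M+1)\,O/R &= 11 \times 40{,}000 / 10^{7} = 0.044\ \text{s} \\
\text{non-persistent: } (M+1)\,2\,RTT &= 11 \times 2 = 22\ \text{s} \\
\text{persistent: } 3\,RTT &= 3\ \text{s} \\
\text{parallel non-persistent: } (M/X+1)\,2\,RTT &= 3 \times 2 = 6\ \text{s}
\end{aligned}
$$

Here the RTT terms dominate and the transmission term is negligible. Before idle times are counted, persistent HTTP takes roughly half as long as parallel non-persistent HTTP and about one seventh as long as serial non-persistent HTTP.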


Chapter 3 Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

                                                                                                                                                                                                                • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • Transport services and protocols
                                                                                                                                                                                                                • Transport vs network layer
                                                                                                                                                                                                                • Transport-layer protocols
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                                                                                                • How demultiplexing works
                                                                                                                                                                                                                • Connectionless demultiplexing
                                                                                                                                                                                                                • Connectionless demux (cont)
                                                                                                                                                                                                                • Connection-oriented demux
                                                                                                                                                                                                                • Connection-oriented demux (cont)
                                                                                                                                                                                                                • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                                                                • UDP more
                                                                                                                                                                                                                • UDP checksum
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • Principles of Reliable data transfer
                                                                                                                                                                                                                • Reliable data transfer getting started
                                                                                                                                                                                                                • Reliable data transfer getting started
                                                                                                                                                                                                                • Incremental Improvements
                                                                                                                                                                                                                • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                                                                • Rdt20 channel with bit errors
                                                                                                                                                                                                                • rdt20 FSM specification
                                                                                                                                                                                                                • rdt20 operation with no errors
                                                                                                                                                                                                                • rdt20 error scenario
                                                                                                                                                                                                                • rdt20 has a fatal flaw
                                                                                                                                                                                                                • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                                                • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                                                • rdt21 discussion
                                                                                                                                                                                                                • rdt22 a NAK-free protocol
                                                                                                                                                                                                                • rdt22 sender receiver fragments
                                                                                                                                                                                                                • rdt30 channels with errors and loss
                                                                                                                                                                                                                • rdt30 sender
                                                                                                                                                                                                                • rdt30 in action
                                                                                                                                                                                                                • rdt30 in action
                                                                                                                                                                                                                • Performance of rdt30
                                                                                                                                                                                                                • rdt30 stop-and-wait operation
                                                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                                                • Pipelining increased utilization
                                                                                                                                                                                                                • Go-Back-N
                                                                                                                                                                                                                • GBN Sender
                                                                                                                                                                                                                • GBN sender extended FSM
                                                                                                                                                                                                                • GBN receiver extended FSM
                                                                                                                                                                                                                • More on receiver
                                                                                                                                                                                                                • GBN inaction
                                                                                                                                                                                                                • Selective Repeat
                                                                                                                                                                                                                • Selective repeat sender receiver windows
                                                                                                                                                                                                                • Selective repeat
                                                                                                                                                                                                                • Selective repeat in action
                                                                                                                                                                                                                • Selective repeat dilemma
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                                                • More TCP Details
                                                                                                                                                                                                                • Even More TCP Details
                                                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                                                • TCP seq rsquos and ACKs
                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                • Example RTT estimation
                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • TCP reliable data transfer
                                                                                                                                                                                                                • TCP sender events
                                                                                                                                                                                                                • TCP sender(simplified)
                                                                                                                                                                                                                • TCP retransmission scenarios
                                                                                                                                                                                                                • TCP retransmission scenarios (more)
                                                                                                                                                                                                                • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                                • More on Sender Policies
                                                                                                                                                                                                                • Fast Retransmit
                                                                                                                                                                                                                • Fast retransmit algorithm
                                                                                                                                                                                                                 • TCP: GBN or Selective Repeat?
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                                                 • TCP Flow control: how it works
                                                                                                                                                                                                                • Technical Issue
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • TCP Connection Management
                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                • A few special cases
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • Principles of Congestion Control
                                                                                                                                                                                                                 • Causes/costs of congestion: scenario 1
                                                                                                                                                                                                                 • Causes/costs of congestion: scenario 2
                                                                                                                                                                                                                 • Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                 • Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                 • Approaches towards congestion control
                                                                                                                                                                                                                 • Case study: ATM ABR congestion control
                                                                                                                                                                                                                 • Case study: ATM ABR congestion control
                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                • TCP Congestion Control
                                                                                                                                                                                                                • TCP AIMD
                                                                                                                                                                                                                • TCP Slow Start
                                                                                                                                                                                                                • TCP Slow Start (more)
                                                                                                                                                                                                                • Summary TCP Congestion Control
                                                                                                                                                                                                                • The Big Picture
                                                                                                                                                                                                                • TCP sender congestion control
                                                                                                                                                                                                                • TCP throughput
                                                                                                                                                                                                                • TCP Futures
                                                                                                                                                                                                                • TCP Fairness
                                                                                                                                                                                                                 • Why is TCP fair?
                                                                                                                                                                                                                • Fairness (more)
                                                                                                                                                                                                                • TCP Latency Modeling
                                                                                                                                                                                                                • Fixed Congestion Window (W)
                                                                                                                                                                                                                • Fixed congestion window (1)
                                                                                                                                                                                                                • Fixed congestion window (2)
                                                                                                                                                                                                                • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                • TCP Latency Modeling (3)
                                                                                                                                                                                                                • TCP Latency Modeling (4)
                                                                                                                                                                                                                • HTTP Modeling
                                                                                                                                                                                                                • Chapter 3 Summary


TCP AIMD

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events (probing); also known as congestion avoidance

multiplicative decrease: cut CongWin in half after a loss event

[Figure: congestion window (axis ticks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection]
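The sawtooth produced by AIMD can be sketched in a few lines of Python. This is only an idealized illustration, not how a real TCP stack is written: the function name `aimd_trace`, the window measured in whole MSS units, and the assumption that a loss event happens exactly when the window reaches `loss_window` are all hypothetical choices made for the example.

```python
def aimd_trace(start_window, loss_window, rtts):
    """Return CongWin (in MSS units) at the start of each RTT.

    Assumption (not from the slides): a loss event is detected
    whenever the window reaches loss_window.
    """
    window = start_window
    trace = []
    for _ in range(rtts):
        trace.append(window)
        if window >= loss_window:
            # multiplicative decrease: cut the window in half,
            # but never go below 1 MSS
            window = max(window // 2, 1)
        else:
            # additive increase: 1 MSS per RTT
            window += 1
    return trace


trace = aimd_trace(start_window=8, loss_window=16, rtts=20)
print(trace)
```

With these inputs the window climbs from 8 to 16 MSS one step per RTT, drops back to 8, and repeats, which is the sawtooth shape of the figure.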


                                                                                                                                                                                                                  TCP Slow Start

When connection begins, CongWin = 1 MSS

Example: MSS = 500 bytes and RTT = 200 msec, so initial rate = 20 kbps

available bandwidth may be >> MSS/RTT

desirable to quickly ramp up to a respectable rate

When connection begins, increase rate exponentially fast until first loss event


                                                                                                                                                                                                                  TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:

double CongWin every RTT; done by incrementing CongWin by 1 MSS for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B; Host A sends one segment in the first RTT, two segments in the second, and four segments in the third]
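A small Python sketch can check both the initial-rate arithmetic and the claim that one increment per ACK doubles the window every RTT. The function names are hypothetical, and the model assumes no loss and that every segment sent in an RTT is acknowledged within that RTT.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Sending rate in bits/sec when CongWin = 1 MSS."""
    return mss_bytes * 8 / rtt_seconds


def slow_start_windows(rtts):
    """CongWin (in MSS units) at the start of each RTT during slow start.

    Assumptions: no loss, and one ACK arrives per segment sent.
    """
    congwin = 1
    windows = []
    for _ in range(rtts):
        windows.append(congwin)
        acks = congwin          # one ACK for each segment in this window
        for _ in range(acks):
            congwin += 1        # CongWin grows by 1 MSS per ACK
    return windows


rate = initial_rate_bps(mss_bytes=500, rtt_seconds=0.2)
print(rate / 1000, "kbps")
print(slow_start_windows(4))  # [1, 2, 4, 8]
```

The per-ACK increment is what makes the growth exponential: a window of n segments produces n ACKs, so the window is 2n one RTT later.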


So far:
• Slow-Start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce new variable threshold
• threshold initially very large
• Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  • Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


                                                                                                                                                                                                                  Summary TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
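The four rules above can be traced RTT by RTT with a short Python sketch. It is a simplification under stated assumptions: windows are whole MSS units, slow start is capped so it lands exactly on Threshold, and a single triple-duplicate-ACK loss is injected at a chosen RTT. The function `window_per_rtt` and its parameters are hypothetical names used only for this illustration.

```python
def window_per_rtt(rtts, threshold, loss_at_rtt, variant="reno"):
    """CongWin (in MSS units) at the start of each RTT.

    Assumption: exactly one triple-duplicate-ACK loss event is
    detected at the end of RTT number loss_at_rtt (0-based).
    variant is "reno" or "tahoe".
    """
    congwin = 1
    trace = []
    for rtt in range(rtts):
        trace.append(congwin)
        if rtt == loss_at_rtt:
            threshold = max(congwin // 2, 1)
            # Reno resumes from Threshold; Tahoe restarts from 1 MSS
            congwin = threshold if variant == "reno" else 1
        elif congwin < threshold:
            # slow start: double per RTT (capped at Threshold, a simplification)
            congwin = min(congwin * 2, threshold)
        else:
            # congestion avoidance: +1 MSS per RTT
            congwin += 1
    return trace


print(window_per_rtt(12, threshold=8, loss_at_rtt=7, variant="reno"))
# [1, 2, 4, 8, 9, 10, 11, 12, 6, 7, 8, 9]
print(window_per_rtt(12, threshold=8, loss_at_rtt=7, variant="tahoe"))
# [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 6]
```

Both variants set Threshold to 6 after the loss at CongWin = 12; they differ only in where the window restarts.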


                                                                                                                                                                                                                  The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
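The event table can be read as a small state machine. The Python sketch below follows the table row by row; it is an illustration under assumptions, not a real TCP implementation. The class name `RenoSender`, the method names, the choice of MSS units, and the initial Threshold of 64 MSS (standing in for "initially very large") are all hypothetical.

```python
from dataclasses import dataclass

MSS = 1.0  # work in units of one MSS


@dataclass
class RenoSender:
    congwin: float = MSS
    threshold: float = 64 * MSS  # arbitrary stand-in for "very large"
    state: str = "SS"            # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # about 1 MSS per RTT once a full window has been ACKed
            self.congwin += MSS * (MSS / self.congwin)

    def on_dup_ack(self):
        """Duplicate ACK: only count it, until the third one arrives."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4 * MSS)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.congwin)
```

Starting with Threshold = 4 MSS, the fourth new ACK pushes CongWin to 5 MSS, which exceeds Threshold and moves the sender into congestion avoidance.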


                                                                                                                                                                                                                  TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start. Let W be the window size when loss occurs. When the window is W, throughput is W/RTT. Just after loss, the window drops to W/2 and throughput to W/(2 RTT). Average throughput: 0.75 W/RTT.
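The 0.75 factor is the midpoint of a window that climbs linearly from W/2 to W. A quick numerical check in Python, assuming an idealized sawtooth with one window value per RTT and an even W (the function name is hypothetical):

```python
def average_throughput(w_segments, rtt_seconds):
    """Average segments/sec over one sawtooth cycle from W/2 up to W.

    Assumptions: slow start is ignored, the window grows by one
    segment per RTT, and w_segments is even.
    """
    low = w_segments // 2
    windows = list(range(low, w_segments + 1))  # one window per RTT
    mean_window = sum(windows) / len(windows)
    return mean_window / rtt_seconds


W, RTT = 100, 0.1
print(average_throughput(W, RTT))  # numerical average over the cycle
print(0.75 * W / RTT)              # closed form from the slide
```

Both lines print the same value up to floating-point rounding, because the mean of an evenly spaced ramp is the average of its endpoints, (W/2 + W)/2 = 0.75 W.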


                                                                                                                                                                                                                  TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput

Requires window size W = 83,333 in-flight segments

Throughput in terms of loss rate L:

$\text{throughput} = \dfrac{1.22 \cdot MSS}{RTT \sqrt{L}}$

This gives $L = 2 \cdot 10^{-10}$. Wow!

New versions of TCP for high-speed needed
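The two numbers on this slide can be reproduced with a few lines of Python. The sketch assumes the throughput relation 1.22 · MSS / (RTT · sqrt(L)) and simply inverts it for L; the function names are hypothetical.

```python
import math


def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def throughput_from_loss(loss_rate, rtt_s, mss_bytes):
    """Throughput in bits/sec from 1.22 * MSS / (RTT * sqrt(L))."""
    return 1.22 * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Invert the relation above to get the tolerable loss rate L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(int(w))   # 83333
print(loss)     # roughly 2e-10
```

A loss rate near 2 · 10^-10 means about one lost segment in every five billion, which is why the slide concludes that high-speed paths need new TCP variants.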


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases
Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal-share line.]
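The convergence argument in the figure can be checked numerically. The following is a minimal sketch, not part of the original slides: two flows share a link, each adds one unit of rate per round (additive increase), and both halve their rate when the combined rate exceeds the capacity (multiplicative decrease). The synchronized-loss assumption, the function name `aimd_two_flows`, and the starting rates are all illustrative choices.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Simulate two AIMD flows sharing one bottleneck link.

    Assumption (illustrative): both flows see a loss in the same round
    whenever their combined rate exceeds the link capacity.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease, each rate is halved.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase, slope of 1.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: 70 vs. 10 on a link of capacity 100.
r1, r2 = aimd_two_flows(70.0, 10.0, capacity=100.0, rounds=500)
print(f"rates after 500 rounds: {r1:.2f}, {r2:.2f}")
```

Additive increase leaves the difference between the two rates unchanged, while every halving cuts that difference in half, so under these assumptions the rates drift toward equal shares.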

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets about R/2
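The arithmetic behind the example can be written out as a short sketch. It assumes, as the fairness goal does, that every TCP connection on the link gets an equal share. The function name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    new_connections, assuming all connections share the link equally."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

With 11 of the 20 connections, the new app takes slightly more than half of the link.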

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
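The two cases can be turned into a small calculator. This is a sketch under stated assumptions: K is taken to be the number of windows of W segments needed to cover the object, computed as ceil(O / (W·S)). That reading of K is an assumption, since this slide uses K without defining it. The function name `fixed_window_latency` and the sample numbers are illustrative.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency of fetching an O-bit object with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    """
    base = 2 * RTT + O / R
    if W * S / R >= RTT + S / R:
        # Case 1: the first ACK returns before the window is used up,
        # so the server never stalls. (At equality the stall is zero.)
        return base
    # Case 2: the server stalls after every window except the last one.
    K = math.ceil(O / (W * S))  # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R
    return base + (K - 1) * stall


# 100 kbit object, 1000-bit segments, 1 Mbit/s link, 100 ms RTT.
print(fixed_window_latency(100_000, 1000, 1_000_000, 0.1, W=200))  # case 1
print(fixed_window_latency(100_000, 1000, 1_000_000, 0.1, W=4))    # case 2
```

With W = 4 the server stalls 24 times for 97 ms each, so the stalls dominate the 0.3 s that the large-window case needs.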

Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case:
WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes. The client initiates the TCP connection (one RTT) and requests the object. The server sends the first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, then transmission is complete and the object is delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times
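The example can be reproduced with a short sketch. The concrete values of S, R and RTT are illustrative choices that happen to give O/S = 15, K = 4 and Q = 2; they are not from the slides. K is found directly from its definition (windows of 1, 2, 4, ... segments until the object is covered). Q is found by counting the windows that finish transmitting before the first ACK returns, which is the stall condition from the fixed-window case applied to a window of 2^(k-1) segments. The function name `slow_start_latency` is hypothetical.

```python
def slow_start_latency(O, S, R, RTT):
    """Slow-start latency (no loss, no threshold) using the closed form
    Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R."""
    # K: smallest number of windows (1, 2, 4, ... segments) covering the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows after which the server would idle if the object
    # were infinite, i.e. window k (2**(k-1) segments) is fully sent
    # before the ACK for its first segment comes back.
    Q = 0
    while 2 ** Q * S / R < S / R + RTT:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# Assumed numbers: 1000-bit segments, 1 Mbit/s link, 2 ms RTT, 15 segments.
K, Q, P, latency = slow_start_latency(O=15_000, S=1000, R=1_000_000, RTT=0.002)
print(K, Q, P)                      # 4 2 2, as in the slide example
print(f"{latency * 1000:.1f} ms")
```

With these inputs the three delay components are 4 ms for connection setup and request, 15 ms to transmit the object, and 3 ms of server idle time.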

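TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

Here $[x]^{+} = \max(x, 0)$.

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
             &= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
             &= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$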

[Figure: slow-start timing diagram with "time at client" and "time at server" axes: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]
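The summation step in the derivation above can be checked numerically. The sketch below adds up the idle time after each of the first K-1 windows, clamping negative values to zero as the [·]^+ notation indicates. The helper names `idle_time` and `delay_by_summation`, and the sample numbers, are illustrative assumptions. K is passed in rather than computed.

```python
def idle_time(k, S, R, RTT):
    """Idle time after the k-th window: [S/R + RTT - 2**(k-1) * S/R]^+."""
    return max(S / R + RTT - 2 ** (k - 1) * S / R, 0.0)


def delay_by_summation(O, S, R, RTT, K):
    """2 RTT + O/R plus the idle time after each of the first K-1 windows.

    No idle time is counted after the last window, since the transfer
    is complete once that window has been sent.
    """
    idle_total = sum(idle_time(k, S, R, RTT) for k in range(1, K))
    return 2 * RTT + O / R + idle_total


# Assumed numbers: 15 segments of 1000 bits, 1 Mbit/s link, 2 ms RTT, K = 4.
for k in range(1, 4):
    print(k, idle_time(k, 1000, 1_000_000, 0.002))
print(delay_by_summation(15_000, 1000, 1_000_000, 0.002, K=4))
```

Only the first two windows contribute idle time here. The clamp zeroes the third, which is why the closed form sums over just P = min{Q, K-1} terms.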

TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

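$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$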

                                                                                                                                                                                                                  k

                                                                                                                                                                                                                  k

                                                                                                                                                                                                                  L

                                                                                                                                                                                                                  L

Calculation of Q, the number of idles for an infinite-size object, is similar.
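As a quick numerical check of the closed form for K, the sketch below compares ⌈log2(O/S + 1)⌉ with a direct search for the smallest k whose windows cover the object. It assumes O (object size) and S (segment size) are positive and given in the same units; the function names and sample values are illustrative and do not come from the slides.

```python
import math


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), the closed form from the derivation."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_search(O, S):
    """Smallest k with 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S  # the (k+1)-th window carries 2^k segments
        k += 1
    return k


if __name__ == "__main__":
    S = 536 * 8  # hypothetical segment size in bits
    for O in (S, 3 * S, 3 * S + 1, 40_000, 100 * S):
        print(O, windows_closed_form(O, S), windows_by_search(O, S))
```

The two functions should agree for every positive O and S; the search version is only there to make the meaning of the minimum explicit.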


HTTP Modeling
Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
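The three response-time formulas above can be evaluated side by side. The following is a minimal sketch, reading O/R as the transmission time of one object on a link of rate R. The function name, the single `idle` parameter (applied equally to all three variants and defaulting to 0) and the choice of 1 Kbyte = 1000 bytes are assumptions made for illustration; the slides compute the idle times separately for each case.

```python
def http_response_times(O, R, rtt, M, X, idle=0.0):
    """Response times (seconds) for the three HTTP variants.

    O: object size in bits, R: link rate in bits/s, rtt: round-trip time in s,
    M: number of images, X: number of parallel connections.
    idle: sum of idle times supplied by the caller (assumed 0 by default and
    shared by all three variants, which is a simplification).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to transmit all M+1 objects
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt + idle,
        "persistent": transmit + 3 * rtt + idle,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * rtt + idle,
    }


if __name__ == "__main__":
    # RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5 on a 1 Mbps link.
    times = http_response_times(O=5 * 1000 * 8, R=1_000_000, rtt=0.1, M=10, X=5)
    for name, t in times.items():
        print(f"{name}: {t:.2f} s")
```

With idle times ignored, the variants differ only in the number of round trips they pay: 2(M+1), 3, and 2(M/X + 1) respectively.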


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
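To see why transmission time dominates at low bandwidth while round trips dominate at large RTT, the rough sketch below computes what fraction of the non-persistent response time is spent transmitting bits. It uses only the (M+1)O/R and (M+1)2RTT terms and leaves out the idle (slow start) times, so its numbers are not the values plotted in the figures. The function name and the choice of 1 Kbyte = 1000 bytes are assumptions.

```python
def transmission_share(O, R, rtt, M):
    """Fraction of non-persistent response time spent transmitting bits.

    Uses (M+1)*O/R for transmission and (M+1)*2*RTT for connection setup
    and requests; idle times are deliberately ignored in this sketch.
    """
    transmit = (M + 1) * O / R
    round_trips = (M + 1) * 2 * rtt
    return transmit / (transmit + round_trips)


if __name__ == "__main__":
    O = 5 * 1000 * 8  # 5 Kbytes in bits, assuming 1 Kbyte = 1000 bytes
    M = 10
    for rtt in (0.1, 1.0):
        for R in (28_000, 100_000, 1_000_000, 10_000_000):
            share = transmission_share(O, R, rtt, M)
            print(f"RTT={rtt} s, R={R} bps: {share:.1%} transmission")
```

At 28 Kbps with RTT = 100 msec most of the time goes to transmission, so saving round trips helps little; at 10 Mbps with RTT = 1 sec nearly all of it goes to round trips, which is where persistent connections pay off.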


Chapter 3 Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 160305)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
                                                                                                                                                                                                                  • A few special cases
                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                  • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                  • Approaches towards congestion control
                                                                                                                                                                                                                  • Case study ATM ABR congestion control
                                                                                                                                                                                                                  • Case study ATM ABR congestion control
                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                  • TCP Congestion Control
                                                                                                                                                                                                                  • TCP AIMD
                                                                                                                                                                                                                  • TCP Slow Start
                                                                                                                                                                                                                  • TCP Slow Start (more)
                                                                                                                                                                                                                  • Summary TCP Congestion Control
                                                                                                                                                                                                                  • The Big Picture
                                                                                                                                                                                                                  • TCP sender congestion control
                                                                                                                                                                                                                  • TCP throughput
                                                                                                                                                                                                                  • TCP Futures
                                                                                                                                                                                                                  • TCP Fairness
                                                                                                                                                                                                                  • Why is TCP fair
                                                                                                                                                                                                                  • Fairness (more)
                                                                                                                                                                                                                  • TCP Latency Modeling
                                                                                                                                                                                                                  • Fixed Congestion Window (W)
                                                                                                                                                                                                                  • Fixed congestion window (1)
                                                                                                                                                                                                                  • Fixed congestion window (2)
                                                                                                                                                                                                                  • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                  • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                  • TCP Latency Modeling (3)
                                                                                                                                                                                                                  • TCP Latency Modeling (4)
                                                                                                                                                                                                                  • HTTP Modeling
                                                                                                                                                                                                                  • Chapter 3 Summary

                                                                                                                                                                                                                    TCP Slow Start

When connection begins, CongWin = 1 MSS

Example: MSS = 500 bytes & RTT = 200 msec; initial rate = 20 kbps

Available bandwidth may be >> MSS/RTT

Desirable to quickly ramp up to respectable rate

When connection begins, increase rate exponentially fast until first loss event
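
The 20 kbps figure in the example follows from sending exactly one MSS per RTT. A minimal sketch of that arithmetic, assuming the helper name `initial_rate_bps` (made up here for illustration):

```python
def initial_rate_bps(mss_bytes: int, rtt_seconds: float) -> float:
    """Sending rate in bits per second when one MSS is sent per RTT."""
    return mss_bytes * 8 / rtt_seconds


# Values from the slide: MSS = 500 bytes, RTT = 200 msec.
rate = initial_rate_bps(500, 0.2)
print(f"initial rate = {rate / 1000:.0f} kbps")  # prints "initial rate = 20 kbps"
```

With a larger MSS or a shorter RTT the starting rate rises, but it is still typically far below the available bandwidth, which is why the ramp-up matters.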

                                                                                                                                                                                                                    TCP Slow Start (more)

When connection begins, increase rate exponentially until first loss event:

double CongWin every RTT

done by incrementing CongWin for every ACK received

Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. Over successive RTTs Host A sends one segment, then two segments, then four segments.]
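
The doubling per RTT follows from adding one MSS for each ACK. The sketch below replays that rule under simplifying assumptions that are not in the slides: no loss, one ACK per segment (no delayed ACKs), and windows counted in whole segments. The function name `slow_start_windows` is hypothetical.

```python
def slow_start_windows(num_rtts: int) -> list[int]:
    """CongWin (in MSS units) at the start of each RTT during loss-free slow start."""
    cong_win = 1                      # connection begins with CongWin = 1 MSS
    history = []
    for _ in range(num_rtts):
        history.append(cong_win)
        acks_this_rtt = cong_win      # assume every segment sent this RTT is ACKed
        for _ in range(acks_this_rtt):
            cong_win += 1             # increment CongWin by 1 MSS per ACK
    return history


print(slow_start_windows(5))  # [1, 2, 4, 8, 16]
```

The first three values match the one, two, and four segments shown in the timeline.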

So Far

Slow-Start ramps up exponentially

Followed by AIMD sawtooth pattern

Reality (TCP Reno)

Introduce new variable threshold

threshold initially very large

Slow-Start exponential growth stops when CongWin reaches threshold, and then switches to AIMD

Two different types of loss events:

• 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)

• Timeout: set threshold = CongWin/2, CongWin = 1 MSS, and switch to Slow-Start

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
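
The difference between the two variants can be written as a single decision. This is an illustrative sketch only: the function `on_loss`, its string arguments, and the tuple it returns are assumptions made for this example, and windows are measured in MSS units.

```python
def on_loss(variant: str, event: str, cong_win: float) -> tuple[float, float, str]:
    """Return (new_cong_win, new_threshold, next_phase) after a loss event."""
    if variant not in ("tahoe", "reno") or event not in ("3dupack", "timeout"):
        raise ValueError("unknown variant or event")
    threshold = cong_win / 2
    if variant == "reno" and event == "3dupack":
        # Fast recovery: skip slow start and continue from the halved window.
        return threshold, threshold, "congestion avoidance"
    # Timeout in either variant, or any loss event in Tahoe: restart at 1 MSS.
    return 1.0, threshold, "slow start"


print(on_loss("reno", "3dupack", 16))   # (8.0, 8.0, 'congestion avoidance')
print(on_loss("tahoe", "3dupack", 16))  # (1.0, 8.0, 'slow start')
print(on_loss("reno", "timeout", 16))   # (1.0, 8.0, 'slow start')
```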

Summary: TCP Congestion Control

When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.

When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                                                                    The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
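
The event table can be read as a small state machine. The sketch below is one possible rendering, not a real TCP implementation: the class name `RenoSender`, the initial threshold of 64 MSS (standing in for "very large"), and the choice to reset the duplicate-ACK counter on each new ACK are all assumptions made for illustration. Windows are in MSS units.

```python
from dataclasses import dataclass

MSS = 1.0  # assumption: measure windows in units of one MSS


@dataclass
class RenoSender:
    cong_win: float = 1 * MSS
    threshold: float = 64 * MSS   # arbitrary stand-in for "initially very large"
    state: str = "SS"             # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self) -> None:
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self) -> None:
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


s = RenoSender(threshold=8 * MSS)
for _ in range(8):
    s.on_new_ack()
print(s.state, s.cong_win)        # CA 9.0
for _ in range(3):
    s.on_dup_ack()
print(s.state, s.cong_win)        # CA 4.5
s.on_timeout()
print(s.state, s.cong_win, s.threshold)  # SS 1.0 2.25
```

Keeping each table row as its own method makes it easy to compare the code against the table line by line.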

                                                                                                                                                                                                                    TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start

Let W be the window size when loss occurs

When window is W, throughput is W/RTT

Just after loss, window drops to W/2, throughput to W/(2 RTT)

Average throughput: 0.75 W/RTT
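
The 0.75 factor is the mean of a window that climbs linearly from W/2 back to W between losses. A small numerical check, assuming an idealized sawtooth; the function name `average_window` and the sample values of W and RTT are hypothetical:

```python
def average_window(w: float, steps: int = 1000) -> float:
    """Mean window over one sawtooth cycle, rising linearly from w/2 to w."""
    samples = [w / 2 + (w / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


W, RTT = 20.0, 0.1  # hypothetical: 20 segments, 100 ms
avg_throughput = average_window(W) / RTT
# Both values should agree up to floating-point rounding.
print(avg_throughput, 0.75 * W / RTT)
```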

                                                                                                                                                                                                                    TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput

Requires window size W = 83,333 in-flight segments

Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

→ L = 2 · 10^-10. Wow!

New versions of TCP for high-speed needed!
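
Both numbers on this slide can be reproduced from the stated inputs. The sketch below assumes the throughput relation 1.22 · MSS / (RTT · sqrt(L)) and solves it for L; the function names are made up for illustration.

```python
import math


def window_segments(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """In-flight segments needed to sustain the given throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


def throughput_from_loss(loss: float, rtt_s: float, mss_bytes: int) -> float:
    """Forward direction of the same relation, in bits per second."""
    return 1.22 * mss_bytes * 8 / (rtt_s * math.sqrt(loss))


W = window_segments(10e9, 0.1, 1500)
L = required_loss_rate(10e9, 0.1, 1500)
print(f"W = {W:,.0f} segments, L = {L:.1e}")  # W = 83,333 segments, L = 2.1e-10
```

A loss rate of about 2 in 10 billion segments is far below what real paths deliver, which is the point of the slide.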

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:

Additive increase gives slope of 1 as throughput increases

Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]


Fairness (more)

Fairness and UDP

                                                                                                                                                                                                                    Multimedia apps often do not use TCP

                                                                                                                                                                                                                    do not want rate throttled by congestion control

Instead use UDP: pump audio/video at constant rate, tolerate packet loss

Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections

nothing prevents app from opening parallel connections between 2 hosts
Web browsers do this

Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets about R/2 (11R/20)
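The arithmetic behind the parallel-connection example can be sketched as follows. This is a minimal illustration that assumes every TCP connection on the link receives an exactly equal share of R; the function name `app_share` is hypothetical and not part of the slides.

```python
from fractions import Fraction


def app_share(existing_connections: int, new_connections: int) -> Fraction:
    """Fraction of the link rate R obtained by an app that opens
    `new_connections` parallel TCP connections on a link that already
    carries `existing_connections`, assuming an equal per-connection share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

Under this assumption, the app with 11 connections takes 11 of the 20 equal shares, which is slightly more than R/2.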


TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no corruption)

Window size:
First assume: fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
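The two cases can be combined in a short sketch. The slide does not define K for the fixed-window case, so the code assumes K is the number of W-segment windows needed to cover the object, K = ceil(O / (W S)). The function name and the sample values are illustrative only.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    """
    # Assumption (not stated on the slide): K = number of windows
    # of W segments that cover the object.
    K = math.ceil(O / (W * S))
    base = 2 * RTT + O / R
    # Time the server waits after each window before the first ACK returns.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        # Case 1: the ACK returns before the window is exhausted.
        return base
    # Case 2: the server stalls after each of the first K-1 windows.
    return base + (K - 1) * stall


# S/R = 1 s, RTT = 2 s, object of 8 segments
print(fixed_window_latency(8000, 1000, 1000, 2, 4))  # case 1
print(fixed_window_latency(8000, 1000, 1000, 2, 2))  # case 2
```

With these sample values, W = 4 falls in the first case (WS/R = 4 s exceeds RTT + S/R = 3 s), while W = 2 falls in the second and adds three stalls of 1 s each.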


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

\[
\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\]

where P is the number of times TCP idles at server:

\[
P = \min\{Q,\; K-1\}
\]

- where Q is the number of times the server idles if the object were of infinite size

                                                                                                                                                                                                                    - and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timing diagram. Events: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times
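The example values can be reproduced with a small sketch. It rests on two assumptions that the slide does not state: the k-th slow-start window holds 2^(k-1) segments (matching the window times S/R, 2S/R, 4S/R, 8S/R in the figure), and RTT = 2 S/R, chosen here only because it yields Q = 2. The function names are hypothetical.

```python
def windows_to_cover(segments: int) -> int:
    """K: smallest k such that 1 + 2 + ... + 2^(k-1) >= segments."""
    k, sent = 0, 0
    while sent < segments:
        sent += 2 ** k  # the (k+1)-th window carries 2^k segments
        k += 1
    return k


def idles_for_infinite_object(S, R, RTT) -> int:
    """Q: number of windows after which the server would have to wait,
    i.e. the windows k for which S/R + RTT - 2^(k-1) * S/R > 0."""
    q = 0
    while S / R + RTT - (2 ** q) * S / R > 0:
        q += 1
    return q


S, R, RTT = 1000, 1000, 2  # assumed values: S/R = 1 s, RTT = 2 s
K = windows_to_cover(15)
Q = idles_for_infinite_object(S, R, RTT)
P = min(K - 1, Q)
print(K, Q, P)  # 4 2 2
```

The first two windows finish before their ACKs return, so the server idles twice. From the third window on, transmission takes longer than S/R + RTT and the server never waits again.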


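TCP Latency Modeling (3)

\[
\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}
\]

\[
2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}
\]

\[
\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}
\]

\[
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]

[Figure: client/server timing diagram. Events: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. Axes: time at client, time at server.]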
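As a numerical check of the last step, the sketch below adds up the P idle times term by term and compares the result with the closed form. P is passed in directly; since P is at most Q, each of the P terms is positive and the [ ]^+ clamp can be dropped. The function names and sample values are illustrative assumptions.

```python
def latency_by_summation(O, S, R, RTT, P):
    """2 RTT + O/R plus the idle time after each of the first P windows."""
    idle = sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))
    return 2 * RTT + O / R + idle


def latency_closed_form(O, S, R, RTT, P):
    """Closed form: 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Assumed values: 15 segments of 1000 bits, R = 1000 bits/s, RTT = 2 s, P = 2
print(latency_by_summation(15000, 1000, 1000, 2, 2))  # 22.0
print(latency_closed_form(15000, 1000, 1000, 2, 2))   # 22.0
```

The two agree because the geometric series 1 + 2 + ... + 2^(P-1) sums to 2^P - 1.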


                                                                                                                                                                                                                    TCP Latency Modeling (4)Recall K = number of windows that cover object

                                                                                                                                                                                                                    How do we calculate K

                                                                                                                                                                                                                    ⎥⎥⎤

                                                                                                                                                                                                                    ⎢⎢⎡ +=

                                                                                                                                                                                                                    +ge=

                                                                                                                                                                                                                    geminus=

                                                                                                                                                                                                                    ge+++=

                                                                                                                                                                                                                    ge+++=minus

                                                                                                                                                                                                                    minus

                                                                                                                                                                                                                    )1(log

                                                                                                                                                                                                                    )1(logmin

                                                                                                                                                                                                                    12min

                                                                                                                                                                                                                    222min222min

                                                                                                                                                                                                                    2

                                                                                                                                                                                                                    2

                                                                                                                                                                                                                    110

                                                                                                                                                                                                                    110

                                                                                                                                                                                                                    SO

                                                                                                                                                                                                                    SOkk

                                                                                                                                                                                                                    SOk

                                                                                                                                                                                                                    SOkOSSSkK

                                                                                                                                                                                                                    k

                                                                                                                                                                                                                    k

                                                                                                                                                                                                                    k

                                                                                                                                                                                                                    L

                                                                                                                                                                                                                    L

Calculation of Q, the number of idles for an infinite-size object, is similar.
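A minimal sketch of the window count in this delay model, assuming slow start sends 1, 2, 4, ... segments of S bits per round, so that K is the smallest number of windows whose total covers an O-bit object. The function names and the sample values are illustrative assumptions, not part of the slides.

```python
import math


def windows_to_cover(object_bits, segment_bits):
    """Smallest k with 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O, by direct summation."""
    k, sent_bits, window_segments = 0, 0, 1
    while sent_bits < object_bits:
        sent_bits += window_segments * segment_bits
        window_segments *= 2  # assumed: the window doubles every round
        k += 1
    return k


def windows_closed_form(object_bits, segment_bits):
    """Closed form of the same count: ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))


if __name__ == "__main__":
    S = 536 * 8      # hypothetical segment size in bits
    O = 100_000      # hypothetical object size in bits
    print(windows_to_cover(O, S), windows_closed_form(O, S))
```

Both functions should agree; the loop version is only there to make the geometric sum behind the closed form visible.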


HTTP Modeling

Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
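The three response-time formulas can be sketched as plain functions. This is an illustration only: the function names are made up here, the slow-start idle time is passed in as a number (defaulting to 0) rather than derived, and 1 Kbyte is assumed to be 1000 bytes.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """(M+1) connections in series: (M+1)O/R + (M+1)*2RTT + idle."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """One connection: (M+1)O/R + 3RTT + idle."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """X parallel connections: (M+1)O/R + (M/X + 1)*2RTT + idle."""
    if M % X != 0:
        raise ValueError("the formula assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


if __name__ == "__main__":
    # Sample values: RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5, R = 28 Kbps.
    O_BITS = 5 * 1000 * 8  # assumption: 1 Kbyte = 1000 bytes
    args = dict(M=10, O=O_BITS, R=28_000, RTT=0.1)
    print(round(non_persistent(**args), 3))
    print(round(persistent(**args), 3))
    print(round(parallel_non_persistent(X=5, **args), 3))
```

With idle times left at zero, the three results differ only in the RTT term, since the transmission term (M+1)O/R is common to all of them.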


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 20) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 70) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.
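To see why the RTT matters more as bandwidth grows, the sketch below compares the RTT-only term of each response-time formula with the transmission term (M+1)O/R. It is a rough illustration that ignores slow-start idle times entirely; the helper names are hypothetical and 1 Kbyte is assumed to be 1000 bytes.

```python
def rtt_overhead(model, M, X, RTT):
    """RTT-only term of each response-time formula (idle times ignored)."""
    if model == "non-persistent":
        return (M + 1) * 2 * RTT
    if model == "persistent":
        return 3 * RTT
    if model == "parallel non-persistent":
        return (M // X + 1) * 2 * RTT
    raise ValueError(f"unknown model: {model}")


def rtt_share(model, M, X, O_bits, R_bps, RTT):
    """Fraction of (transmission + RTT overhead) that is due to RTT overhead."""
    transmission = (M + 1) * O_bits / R_bps
    overhead = rtt_overhead(model, M, X, RTT)
    return overhead / (transmission + overhead)


if __name__ == "__main__":
    M, X = 10, 5
    O_BITS = 5 * 1000 * 8  # assumption: 1 Kbyte = 1000 bytes
    models = ("non-persistent", "persistent", "parallel non-persistent")
    for RTT in (0.1, 1.0):
        for R in (28e3, 10e6):
            for model in models:
                share = rtt_share(model, M, X, O_BITS, R, RTT)
                print(f"RTT={RTT}s R={R:.0f}bps {model}: {share:.1%} from RTT")
```

Under these assumptions the RTT term is a small share of the total at 28 Kbps with a 100 msec RTT, and nearly all of it at 10 Mbps with a 1 sec RTT, which is the regime where the persistent model's 3 RTT stands out against the others.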


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
                                                                                                                                                                                                                    • Chapter 3 outline
                                                                                                                                                                                                                    • TCP Congestion Control
                                                                                                                                                                                                                    • TCP AIMD
                                                                                                                                                                                                                    • TCP Slow Start
                                                                                                                                                                                                                    • TCP Slow Start (more)
                                                                                                                                                                                                                    • Summary TCP Congestion Control
                                                                                                                                                                                                                    • The Big Picture
                                                                                                                                                                                                                    • TCP sender congestion control
                                                                                                                                                                                                                    • TCP throughput
                                                                                                                                                                                                                    • TCP Futures
                                                                                                                                                                                                                    • TCP Fairness
                                                                                                                                                                                                                    • Why is TCP fair
                                                                                                                                                                                                                    • Fairness (more)
                                                                                                                                                                                                                    • TCP Latency Modeling
                                                                                                                                                                                                                    • Fixed Congestion Window (W)
                                                                                                                                                                                                                    • Fixed congestion window (1)
                                                                                                                                                                                                                    • Fixed congestion window (2)
                                                                                                                                                                                                                    • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                    • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                    • TCP Latency Modeling (3)
                                                                                                                                                                                                                    • TCP Latency Modeling (4)
                                                                                                                                                                                                                    • HTTP Modeling
                                                                                                                                                                                                                    • Chapter 3 Summary

                                                                                                                                                                                                                      TCP Slow Start (more)

• When the connection begins, increase the rate exponentially until the first loss event:
    • double CongWin every RTT
    • done by incrementing CongWin for every ACK received
• Summary: the initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B, with time running downward. Host A sends one segment in the first RTT, then two segments, then four segments.]
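The per-ACK increment described above can be traced with a minimal sketch. It assumes an idealized connection with no loss and one ACK per segment; the function name `slow_start_rounds` is hypothetical and CongWin is counted in MSS units.

```python
def slow_start_rounds(num_rtts, mss=1):
    """Return CongWin (in MSS units) at the start of each RTT during slow start.

    Assumes no loss and exactly one ACK per segment sent (an idealization).
    """
    cong_win = mss
    history = []
    for _ in range(num_rtts):
        history.append(cong_win)
        acks_this_rtt = cong_win // mss  # every segment sent this RTT is ACKed
        for _ in range(acks_this_rtt):
            cong_win += mss              # increment CongWin for every ACK received
    return history


print(slow_start_rounds(5))  # window sizes for the first five RTTs
```

Because each ACK adds one MSS and a window of n segments produces n ACKs, the window doubles every RTT, matching the one, two, four segment pattern in the figure.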

So far:
• Slow start ramps up exponentially
• Followed by the AIMD sawtooth pattern

Reality (TCP Reno):
• Introduce a new variable, threshold
• threshold is initially very large
• Slow-start exponential growth stops when CongWin reaches threshold, and the sender then switches to AIMD
• Two different types of loss events:
    • 3 dup ACKs: cut CongWin in half and set threshold = CongWin (the sender is now in standard AIMD)
    • Timeout: set threshold = CongWin/2, CongWin = 1, and switch to slow start

• The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is still capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".
• Note that the older protocol, TCP Tahoe, treated both types of loss event the same and always went to slow start with CongWin = 1 after a loss event.
• TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.
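The difference between the two variants can be sketched as a single function. This is an illustration of the rules stated on these slides, not code from any real TCP stack; the function name, the event strings, and the floor of 1 MSS on the threshold are assumptions. Windows are in MSS units.

```python
def react_to_loss(cong_win, event, variant="reno"):
    """Return (new_cong_win, new_threshold, next_phase) after a loss event.

    event is "triple_dup_ack" or "timeout"; variant is "reno" or "tahoe".
    The 1 MSS floor on the threshold is an assumption of this sketch.
    """
    threshold = max(cong_win / 2, 1)
    if event == "triple_dup_ack" and variant == "reno":
        # Reno fast recovery: halve the window and skip slow start.
        return threshold, threshold, "congestion_avoidance"
    # Timeout (either variant), or any loss event in Tahoe: restart slow start.
    return 1, threshold, "slow_start"


print(react_to_loss(16, "triple_dup_ack", "reno"))
print(react_to_loss(16, "triple_dup_ack", "tahoe"))
print(react_to_loss(16, "timeout", "reno"))
```

Only the first call keeps the window at half its old size; the other two fall back to a window of 1 MSS.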

Summary: TCP Congestion Control

• When CongWin is below Threshold, the sender is in the slow-start phase and the window grows exponentially.
• When CongWin is above Threshold, the sender is in the congestion-avoidance phase and the window grows linearly.
• When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).
• When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                                                                      The Big Picture

TCP sender congestion control

Event | State | TCP Sender Action | Commentary
------|-------|-------------------|-----------
ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in an increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Timeout | SS or CA | Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start" | Enter slow start
Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
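The event table above maps naturally onto a small state machine. The following is a minimal sketch under stated assumptions, not a real TCP implementation: the class name `RenoSender`, the MSS of 1460 bytes, the initial threshold of 64 KB (standing in for "initially very large"), and the 1 MSS floor applied to the threshold are all choices made for illustration.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed value for illustration


@dataclass
class RenoSender:
    cong_win: float = MSS
    threshold: float = 64 * 1024  # "initially very large"; the number is an assumption
    state: str = "SS"             # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT in total.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Count duplicates; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_duplicate_ack()

    def on_triple_duplicate_ack(self):
        # Multiplicative decrease; floor keeps CongWin at or above 1 MSS.
        self.threshold = max(self.cong_win / 2, MSS)
        self.cong_win = self.threshold
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        self.threshold = max(self.cong_win / 2, MSS)
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0


sender = RenoSender(threshold=4 * MSS)
for _ in range(4):
    sender.on_new_ack()          # slow start until CongWin exceeds Threshold
print(sender.state, sender.cong_win)
for _ in range(3):
    sender.on_duplicate_ack()    # third duplicate halves the window
print(sender.state, sender.cong_win, sender.threshold)
sender.on_timeout()
print(sender.state, sender.cong_win, sender.threshold)
```

Keeping one method per table row makes it easy to compare the code against the table line by line.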

                                                                                                                                                                                                                      TCP throughput

• What's the average throughput of TCP as a function of window size and RTT?
    • Ignore slow start
• Let W be the window size when loss occurs.
• When the window is W, throughput is W/RTT.
• Just after a loss, the window drops to W/2 and throughput to W/(2 RTT).
• Average throughput: 0.75 W/RTT
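The 0.75 W/RTT figure is the mean of a window that climbs linearly from W/2 to W. A quick numerical check, assuming an even W, a gain of one segment per RTT, and a hypothetical helper name:

```python
def average_sawtooth_throughput(w, rtt):
    """Average rate in segments per second over one sawtooth cycle.

    Assumes the window grows from W/2 to W by one segment per RTT
    and that W is even; slow start is ignored.
    """
    windows = range(w // 2, w + 1)  # one window value per RTT in the cycle
    return sum(windows) / len(windows) / rtt


w, rtt = 100, 0.1
avg = average_sawtooth_throughput(w, rtt)
print(avg, 0.75 * w / rtt)  # the two values should agree
```

The average of the endpoints W/2 and W is 3W/4, which is where the factor 0.75 comes from.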

                                                                                                                                                                                                                      TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

  $$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• This gives $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed networks are needed
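The numbers in this example can be reproduced from the relation throughput = 1.22 · MSS / (RTT · sqrt(L)), solved for L. The sketch below assumes that relation and uses hypothetical function names; segment sizes are converted to bits.

```python
def required_window(throughput_bps, rtt_s, segment_bytes):
    """In-flight segments needed to sustain the target rate: W = rate * RTT / segment size."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, segment_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


w = required_window(10e9, 0.100, 1500)
loss = required_loss_rate(10e9, 0.100, 1500)
print(int(w))          # whole in-flight segments
print(f"{loss:.2e}")   # tolerable loss rate
```

A loss rate on the order of 2 in 10 billion segments is far below what real paths deliver, which is the point of the slide.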

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]

Why is TCP fair?

Two competing sessions:
• Additive increase gives a slope of 1 as throughput increases
• Multiplicative decrease reduces throughput proportionally

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
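The convergence argument can be checked with a toy simulation. It assumes two flows with equal RTTs that see every loss at the same time, which is an idealization; the function name and the unit step per round are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, steps):
    """Simulate two synchronized AIMD flows sharing one bottleneck.

    Each round both flows add 1 unit; when their sum exceeds the link
    capacity, both halve. Equal RTTs and shared losses are assumed.
    """
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2  # multiplicative decrease halves the gap
        else:
            x1, x2 = x1 + 1, x2 + 1  # additive increase leaves the gap unchanged
    return x1, x2


x1, x2 = aimd_two_flows(80, 10, capacity=100, steps=1000)
print(abs(x1 - x2))  # gap between the two flows after many rounds
```

Additive increase preserves the difference between the two rates while each loss halves it, so the rates drift toward the equal-share line regardless of the starting point.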

Fairness (more)

Fairness and UDP:
• Multimedia apps often do not use TCP
    • they do not want their rate throttled by congestion control
• Instead they use UDP:
    • pump audio/video at a constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections:
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
    • new app asks for 1 TCP connection, gets rate R/10
    • new app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections)
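The arithmetic behind the parallel-connection example follows from assuming each TCP connection gets an equal share of the link. A short sketch with a hypothetical helper name:

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R won by an app that opens new_connections,
    assuming every connection on the link receives an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # share of R with one connection
print(app_share(9, 11))  # share of R with eleven connections
```

With eleven connections the app holds 11 of 20 equal shares, slightly more than the R/2 quoted on the slide.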

TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server, of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size:
• First assume a fixed congestion window of W segments
• Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Fixed Congestion Window (W): Two cases

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
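The two cases can be folded into one small function. This is a sketch under stated assumptions, not a definitive model: K is taken to be the number of fixed-size windows covering the object, K = ceil(O / (W·S)), which this slide does not spell out. The function name and sample numbers are hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) for a fixed congestion window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    Assumption: K = ceil(O / (W * S)) windows cover the object.
    """
    K = math.ceil(O / (W * S))
    # Time the server waits after each window before the first ACK arrives.
    stall = S / R + RTT - W * S / R
    latency = 2 * RTT + O / R
    if stall > 0:  # case 2: ACK returns after the window has been sent
        latency += (K - 1) * stall
    return latency


# Hypothetical values: 15 segments of 8000 bits, 1 Mbps link, 100 ms RTT.
print(f"{fixed_window_latency(120_000, 8_000, 1e6, 0.1, W=20):.3f}")  # 0.320 (case 1)
print(f"{fixed_window_latency(120_000, 8_000, 1e6, 0.1, W=4):.3f}")   # 0.548 (case 2)
```

When WS/R equals RTT + S/R exactly, the stall is zero and both formulas give the same latency.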

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events)

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. The client initiates the TCP connection (one RTT) and requests the object; the server sends the first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R), at which point transmission is complete and the object is delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
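The example can be turned into numbers by plugging K = 4 and Q = 2 into the slow-start latency formula, 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R. This sketch takes K and Q as given. The values of S, R and RTT are hypothetical, chosen so that the server would idle exactly twice (Q = 2), and the function name is an assumption.

```python
def slow_start_latency(O, S, R, RTT, K, Q):
    """Latency (seconds) under slow start, given K windows and Q idle periods.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    P = min(K - 1, Q)  # number of times the server actually idles
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical values: S = 8000 bits, R = 1 Mbps, so S/R = 8 ms; RTT = 16 ms.
S, R, RTT = 8_000, 1e6, 0.016
O = 15 * S  # O/S = 15 segments, as in the example
print(f"{slow_start_latency(O, S, R, RTT, K=4, Q=2):.3f}")  # 0.176
```

Of the 0.176 s, 0.032 s is the 2 RTT setup, 0.120 s is O/R, and the remaining 0.024 s is idle time (16 ms after the first window, 8 ms after the second).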

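TCP Latency Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: same timing diagram as in Slow Start (2), showing time at client and time at server, connection initiation, object request, and the first to fourth windows (S/R, 2S/R, 4S/R, 8S/R) until the object is delivered.]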
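The last step of the derivation, collapsing the sum of idle times into the closed form, can be checked numerically. This is a sketch with hypothetical parameter values and function names. The two expressions agree only while every summed idle time is positive, that is, for P no larger than Q.

```python
def idle_time(k, S, R, RTT):
    """Idle time after the k-th window: [S/R + RTT - 2^(k-1) S/R]^+."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def delay_by_sum(O, S, R, RTT, P):
    """O/R + 2 RTT plus the idle times after windows 1..P."""
    return O / R + 2 * RTT + sum(idle_time(k, S, R, RTT) for k in range(1, P + 1))


def delay_closed_form(O, S, R, RTT, P):
    """O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R."""
    return O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical values: S/R = 8 ms and RTT = 100 ms, so idle times are
# 100, 92, 76 and 44 ms after windows 1..4 and zero from window 5 on.
O, S, R, RTT = 120_000, 8_000, 1e6, 0.1
print(f"{delay_by_sum(O, S, R, RTT, P=3):.3f}")       # 0.588
print(f"{delay_closed_form(O, S, R, RTT, P=3):.3f}")  # 0.588
```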

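TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar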
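The derivation of K can be cross-checked by comparing a direct search over doubling windows with the closed form. This is a sketch with hypothetical function names. O and S are passed as integers so that the search is exact.

```python
import math


def windows_by_search(O, S):
    """Smallest k such that 2^0 S + 2^1 S + ... + 2^(k-1) S >= O."""
    k, sent = 0, 0
    while sent < O:
        sent += (2 ** k) * S  # the (k+1)-th window carries 2^k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)); uses floating point, so very large O/S may round."""
    return math.ceil(math.log2(O / S + 1))


S = 8_000  # hypothetical MSS in bits
for segments in (1, 3, 4, 15, 16):
    O = segments * S
    print(segments, windows_by_search(O, S), windows_closed_form(O, S))
```

For O/S = 15 both functions return 4, since windows of 1, 2, 4 and 8 segments cover the object exactly. One more segment pushes K to 5.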

HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
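The three response-time formulas can be compared side by side. This is a sketch with hypothetical function names, in which the sum of idle times is passed in as a parameter and defaults to zero. That default is a simplifying assumption, so the printed numbers are lower bounds rather than full model results. The sample values assume 1 Kbyte = 1000 bytes.

```python
def non_persistent(M, O, R, RTT, idle=0.0):
    """M+1 TCP connections in series."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """2 RTT for the base file, 1 RTT for the M images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_non_persistent(M, X, O, R, RTT, idle=0.0):
    """1 connection for the base file, then M/X sets of X parallel connections."""
    return (M + 1) * O / R + (M / X + 1) * 2 * RTT + idle


# Sample values: RTT = 100 msec, O = 5 Kbytes = 40,000 bits, M = 10, X = 5,
# on a 1 Mbps link; idle times are ignored here (assumption).
M, X, O, R, RTT = 10, 5, 40_000, 1e6, 0.1
print(f"{non_persistent(M, O, R, RTT):.2f}")              # 2.64
print(f"{persistent(M, O, R, RTT):.2f}")                  # 0.74
print(f"{parallel_non_persistent(M, X, O, R, RTT):.2f}")  # 1.04
```

The transmission term (M+1)O/R is the same in all three. Only the RTT multiplier differs, so the gap between the schemes narrows as R decreases.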

[Figure: HTTP response time (in seconds) at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent connections. RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5]

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.
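A rough way to see why transmission time dominates at low bandwidth is to compare the transmission term (M+1)O/R with the round-trip term (M/X + 1)2RTT of the parallel non-persistent case. This is only a sketch under stated assumptions: idle times are ignored, 1 Kbyte is taken as 1000 bytes, and the helper name is hypothetical.

```python
O_BITS = 5 * 1000 * 8      # O = 5 Kbytes, assuming 1 Kbyte = 1000 bytes
M, X, RTT = 10, 5, 0.1     # RTT = 100 msec


def transmission_share(R_bps):
    """Fraction of the response time (idle times ignored) spent transmitting bits."""
    transmission = (M + 1) * O_BITS / R_bps
    round_trips = (M // X + 1) * 2 * RTT
    return transmission / (transmission + round_trips)


shares = {R: transmission_share(R) for R in (28_000, 100_000, 1_000_000, 10_000_000)}
for R, share in shares.items():
    print(f"{R / 1000:>6.0f} Kbps: {share:.0%} of the time is transmission")
```

Under these assumptions the share is above 90% at 28 Kbps and below 10% at 10 Mbps, so removing round trips can help only a little on the slowest link.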


[Figure: HTTP response time (in seconds) at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent connections. RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5]

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.
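The effect of a large RTT can be sketched by comparing the three schemes side by side. The parallel expression is the one given on the earlier slide in this section. The non-persistent term (M+1)2RTT and the persistent term 3RTT are the usual expressions for this latency model and are not stated in this part of the slides, so treat them as assumptions. Idle times from slow start are left out, which means the numbers below understate the totals shown in the figure.

```python
def response_times(M, X, O_bits, R_bps, rtt_s):
    """Response time in seconds for each scheme, with idle times ignored."""
    transmission = (M + 1) * O_bits / R_bps
    return {
        # one connection per object, opened one after another (assumed model)
        "non-persistent": transmission + (M + 1) * 2 * rtt_s,
        # 2 RTT for the base file, 1 RTT for all images (assumed model)
        "persistent": transmission + 3 * rtt_s,
        # 1 connection for the base file, then M/X batches of X connections
        "parallel non-persistent": transmission + (M // X + 1) * 2 * rtt_s,
    }


# RTT = 1 sec, O = 5 Kbytes (assumed 1 Kbyte = 1000 bytes), M = 10, X = 5, R = 10 Mbps
times = response_times(M=10, X=5, O_bits=5 * 1000 * 8, R_bps=10_000_000, rtt_s=1.0)
for scheme, seconds in times.items():
    print(f"{scheme:<24} {seconds:6.2f} s")
```

At 10 Mbps the transmission term is only 0.044 s, so almost all of each total comes from round trips: 22 RTT for non-persistent, 6 RTT for parallel non-persistent, and 3 RTT for persistent.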


Chapter 3 Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control
instantiation and implementation in the Internet:
  UDP
  TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

                                                                                                                                                                                                                      • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                      • Transport services and protocols
                                                                                                                                                                                                                      • Transport vs network layer
                                                                                                                                                                                                                      • Transport-layer protocols
                                                                                                                                                                                                                      • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                                                                                                      • How demultiplexing works
                                                                                                                                                                                                                      • Connectionless demultiplexing
                                                                                                                                                                                                                      • Connectionless demux (cont)
                                                                                                                                                                                                                      • Connection-oriented demux
                                                                                                                                                                                                                      • Connection-oriented demux (cont)
                                                                                                                                                                                                                      • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                                                      • Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
                                                                                                                                                                                                                      • UDP checksum
                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                      • Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
                                                                                                                                                                                                                      • Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                                                                                                                                                      • Pipelined protocols
                                                                                                                                                                                                                      • Pipelined protocols
• Pipelining: increased utilization
                                                                                                                                                                                                                      • Go-Back-N
                                                                                                                                                                                                                      • GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
                                                                                                                                                                                                                      • Selective Repeat
• Selective repeat: sender, receiver windows
                                                                                                                                                                                                                      • Selective repeat
                                                                                                                                                                                                                      • Selective repeat in action
                                                                                                                                                                                                                      • Selective repeat dilemma
                                                                                                                                                                                                                      • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                                                      • More TCP Details
                                                                                                                                                                                                                      • Even More TCP Details
                                                                                                                                                                                                                      • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                      • Example RTT estimation
                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                      • TCP reliable data transfer
                                                                                                                                                                                                                      • TCP sender events
• TCP sender (simplified)
                                                                                                                                                                                                                      • TCP retransmission scenarios
                                                                                                                                                                                                                      • TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                                                                      • More on Sender Policies
                                                                                                                                                                                                                      • Fast Retransmit
                                                                                                                                                                                                                      • Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                      • TCP Flow Control
                                                                                                                                                                                                                      • TCP Flow Control
                                                                                                                                                                                                                      • TCP segment structure
• TCP Flow control: how it works
                                                                                                                                                                                                                      • Technical Issue
                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                      • TCP Connection Management
                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                      • A few special cases
                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                      • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                      • TCP Congestion Control
                                                                                                                                                                                                                      • TCP AIMD
                                                                                                                                                                                                                      • TCP Slow Start
                                                                                                                                                                                                                      • TCP Slow Start (more)
• Summary: TCP Congestion Control
                                                                                                                                                                                                                      • The Big Picture
                                                                                                                                                                                                                      • TCP sender congestion control
                                                                                                                                                                                                                      • TCP throughput
                                                                                                                                                                                                                      • TCP Futures
                                                                                                                                                                                                                      • TCP Fairness
                                                                                                                                                                                                                      • Why is TCP fair
                                                                                                                                                                                                                      • Fairness (more)
                                                                                                                                                                                                                      • TCP Latency Modeling
                                                                                                                                                                                                                      • Fixed Congestion Window (W)
                                                                                                                                                                                                                      • Fixed congestion window (1)
                                                                                                                                                                                                                      • Fixed congestion window (2)
                                                                                                                                                                                                                      • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                      • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                      • TCP Latency Modeling (3)
                                                                                                                                                                                                                      • TCP Latency Modeling (4)
                                                                                                                                                                                                                      • HTTP Modeling
                                                                                                                                                                                                                      • Chapter 3 Summary


So far
• Slow start ramps up exponentially
• Followed by AIMD sawtooth pattern

Reality (TCP Reno)
• Introduce new variable: threshold
• threshold initially very large
• Slow-start exponential growth stops when CongWin reaches threshold, and then switches to AIMD
• Two different types of loss events:
  - 3 dup ACKs: cut CongWin in half and set threshold = CongWin (now in standard AIMD)
  - Timeout: set threshold = CongWin/2, CongWin = 1, and switch to slow start


The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss events the same and always went to slow start with CongWin = 1 after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.


                                                                                                                                                                                                                        Summary TCP Congestion Control

When CongWin is below Threshold, the sender is in the slow-start phase; the window grows exponentially.

When CongWin is above Threshold, the sender is in the congestion-avoidance phase; the window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).
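The rules in this summary can be traced round by round. The following is a minimal sketch, not a real TCP implementation: it counts CongWin in units of MSS, advances one RTT per step, and caps slow-start growth at Threshold as a simplification. The function name `trace`, the event labels, and the default starting threshold of 8 are assumptions made for illustration.

```python
def trace(events, threshold=8, reno=True):
    """Return CongWin (in MSS) at the start of each RTT round.

    events: one entry per round: None (no loss), "3dup", or "timeout".
    reno:   True for TCP Reno, False for TCP Tahoe.
    """
    congwin = 1
    history = []
    for event in events:
        history.append(congwin)
        if event == "timeout" or (event == "3dup" and not reno):
            # Timeout (or any loss event in Tahoe): restart slow start.
            threshold = max(congwin // 2, 1)
            congwin = 1
        elif event == "3dup":
            # Reno only: halve the window and skip slow start.
            threshold = max(congwin // 2, 1)
            congwin = threshold
        elif congwin < threshold:
            # Slow start: double every RTT (capped at Threshold here).
            congwin = min(2 * congwin, threshold)
        else:
            # Congestion avoidance: add 1 MSS per RTT.
            congwin += 1
    return history
```

With a triple duplicate ACK in the sixth round, `trace([None] * 5 + ["3dup"] + [None] * 2)` returns `[1, 2, 4, 8, 9, 10, 5, 6]` for Reno, while the same call with `reno=False` returns `[1, 2, 4, 8, 9, 10, 1, 2]` for Tahoe.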


                                                                                                                                                                                                                        The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
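The event table above can be sketched as a small per-event state machine. This is an illustrative sketch only: the class name `RenoSender`, its method names, the MSS of 1460 bytes, and the initial threshold of 64 KB are all assumptions, and resetting the duplicate-ACK counter on a new ACK is a choice the table does not spell out.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed value for illustration


@dataclass
class RenoSender:
    congwin: float = MSS
    threshold: float = 64 * 1024  # "initially very large"; value assumed
    state: str = "SS"             # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: a new ACK clears the duplicate count
        if self.state == "SS":
            self.congwin += MSS
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT.
            self.congwin += MSS * (MSS / self.congwin)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            # CongWin will not drop below 1 MSS.
            self.congwin = max(self.threshold, MSS)
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve Threshold and re-enter slow start."""
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Under these assumptions, a sender in congestion avoidance with CongWin = 10 MSS gains MSS/10 per ACK, so roughly one MSS after a full window of ACKs, which matches the commentary column.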


                                                                                                                                                                                                                        TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2 RTT)
• Average throughput: 0.75 W/RTT
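The 0.75 factor comes from averaging a window that climbs linearly from W/2 back up to W. A small numeric check, assuming an idealized sawtooth that takes one window value per RTT and an even W (the function names are hypothetical):

```python
def average_window(w):
    """Mean window (in segments) over one idealized sawtooth cycle.

    The window is assumed to step through W/2, W/2 + 1, ..., W, one value
    per RTT, before a loss halves it again. Assumes W is even.
    """
    samples = range(w // 2, w + 1)
    return sum(samples) / len(samples)


def average_throughput(w, mss_bits, rtt):
    """Average throughput in bits/s: mean window * MSS / RTT."""
    return average_window(w) * mss_bits / rtt
```

For example, `average_window(100)` returns `75.0`, i.e. 0.75 W.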


                                                                                                                                                                                                                        TCP Futures

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

  $\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$

• ➜ $L = 2 \cdot 10^{-10}$. Wow!
• New versions of TCP for high-speed needed
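The two numbers on this slide can be reproduced with a few lines of arithmetic. This sketch assumes the relation throughput = 1.22 · MSS / (RTT · sqrt(L)) from the slide and solves it for L; the constant and function names are chosen here for illustration.

```python
MSS_BITS = 1500 * 8  # 1500-byte segments, in bits
RTT = 0.100          # seconds
TARGET = 10e9        # 10 Gbps, in bits/s


def window_for(throughput, rtt=RTT, mss_bits=MSS_BITS):
    """Segments that must be in flight: W = throughput * RTT / MSS."""
    return throughput * rtt / mss_bits


def loss_rate_for(throughput, rtt=RTT, mss_bits=MSS_BITS):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bits / (rtt * throughput)) ** 2
```

Here `window_for(TARGET)` is about 83,333 segments and `loss_rate_for(TARGET)` is about 2.1e-10, in line with the slide's rounded figures.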


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
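The convergence argument can be checked with a toy simulation. This is a sketch under strong assumptions: both sessions have the same RTT, both see every loss at the same time, and rates are in arbitrary units. The function name `aimd` and its parameters are hypothetical.

```python
def aimd(x1, x2, capacity, rounds=1000):
    """Two flows sharing one bottleneck under synchronized AIMD.

    Each round both flows add 1 unit of rate (additive increase). When the
    combined rate exceeds `capacity`, both see a loss and halve their rate
    (multiplicative decrease).
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2
        else:
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2
```

Additive increase leaves the gap between the two rates unchanged, while every loss halves it, so starting from an unequal split such as `aimd(80, 10, 100)` the two rates end up essentially equal.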


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
• Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets R/2
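The parallel-connection example is simple arithmetic once each connection is assumed to get an equal share of the link. A short sketch (the function name `app_share` is hypothetical, and equal per-connection shares are an idealization):

```python
from fractions import Fraction


def app_share(app_connections, other_connections=9):
    """Fraction of link rate R a new app gets under equal per-connection shares."""
    total = app_connections + other_connections
    return Fraction(app_connections, total)
```

`app_share(1)` is 1/10, and `app_share(11)` is 11/20, which the slide rounds to R/2.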


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• Data transmission delay
• Slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                                                                                                                                        Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


                                                                                                                                                                                                                        Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Wait for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
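Both fixed-window cases fit in one small function. This is a sketch that follows the slide's notation (O, S, R, RTT, W); the function name is hypothetical, and K is taken here to be the number of windows needed to cover the object, ceil(O / (W·S)), which is an assumption since these slides do not define K for the fixed-window case.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    base = 2 * RTT + O / R
    if W * S / R >= RTT + S / R:
        # Case 1: the first ACK returns before the window is used up,
        # so the server never has to stop sending.
        return base
    # Case 2: the server stalls after every window except the last one.
    K = math.ceil(O / (W * S))  # windows covering the object (assumed definition)
    stall = S / R + RTT - W * S / R
    return base + (K - 1) * stall
```

For instance, with O = 80,000 bits, S = 8,000 bits, R = 1 Mbps and RTT = 100 ms, a window of W = 4 falls into the second case (K = 3, two stalls of 76 ms each) and gives about 0.432 s, while W = 20 falls into the first case and gives about 0.28 s.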


                                                                                                                                                                                                                        TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

We will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size
- and K is the number of windows that cover the object
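As a minimal sketch, the closed form above can be evaluated directly. The function name `slow_start_latency` and the sample values are illustrative assumptions, not part of the slides; the values are chosen so that S/R is one time unit and the object is 15 segments long, which gives K = 4 and Q = 2.

```python
def slow_start_latency(O, S, R, RTT, K, Q):
    """Closed-form delay for one object under slow start (no loss, no threshold).

    O: object size (bits), S: segment size (bits), R: link rate (bits/sec),
    RTT: round-trip time (sec), K: windows covering the object,
    Q: idle periods for an infinite-size object.
    """
    P = min(Q, K - 1)  # number of times the server actually idles
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical values: S/R = 1 time unit, O/S = 15 segments, RTT = 1.5 time units.
latency = slow_start_latency(O=15, S=1, R=1, RTT=1.5, K=4, Q=2)
print(latency)  # 20.0
```

Here 2 RTT contributes 3, O/R contributes 15, and the two idle periods add 5 - 3 = 2 time units.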

                                                                                                                                                                                                                        TCP Latency Modeling Slow Start (2)

[Figure: client-server timing diagram for slow start. Labels: initiate TCP connection; request object; RTT; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.
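The example can also be reproduced by stepping through the windows one at a time instead of using a closed form. The sketch below is an illustration under stated assumptions: the function name `simulate_slow_start` is hypothetical, the window doubles after every round, and RTT = 1.5 and S/R = 1 time units are sample values chosen so that Q = 2 as in the example.

```python
def simulate_slow_start(num_segments, S, R, RTT):
    """Step through slow-start windows and record the idle time after each one."""
    windows, idle_times = [], []
    remaining, size = num_segments, 1
    while remaining > 0:
        sent = min(size, remaining)
        windows.append(sent)
        remaining -= sent
        if remaining > 0:
            # The first ACK arrives S/R + RTT after the window starts;
            # transmitting the whole window takes size * S/R.
            idle_times.append(max(0.0, S / R + RTT - size * S / R))
        size *= 2  # slow start: window doubles each round
    # 2 RTT for setup and request, O/R to transmit, plus the idle periods.
    delay = 2 * RTT + num_segments * S / R + sum(idle_times)
    return windows, idle_times, delay


windows, idle_times, delay = simulate_slow_start(15, S=1, R=1, RTT=1.5)
print(windows)     # [1, 2, 4, 8]
print(idle_times)  # [1.5, 0.5, 0.0]
print(delay)       # 20.0
```

Four windows cover the 15 segments (K = 4), and the server idles after the first two of them (P = 2).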

                                                                                                                                                                                                                        TCP Latency Modeling (3)

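$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$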
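The last step of this derivation relies on a geometric series. Spelled out, and assuming each of the first P idle terms is non-negative (so the sum needs no clipping at zero), it reads:

```latex
\begin{align*}
\sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]
  &= P\left[\frac{S}{R} + RTT\right] - \frac{S}{R}\sum_{k=1}^{P} 2^{k-1} \\
  &= P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R},
  \qquad\text{since } \sum_{k=1}^{P} 2^{k-1} = 2^{P} - 1 .
\end{align*}
```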

[Figure: the slow-start timing diagram, repeated. Labels: initiate TCP connection; request object; RTT; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]

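TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, the number of idles for an infinite-size object, is similar.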
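A short sketch of both calculations follows. K is found by searching the min-definition directly and compared with the ceiling formula. The slides do not write out a formula for Q, so the loop below is an assumption: it counts the windows k = 1, 2, ... for which the idle-time expression S/R + RTT - 2^(k-1) S/R is still positive. Function names and sample values are hypothetical.

```python
import math


def windows_to_cover(O, S):
    """K: smallest k with S * (2^0 + ... + 2^(k-1)) >= O, by direct search."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S
        k += 1
    return k


def idles_for_infinite_object(S, R, RTT):
    """Q (assumed reading): count windows whose idle time S/R + RTT - 2^(k-1)*S/R is > 0."""
    q = 0
    while S / R + RTT - (2 ** q) * S / R > 0:
        q += 1
    return q


O, S, R, RTT = 15, 1, 1, 1.5               # illustrative values: O/S = 15 segments
K = windows_to_cover(O, S)
K_closed = math.ceil(math.log2(O / S + 1))  # ceiling formula from the derivation
Q = idles_for_infinite_object(S, R, RTT)
print(K, K_closed, Q, min(Q, K - 1))        # 4 4 2 2
```

The direct search avoids floating-point rounding in the logarithm, which can matter when O/S + 1 is very close to a power of two.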

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
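The three formulas can be compared side by side with a small sketch. It deliberately leaves out the "sum of idle times" term, which depends on the slow-start behaviour of each connection, so the results are lower bounds rather than full response times. The function name and the sample parameters (including 1 Kbyte = 1000 bytes) are assumptions for illustration.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times from the three formulas, excluding the idle-time sum.

    O: object size (bits), R: link rate (bits/sec), RTT: round-trip time (sec),
    M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the formula assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# Hypothetical usage: O = 5 Kbytes (taken as 40,000 bits), 1 Mbps link, RTT = 100 msec.
for name, seconds in http_response_times(O=40_000, R=1_000_000, RTT=0.1, M=10, X=5).items():
    print(f"{name}: at least {seconds:.2f} s")
```

With M = 10 and X = 5 the round-trip overhead is 22 RTT for non-persistent, 3 RTT for persistent, and 6 RTT for parallel non-persistent connections.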

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time are dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.
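One way to see why the delay•bandwidth product matters is to compare it with the object size. The sketch below is an illustration only: it assumes 1 Kbyte = 1000 bytes and 1 Kbps = 1000 bits/sec, and the function name is hypothetical. When the product is much larger than one object, the link could carry the whole object in a fraction of an RTT, so the time spent waiting on round trips dominates.

```python
def bandwidth_delay_product_bits(rate_bps, rtt_s):
    """Bits that fit 'in flight' on the path during one round-trip time."""
    return rate_bps * rtt_s


OBJECT_BITS = 5 * 1000 * 8  # O = 5 Kbytes, assuming 1 Kbyte = 1000 bytes
RTT = 1.0                   # seconds, as on this slide

for label, rate in [("28 Kbps", 28_000), ("100 Kbps", 100_000),
                    ("1 Mbps", 1_000_000), ("10 Mbps", 10_000_000)]:
    bdp = bandwidth_delay_product_bits(rate, RTT)
    print(f"{label}: {bdp:.0f} bits per RTT = {bdp / OBJECT_BITS:.2f} objects")
```

At 28 Kbps the product is smaller than one object, while at 10 Mbps it is 250 objects, which is where saving round trips pays off most.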

Chapter 3 Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

The reason for treating 3 dup ACKs differently from a timeout is that 3 dup ACKs indicate the network is capable of delivering some segments, while a timeout before 3 dup ACKs is "more alarming".

Note that the older protocol, TCP Tahoe, treated both types of loss event the same and always went to slow start with CongWin = 1 MSS after a loss event.

TCP Reno's skipping of slow start for a 3-dup-ACK loss event is known as fast recovery.

Summary: TCP Congestion Control

When CongWin is below Threshold, the sender is in the slow-start phase; the window grows exponentially.

When CongWin is above Threshold, the sender is in the congestion-avoidance phase; the window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 dup ACKs as well).

                                                                                                                                                                                                                          The Big Picture

TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
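The event table above can be sketched as a small state machine. This is only an illustrative sketch, not a real TCP implementation: the class name RenoSender, the initial Threshold of 64 MSS, measuring the window in units of one MSS, and resetting the duplicate-ACK counter on a new ACK are all assumptions made for the example.

```python
from dataclasses import dataclass

MSS = 1  # assumption: measure the window in units of one MSS


@dataclass
class RenoSender:
    """Hypothetical sender that only tracks CongWin, Threshold and state."""
    cong_win: float = 1 * MSS
    threshold: float = 64 * MSS  # assumed initial value, not from the slides
    state: str = "SS"            # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: a new ACK resets the duplicate count
        if self.state == "SS":
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT overall.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_dup_ack(self):
        """Duplicate ACK: count it; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.on_triple_dup_ack()

    def on_triple_dup_ack(self):
        """Fast recovery (Reno): halve the window and skip slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, 1 * MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0

    def on_timeout(self):
        """Timeout: halve Threshold and restart from 1 MSS in slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = 1 * MSS
        self.state = "SS"
        self.dup_acks = 0
```

A Tahoe-style sender could be approximated under the same assumptions by having on_triple_dup_ack call on_timeout instead.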

TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
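The factor 0.75 can be checked with one line of arithmetic, assuming the window climbs linearly from W/2 back to W between loss events and that RTT stays constant (both are simplifications of this model):

$$\overline{W} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}W \quad\Rightarrow\quad \text{average throughput} \approx \frac{0.75\,W}{RTT}$$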

TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{Throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed networks are needed.
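The numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes the throughput relation 1.22 * MSS / (RTT * sqrt(L)) solved for L, and the function names are made up for illustration.

```python
def window_for_throughput(rate_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain rate_bps (W = rate * RTT / MSS)."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def loss_rate_for_throughput(rate_bps, rtt_s, segment_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = segment_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


w = window_for_throughput(10e9, 0.100, 1500)
loss = loss_rate_for_throughput(10e9, 0.100, 1500)
print(f"W is about {w:.0f} segments, L is about {loss:.1e}")
```

The computed loss rate is on the order of 2e-10, i.e. roughly one loss per five billion segments, which is why the slide concludes that standard TCP is not suited to such links.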

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]

Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
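The convergence toward the equal-share line can be illustrated with a toy simulation. This is a sketch under strong assumptions that go beyond the slide: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and the function name and parameters are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds=500, increase=1.0):
    """Run synchronized additive-increase / multiplicative-decrease rounds.

    x1, x2: starting rates of the two sessions (same units as capacity).
    Returns the two rates after the given number of rounds.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both sessions halve, which also halves their difference.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both add the same amount, so the difference is unchanged.
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2


a, b = aimd_two_flows(10.0, 80.0, capacity=100.0)
print(f"rates after simulation: {a:.2f} and {b:.2f}")
```

Because only the halving step changes the gap between the two rates, and always shrinks it, the sessions drift toward equal shares regardless of where they start.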

Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts.
Web browsers do this.
Example: link of rate R supporting 9 connections
new app asks for 1 TCP, gets rate R/10
new app asks for 11 TCPs, gets roughly R/2 (11R/20)
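The parallel-connection example is simple arithmetic if one assumes that every TCP connection on the link ends up with an equal share of R. The helper below is a hypothetical illustration of that assumption.

```python
from fractions import Fraction


def app_share(existing, new_connections):
    """Fraction of the link rate R obtained by an app opening new_connections
    connections next to `existing` ones, assuming equal per-connection shares."""
    return Fraction(new_connections, existing + new_connections)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```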

TCP Latency Modeling

Notation, assumptions:
Assume one link between client and server of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume fixed congestion window, W segments
Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
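The two fixed-window cases can be folded into one small function. This is a sketch: the function name is made up, and taking K as the number of windows needed to cover the object, ceil(O / (W*S)), is an assumption based on how K is described for the slow-start model.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency of fetching an object of O bits with a fixed window of W segments.

    S: segment size in bits, R: link rate in bits/s, RTT: round-trip time in s.
    """
    K = math.ceil(O / (W * S))           # assumption: windows covering the object
    stall = S / R + RTT - W * S / R      # idle time after each window (case 2)
    if stall <= 0:
        # Case 1: ACKs return before the window is exhausted, sender never idles.
        return 2 * RTT + O / R
    # Case 2: sender idles after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * stall


# Example values (assumptions): 16 segments of 1000 bits, 1 Mbps link, 10 ms RTT.
print(fixed_window_latency(16000, 1000, 1e6, 0.010, W=4))
print(fixed_window_latency(16000, 1000, 1e6, 0.010, W=16))
```

With these example values the small window stalls three times while the large window only pays 2RTT plus the transmission time O/R.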

TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with "time at client" and "time at server" axes: initiate TCP connection (one RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
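The example above can be checked numerically. The sketch below counts K and Q directly instead of using closed forms: the k-th window is taken to hold 2^(k-1) segments, and the server is taken to idle after window k whenever S/R + RTT - 2^(k-1) S/R is positive. The function name and the concrete values of S, R and RTT are assumptions chosen so that O/S = 15 and Q = 2.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for the slow-start model with no loss and no threshold."""
    segments = math.ceil(O / S)

    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K, covered = 0, 0
    while covered < segments:
        covered += 2 ** K
        K += 1

    # Q: how many times the server would idle for an infinitely large object.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1

    P = min(K - 1, Q)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


# Assumed values: 15 segments of 1000 bits, 1 Mbps link, RTT = 2 * S/R = 2 ms.
latency, K, Q, P = slow_start_latency(15000, 1000, 1e6, 0.002)
print(K, Q, P, latency)
```

With these assumed values the function reproduces K = 4, Q = 2 and P = 2 from the example.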

                                                                                                                                                                                                                          TCP Latency Modeling (3)

$S/R + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1} S/R$ = time to transmit the kth window

$\left[ S/R + RTT - 2^{k-1} S/R \right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right] \\
&= \frac{O}{R} + 2RTT + P\left[ RTT + \frac{S}{R} \right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]
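As a rough numerical check of the delay formula, the sketch below adds up the idle time after each window directly instead of using the closed form. The function name, parameter names, and sample values are illustrative assumptions, not part of the slides; the model assumes no loss and a first window of one segment.

```python
def slow_start_latency(O, S, R, RTT):
    """Delay to fetch one O-bit object under the slide model (hypothetical helper).

    Assumes no loss, a first window of one segment, and a window that
    doubles every round. O and S are in bits, R in bits/sec, RTT in seconds.
    """
    k = 1          # index of the window just sent
    covered = 1    # segments sent so far: 2^0 + ... + 2^(k-1)
    idle = 0.0
    while covered * S < O:  # object not yet covered, so another window follows
        # Idle time after window k: [S/R + RTT - 2^(k-1) * S/R]^+
        idle += max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)
        k += 1
        covered += 2 ** (k - 1)
    return 2 * RTT + O / R + idle


# Sample values (assumed): 15 segments of 1000 bits over a 1 Mbps link.
delay = slow_start_latency(O=15_000, S=1_000, R=1_000_000, RTT=0.1)
print(f"{delay:.3f} s")  # 0.511 s
```

With these values the object needs four windows and the server idles after the first three, so the closed form with P = 3 gives the same 0.511 s.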


TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

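$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$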
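The derivation of K can be checked numerically. This is a minimal sketch that compares the definition (smallest k whose windows cover the object) with the closed form; both function names are hypothetical, and O and S are assumed to be in the same unit (bits).

```python
import math


def windows_by_search(O, S):
    """Smallest k with 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k, covered = 0, 0          # covered = segments in the first k windows
    while covered * S < O:
        covered += 2 ** k      # window k+1 carries 2^k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), the last line of the derivation."""
    return math.ceil(math.log2(O / S + 1))


for O in (15_000, 16_000, 40_000):
    print(O, windows_by_search(O, 1_000), windows_closed_form(O, 1_000))
```

The search version uses only integer arithmetic when O and S are integers, so it avoids any floating-point rounding in the logarithm.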

Calculation of Q, number of idles for infinite-size object, is similar


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
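The three response-time formulas can be compared side by side with a small sketch. It reads the formulas as (M+1)O/R plus the per-model RTT term plus idle time. Function and parameter names are hypothetical. Because the slides do not spell out how the idle times add up for each model, the sum of idle times is passed in as `idle` (default 0, which gives only the transmission and RTT terms).

```python
def nonpersistent_time(M, O, R, RTT, idle=0.0):
    """M+1 TCP connections in series, 2 RTT each."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent_time(M, O, R, RTT, idle=0.0):
    """2 RTT for the base HTML file, 1 RTT for the M images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_time(M, X, O, R, RTT, idle=0.0):
    """1 connection for the base file, then M/X sets of X parallel connections."""
    if M % X != 0:
        raise ValueError("the model supposes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


# Assumed sample values: O = 5 Kbytes taken as 40,000 bits, R = 1 Mbps.
M, X, O, R, RTT = 10, 5, 40_000, 1_000_000, 0.1
for name, t in [
    ("non-persistent", nonpersistent_time(M, O, R, RTT)),
    ("persistent", persistent_time(M, O, R, RTT)),
    ("parallel non-persistent", parallel_time(M, X, O, R, RTT)),
]:
    print(f"{name}: {t:.2f} s before idle times")
```

With these assumed values the three models come to roughly 2.64 s, 0.74 s and 1.04 s before idle times are added, so the gap between them comes entirely from the RTT term.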


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For low bandwidth, connection and response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 70) for non-persistent, persistent and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps]

For larger RTT, response time dominated by TCP establishment and slow start delays.

Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3 Transport Layer last revised 160305
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
                                                                                                                                                                                                                          • rdt22 sender receiver fragments
                                                                                                                                                                                                                          • rdt30 channels with errors and loss
                                                                                                                                                                                                                          • rdt30 sender
                                                                                                                                                                                                                          • rdt30 in action
                                                                                                                                                                                                                          • rdt30 in action
                                                                                                                                                                                                                          • Performance of rdt30
                                                                                                                                                                                                                          • rdt30 stop-and-wait operation
                                                                                                                                                                                                                          • Pipelined protocols
                                                                                                                                                                                                                          • Pipelined protocols
                                                                                                                                                                                                                          • Pipelining increased utilization
                                                                                                                                                                                                                          • Go-Back-N
                                                                                                                                                                                                                          • GBN Sender
                                                                                                                                                                                                                          • GBN sender extended FSM
                                                                                                                                                                                                                          • GBN receiver extended FSM
                                                                                                                                                                                                                          • More on receiver
                                                                                                                                                                                                                          • GBN inaction
                                                                                                                                                                                                                          • Selective Repeat
                                                                                                                                                                                                                          • Selective repeat sender receiver windows
                                                                                                                                                                                                                          • Selective repeat
                                                                                                                                                                                                                          • Selective repeat in action
                                                                                                                                                                                                                          • Selective repeat dilemma
                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                          • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                                                          • More TCP Details
                                                                                                                                                                                                                          • Even More TCP Details
                                                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                                                          • TCP seq rsquos and ACKs
                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                          • Example RTT estimation
                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                          • TCP reliable data transfer
                                                                                                                                                                                                                          • TCP sender events
                                                                                                                                                                                                                          • TCP sender(simplified)
                                                                                                                                                                                                                          • TCP retransmission scenarios
                                                                                                                                                                                                                          • TCP retransmission scenarios (more)
                                                                                                                                                                                                                          • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                                          • More on Sender Policies
                                                                                                                                                                                                                          • Fast Retransmit
                                                                                                                                                                                                                          • Fast retransmit algorithm
                                                                                                                                                                                                                          • TCP GBN or Selective Repeat
                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                                                          • TCP Flow control how it works
                                                                                                                                                                                                                          • Technical Issue
                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                          • TCP Connection Management
                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                          • A few special cases
                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                          • Principles of Congestion Control
                                                                                                                                                                                                                          • Causescosts of congestion scenario 1
                                                                                                                                                                                                                          • Causescosts of congestion scenario 2
                                                                                                                                                                                                                          • Causescosts of congestion scenario 3
                                                                                                                                                                                                                          • Causescosts of congestion scenario 3
                                                                                                                                                                                                                          • Approaches towards congestion control
                                                                                                                                                                                                                          • Case study ATM ABR congestion control
                                                                                                                                                                                                                          • Case study ATM ABR congestion control
                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                          • TCP Congestion Control
                                                                                                                                                                                                                          • TCP AIMD
                                                                                                                                                                                                                          • TCP Slow Start
                                                                                                                                                                                                                          • TCP Slow Start (more)
                                                                                                                                                                                                                          • Summary TCP Congestion Control
                                                                                                                                                                                                                          • The Big Picture
                                                                                                                                                                                                                          • TCP sender congestion control
                                                                                                                                                                                                                          • TCP throughput
                                                                                                                                                                                                                          • TCP Futures
                                                                                                                                                                                                                          • TCP Fairness
                                                                                                                                                                                                                          • Why is TCP fair
                                                                                                                                                                                                                          • Fairness (more)
                                                                                                                                                                                                                          • TCP Latency Modeling
                                                                                                                                                                                                                          • Fixed Congestion Window (W)
                                                                                                                                                                                                                          • Fixed congestion window (1)
                                                                                                                                                                                                                          • Fixed congestion window (2)
                                                                                                                                                                                                                          • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                          • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                          • TCP Latency Modeling (3)
                                                                                                                                                                                                                          • TCP Latency Modeling (4)
                                                                                                                                                                                                                          • HTTP Modeling
                                                                                                                                                                                                                          • Chapter 3 Summary


Summary: TCP Congestion Control

When CongWin is below Threshold, the sender is in the slow-start phase; the window grows exponentially.

When CongWin is above Threshold, the sender is in the congestion-avoidance phase; the window grows linearly.

When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold (only in TCP Reno).

When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS (TCP Tahoe does this for 3 duplicate ACKs as well).


                                                                                                                                                                                                                            The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unACKed data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Results in a doubling of CongWin every RTT

Event: ACK receipt for previously unACKed data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being ACKed
Commentary: CongWin and Threshold not changed
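
The event/action rules on this slide can be sketched as a small state machine. This is only an illustrative sketch, not an actual TCP implementation: the class name `RenoSender`, the method names, the MSS value of 1460 bytes, the initial threshold, and the choice to reset the duplicate-ACK counter after a loss event are all assumptions made for the example. Window inflation during fast recovery is deliberately left out, since the slide does not describe it.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; illustrative value, not taken from the slides


@dataclass
class RenoSender:
    """Hypothetical model of the sender-side rules in the slide's table."""
    cong_win: float = MSS          # congestion window, in bytes
    threshold: float = 64 * 1024   # assumed initial threshold, in bytes
    state: str = "SS"              # "SS" (slow start) or "CA" (congestion avoidance)
    dup_acks: int = 0              # duplicate ACKs seen for the current segment

    def on_new_ack(self) -> None:
        """ACK receipt for previously unACKed data."""
        self.dup_acks = 0
        if self.state == "SS":
            # One MSS per ACK doubles CongWin every RTT.
            self.cong_win += MSS
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about one MSS per RTT in total.
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self) -> None:
        """Duplicate ACK: only the counter changes until the third one."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self._on_triple_duplicate_ack()

    def _on_triple_duplicate_ack(self) -> None:
        """Loss detected by triple duplicate ACK: multiplicative decrease."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
        self.state = "CA"
        self.dup_acks = 0  # assumption: start counting afresh

    def on_timeout(self) -> None:
        """Timeout: collapse the window and re-enter slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

For example, starting with `RenoSender(threshold=4 * MSS)`, four calls to `on_new_ack()` raise CongWin to 5 MSS and switch the state to "CA"; a later `on_timeout()` halves Threshold and returns CongWin to 1 MSS.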


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after loss, the window drops to W/2 and throughput to W/(2·RTT).
Average throughput: 0.75 W/RTT
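
The 0.75 factor follows from the window ramping linearly from W/2 back up to W between losses, so its average is (W/2 + W)/2. A minimal numeric check, assuming the window grows by one segment per RTT and that W is even (the function name is hypothetical):

```python
def average_sawtooth_throughput(w_max: int, rtt: float) -> float:
    """Average throughput (segments per second) over one sawtooth cycle.

    Assumes the window climbs one segment per RTT from w_max/2 to w_max,
    then a loss halves it again. w_max is assumed to be even.
    """
    windows = range(w_max // 2, w_max + 1)      # window size in each RTT
    mean_window = sum(windows) / len(windows)   # equals 0.75 * w_max
    return mean_window / rtt


# With W = 100 segments and RTT = 0.1 s, compare against 0.75 * W / RTT.
print(average_sawtooth_throughput(100, 0.1), 0.75 * 100 / 0.1)
```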


                                                                                                                                                                                                                            TCP Futures

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
- Requires window size W = 83,333 in-flight segments.
- Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

- So L = 2·10^-10. Wow!
- New versions of TCP for high-speed networks are needed.
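The two numbers in this example can be reproduced with a few lines of arithmetic. This is a minimal sketch that inverts the throughput formula for L; the function and constant names are illustrative assumptions.

```python
MSS_BITS = 1500 * 8   # 1500-byte segments, in bits
RTT = 0.100           # round-trip time in seconds
TARGET = 10e9         # desired throughput: 10 Gbps


def window_segments(throughput_bps: float, rtt_s: float, mss_bits: int) -> float:
    """Segments that must be in flight to sustain the given throughput."""
    return throughput_bps * rtt_s / mss_bits


def required_loss_rate(throughput_bps: float, rtt_s: float, mss_bits: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


W = window_segments(TARGET, RTT, MSS_BITS)
L = required_loss_rate(TARGET, RTT, MSS_BITS)
print(f"W = {W:.0f} segments, L = {L:.1e}")
```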


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
- Additive increase gives a slope of 1 as throughput increases.
- Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the equal-bandwidth-share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
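The convergence argument can be illustrated with a toy simulation. The sketch below assumes two flows with equal RTT that both detect loss in the same round whenever their combined rate exceeds the link capacity; these synchronization assumptions and the name `aimd` are simplifications, not part of the slides.

```python
def aimd(x1: float, x2: float, capacity: float, rounds: int) -> tuple[float, float]:
    """Two flows share one link; both back off when the link is overloaded."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # multiplicative decrease halves the gap
        else:
            x1, x2 = x1 + 1, x2 + 1    # additive increase keeps the gap unchanged
    return x1, x2


x1, x2 = aimd(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(abs(x1 - x2))  # remaining difference between the two rates
```

Each loss event halves the difference between the two rates while additive increase leaves it unchanged, so the pair drifts toward the equal-share line.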


Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
- Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss.
- Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
- Nothing prevents an app from opening parallel connections between 2 hosts.
- Web browsers do this.
- Example: link of rate R supporting 9 connections:
  - new app asks for 1 TCP connection, gets rate R/10
  - new app asks for 11 TCP connections, gets about R/2 (11R/20)
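The parallel-connection example is a one-line share calculation. A small sketch, assuming every TCP connection on the link receives an equal share of R (the function name `app_share` is illustrative):

```python
from fractions import Fraction


def app_share(existing: int, new: int) -> Fraction:
    """Fraction of link rate R obtained by an app that opens `new` connections
    on a link already carrying `existing` connections (equal per-connection share)."""
    return Fraction(new, existing + new)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```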


TCP Latency Modeling

Notation, assumptions
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- No retransmissions (no loss, no corruption)

Window size
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start


Fixed Congestion Window (W)

Two cases (K = number of windows that cover the object):

1. WS/R > RTT + S/R: ACK for the first segment in the window returns before a window's worth of data is sent.
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for the first segment in the window returns after a window's worth of data is sent.
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
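The two cases can be folded into one function. This is a sketch under the stated model; it assumes K is computed as the object size divided by the window size, rounded up, and the function name and example numbers are illustrative.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency in seconds for an O-bit object sent in S-bit segments over a
    link of rate R (bits/s) with a fixed window of W segments."""
    K = math.ceil(O / (W * S))           # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R      # server idle time per window (case 2)
    latency = 2 * RTT + O / R            # connection setup + request + transmission
    if stall > 0:                        # case 2: ACK arrives after window is sent
        latency += (K - 1) * stall
    return latency


# Example values (assumptions): 100 kbit object, 1 kbit segments, 1 Mbps, 100 ms RTT
print(fixed_window_latency(100_000, 1_000, 1_000_000, 0.1, W=4))    # case 2
print(fixed_window_latency(100_000, 1_000, 1_000_000, 0.1, W=200))  # case 1
```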


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data is sent.

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server waits for an ACK after sending a window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit the object
- time the server idles due to slow start

Server idles P = min{K-1, Q} times.

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times.

[Figure: timeline with time at client and time at server. After one RTT to initiate the TCP connection, the client requests the object; the server sends a first window (S/R), second window (2S/R), third window (4S/R), and fourth window (8S/R), then transmission is complete and the object is delivered.]
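The example values K = 4, Q = 2, P = 2 can be reproduced with a short calculation. The sketch below assumes slow start doubles the window each round starting from one segment, and it picks RTT = 2·S/R so that Q comes out as 2; that RTT value, the link rate, and the function name are assumptions, since the slide does not give them.

```python
import math


def slow_start_latency(O: float, S: float, R: float, RTT: float):
    """Return (K, Q, P, latency) for the slow-start latency model."""
    segments = math.ceil(O / S)
    # K: smallest k such that 1 + 2 + ... + 2^(k-1) >= O/S
    K = 1
    while 2 ** K - 1 < segments:
        K += 1
    # Q: number of windows after which the server would idle for an infinite object,
    # i.e. windows k with S/R + RTT - 2^(k-1) * S/R > 0
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# O/S = 15 segments; S/R = 1 s and RTT = 2 s are illustrative choices
print(slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2.0))
```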


                                                                                                                                                                                                                            TCP Latency Modeling (3)

                                                                                                                                                                                                                            ementacknowledg receivesserver until

                                                                                                                                                                                                                            segment send tostartsserver whenfrom time=+ RTTRS

                                                                                                                                                                                                                            RS

                                                                                                                                                                                                                            RSRTTPRTT

                                                                                                                                                                                                                            RO

                                                                                                                                                                                                                            RSRTT

                                                                                                                                                                                                                            RSRTT

                                                                                                                                                                                                                            RO

                                                                                                                                                                                                                            idleTimeRTTRO

                                                                                                                                                                                                                            P

                                                                                                                                                                                                                            kP

                                                                                                                                                                                                                            k

                                                                                                                                                                                                                            P

                                                                                                                                                                                                                            pp

                                                                                                                                                                                                                            )12(][2

                                                                                                                                                                                                                            ]2[2

                                                                                                                                                                                                                            2delay

                                                                                                                                                                                                                            1

                                                                                                                                                                                                                            1

                                                                                                                                                                                                                            1

                                                                                                                                                                                                                            minusminus+++=

                                                                                                                                                                                                                            minus+++=

                                                                                                                                                                                                                            ++=

                                                                                                                                                                                                                            minus

                                                                                                                                                                                                                            =

                                                                                                                                                                                                                            =

                                                                                                                                                                                                                            sum

                                                                                                                                                                                                                            sum

                                                                                                                                                                                                                            th window after the timeidle 2 1 kRSRTT

                                                                                                                                                                                                                            RS k =⎥⎦

                                                                                                                                                                                                                            ⎤⎢⎣⎡ minus+

                                                                                                                                                                                                                            +minus

                                                                                                                                                                                                                            window kth the transmit totime2 1 =minus

                                                                                                                                                                                                                            RSk

                                                                                                                                                                                                                            RTT

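[Figure: client-server timing diagram for TCP slow start. Labeled events: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. Axes: time at client, time at server.]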
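The window timing on this slide can be checked numerically. The following is a minimal Python sketch, assuming the slow-start model used here: the kth window takes 2^(k-1) S/R to transmit, and the server then idles for [S/R + RTT - 2^(k-1) S/R]^+. The function names and the sample values of S, R, and RTT are illustrative assumptions, not taken from the slides.

```python
def window_transmit_time(k, S, R):
    """Time to transmit the k-th slow-start window (k = 1, 2, ...).

    The k-th window holds 2^(k-1) segments of S bits each, sent at R bits/sec.
    """
    return (2 ** (k - 1)) * S / R


def idle_time(k, S, R, rtt):
    """Server idle time after the k-th window: [S/R + RTT - 2^(k-1) S/R]^+.

    The [x]^+ notation means max(x, 0): once a window takes longer to send
    than S/R + RTT, the ACKs arrive before the window is finished and the
    server no longer stalls.
    """
    return max(0.0, S / R + rtt - window_transmit_time(k, S, R))


if __name__ == "__main__":
    # Assumed sample values, chosen only to keep the arithmetic simple.
    S = 1024      # segment size in bits
    R = 8192      # link rate in bits/sec, so S/R = 0.125 s
    RTT = 0.5     # round-trip time in seconds
    for k in range(1, 5):
        print(k, window_transmit_time(k, S, R), idle_time(k, S, R, RTT))
```

With these sample values the idle time shrinks from 0.5 s after the first window to zero after the fourth, at which point the server is sending continuously.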


TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

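$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$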

Calculation of Q, the number of idle periods for an infinite-size object, is similar.
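The closed form for K can be sanity-checked against a direct search over window counts. The sketch below assumes O and S are given in the same units (bits here); the function names and the sample segment size are hypothetical and chosen only for illustration.

```python
import math


def num_windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), the closed form from the derivation."""
    return math.ceil(math.log2(O / S + 1))


def num_windows_brute_force(O, S):
    """Smallest k such that 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k = 0
    covered = 0  # bits covered by the first k windows
    while covered < O:
        covered += (2 ** k) * S  # add the next window: 2^k segments
        k += 1
    return k


if __name__ == "__main__":
    O = 40000   # object size in bits (5 Kbytes, assuming 1 Kbyte = 1000 bytes)
    S = 4288    # assumed segment size in bits (536 bytes)
    print(num_windows_closed_form(O, S), num_windows_brute_force(O, S))
```

For these sample values both functions return 4: three windows cover 7S = 30016 bits, which is short of the object, while four windows cover 15S = 64320 bits. The brute-force version uses only integer arithmetic, so it is the safer choice if floating-point rounding in the logarithm is a concern.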


HTTP Modeling

Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive the base HTML file
  1 RTT to request and receive the M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for the base file
  M/X sets of parallel connections for the images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
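The three response-time formulas can be compared side by side. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, the sum of idle times is passed in as a parameter (defaulting to zero) because it depends on the slow-start analysis for each scheme, and 1 Kbyte is taken to be 1000 bytes.

```python
def nonpersistent(M, O, R, rtt, idle_sum=0.0):
    """(M+1) connections in series: (M+1)O/R + (M+1)*2RTT + idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * rtt + idle_sum


def persistent(M, O, R, rtt, idle_sum=0.0):
    """One connection, 2 RTT for the base file plus 1 RTT for the images."""
    return (M + 1) * O / R + 3 * rtt + idle_sum


def parallel_nonpersistent(M, X, O, R, rtt, idle_sum=0.0):
    """Base file on one connection, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * rtt + idle_sum


if __name__ == "__main__":
    M, X = 10, 5
    O = 5 * 1000 * 8      # 5 Kbytes in bits (assuming 1 Kbyte = 1000 bytes)
    RTT = 0.1             # 100 msec
    R = 1_000_000         # 1 Mbps
    print(nonpersistent(M, O, R, RTT))
    print(persistent(M, O, R, RTT))
    print(parallel_nonpersistent(M, X, O, R, RTT))
```

With idle times left at zero, these inputs give roughly 2.64 s, 0.74 s, and 1.04 s. Those figures are lower bounds on the model's response time, since any slow-start idle time adds to them.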


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                                                                            • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • Transport services and protocols
                                                                                                                                                                                                                            • Transport vs network layer
                                                                                                                                                                                                                            • Transport-layer protocols
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • Multiplexingdemultiplexing
                                                                                                                                                                                                                            • Multiplexingdemultiplexing
                                                                                                                                                                                                                            • How demultiplexing works
                                                                                                                                                                                                                            • Connectionless demultiplexing
                                                                                                                                                                                                                            • Connectionless demux (cont)
                                                                                                                                                                                                                            • Connection-oriented demux
                                                                                                                                                                                                                            • Connection-oriented demux (cont)
                                                                                                                                                                                                                            • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                                                                            • UDP more
                                                                                                                                                                                                                            • UDP checksum
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • Principles of Reliable data transfer
                                                                                                                                                                                                                            • Reliable data transfer getting started
                                                                                                                                                                                                                            • Reliable data transfer getting started
                                                                                                                                                                                                                            • Incremental Improvements
                                                                                                                                                                                                                            • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                                                                            • Rdt20 channel with bit errors
                                                                                                                                                                                                                            • rdt20 FSM specification
                                                                                                                                                                                                                            • rdt20 operation with no errors
                                                                                                                                                                                                                            • rdt20 error scenario
                                                                                                                                                                                                                            • rdt20 has a fatal flaw
                                                                                                                                                                                                                            • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                                                            • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                                                            • rdt21 discussion
                                                                                                                                                                                                                            • rdt22 a NAK-free protocol
                                                                                                                                                                                                                            • rdt22 sender receiver fragments
                                                                                                                                                                                                                            • rdt30 channels with errors and loss
                                                                                                                                                                                                                            • rdt30 sender
                                                                                                                                                                                                                            • rdt30 in action
                                                                                                                                                                                                                            • rdt30 in action
                                                                                                                                                                                                                            • Performance of rdt30
                                                                                                                                                                                                                            • rdt30 stop-and-wait operation
                                                                                                                                                                                                                            • Pipelined protocols
                                                                                                                                                                                                                            • Pipelined protocols
                                                                                                                                                                                                                            • Pipelining increased utilization
                                                                                                                                                                                                                            • Go-Back-N
                                                                                                                                                                                                                            • GBN Sender
                                                                                                                                                                                                                            • GBN sender extended FSM
                                                                                                                                                                                                                            • GBN receiver extended FSM
                                                                                                                                                                                                                            • More on receiver
                                                                                                                                                                                                                            • GBN inaction
                                                                                                                                                                                                                            • Selective Repeat
                                                                                                                                                                                                                            • Selective repeat sender receiver windows
                                                                                                                                                                                                                            • Selective repeat
                                                                                                                                                                                                                            • Selective repeat in action
                                                                                                                                                                                                                            • Selective repeat dilemma
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                                                            • More TCP Details
                                                                                                                                                                                                                            • Even More TCP Details
                                                                                                                                                                                                                            • TCP segment structure
                                                                                                                                                                                                                            • TCP seq. #'s and ACKs
                                                                                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                            • Example RTT estimation
                                                                                                                                                                                                                            • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • TCP reliable data transfer
                                                                                                                                                                                                                            • TCP sender events
                                                                                                                                                                                                                            • TCP sender (simplified)
                                                                                                                                                                                                                            • TCP retransmission scenarios
                                                                                                                                                                                                                            • TCP retransmission scenarios (more)
                                                                                                                                                                                                                            • TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                                                                            • More on Sender Policies
                                                                                                                                                                                                                            • Fast Retransmit
                                                                                                                                                                                                                            • Fast retransmit algorithm
                                                                                                                                                                                                                            • TCP: GBN or Selective Repeat?
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • TCP Flow Control
                                                                                                                                                                                                                            • TCP Flow Control
                                                                                                                                                                                                                            • TCP segment structure
                                                                                                                                                                                                                            • TCP Flow control: how it works
                                                                                                                                                                                                                            • Technical Issue
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • TCP Connection Management
                                                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                                                            • TCP Connection Management (cont)
                                                                                                                                                                                                                            • A few special cases
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • Principles of Congestion Control
                                                                                                                                                                                                                            • Causes/costs of congestion: scenario 1
                                                                                                                                                                                                                            • Causes/costs of congestion: scenario 2
                                                                                                                                                                                                                            • Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                            • Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                            • Approaches towards congestion control
                                                                                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                            • TCP Congestion Control
                                                                                                                                                                                                                            • TCP AIMD
                                                                                                                                                                                                                            • TCP Slow Start
                                                                                                                                                                                                                            • TCP Slow Start (more)
                                                                                                                                                                                                                            • Summary: TCP Congestion Control
                                                                                                                                                                                                                            • The Big Picture
                                                                                                                                                                                                                            • TCP sender congestion control
                                                                                                                                                                                                                            • TCP throughput
                                                                                                                                                                                                                            • TCP Futures
                                                                                                                                                                                                                            • TCP Fairness
                                                                                                                                                                                                                            • Why is TCP fair?
                                                                                                                                                                                                                            • Fairness (more)
                                                                                                                                                                                                                            • TCP Latency Modeling
                                                                                                                                                                                                                            • Fixed Congestion Window (W)
                                                                                                                                                                                                                            • Fixed congestion window (1)
                                                                                                                                                                                                                            • Fixed congestion window (2)
                                                                                                                                                                                                                            • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                            • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                            • TCP Latency Modeling (3)
                                                                                                                                                                                                                            • TCP Latency Modeling (4)
                                                                                                                                                                                                                            • HTTP Modeling
                                                                                                                                                                                                                            • Chapter 3 Summary


                                                                                                                                                                                                                              The Big Picture


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to "Congestion Avoidance"
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = Threshold; set state to "Congestion Avoidance"
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2; CongWin = 1 MSS; set state to "Slow Start"
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked
Commentary: CongWin and Threshold not changed
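
The event/action table above can be read as a small state machine. The following is a minimal Python sketch of it, not a real TCP implementation. The class name `TcpSender`, the method names, the 1460-byte `MSS`, and the initial threshold are all assumptions made for illustration. Resetting the duplicate-ACK counter on a new ACK, and flooring CongWin at 1 MSS after a triple duplicate ACK, are one reading of the table's commentary.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; assumed segment size, not taken from the slides


@dataclass
class TcpSender:
    """Hypothetical model of the sender's congestion-control state."""
    cong_win: float = MSS          # CongWin, in bytes
    threshold: float = 64 * 1024   # Threshold, in bytes (assumed initial value)
    state: str = "SS"              # "SS" = Slow Start, "CA" = Congestion Avoidance
    dup_acks: int = 0              # duplicate ACK count for the segment being acked

    def on_new_ack(self) -> None:
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: a new ACK restarts the duplicate count
        if self.state == "SS":
            self.cong_win += MSS                       # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += MSS * (MSS / self.cong_win)  # about +1 MSS per RTT

    def on_duplicate_ack(self) -> None:
        """Duplicate ACK: only the counter changes until the third duplicate."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self._on_triple_duplicate_ack()

    def _on_triple_duplicate_ack(self) -> None:
        """Loss detected by triple duplicate ACK: multiplicative decrease."""
        self.threshold = self.cong_win / 2
        self.cong_win = max(self.threshold, MSS)  # CongWin never drops below 1 MSS
        self.state = "CA"

    def on_timeout(self) -> None:
        """Timeout: fall back to slow start with a window of 1 MSS."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Driving the object with a sequence of `on_new_ack()`, `on_duplicate_ack()` and `on_timeout()` calls traces the familiar sawtooth of CongWin.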


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When window is W, throughput is W/RTT.
Just after loss, window drops to W/2, throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
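
The 0.75 factor follows from averaging a window that climbs linearly from W/2 back up to W. A small Python check of that arithmetic is sketched below. The function name `average_window` and the example values of W and RTT are assumptions for illustration, and the window is assumed to grow by one segment per RTT with W even.

```python
def average_window(w_max: int) -> float:
    """Mean window (in segments) over one sawtooth cycle, W/2 up to W.

    Assumes w_max is even and the window grows by 1 segment per RTT.
    """
    windows = range(w_max // 2, w_max + 1)  # W/2, W/2 + 1, ..., W
    return sum(windows) / len(windows)


W = 100     # window size at loss, in segments (assumed example value)
RTT = 0.1   # seconds (assumed example value)

avg_throughput = average_window(W) / RTT  # segments per second
print(average_window(W) / W)              # fraction of W, the 0.75 on the slide
```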


TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

This gives $L = 2 \cdot 10^{-10}$. Wow!
New versions of TCP for high-speed needed!
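
The numbers in this example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function names are hypothetical, and the loss rate comes from solving the throughput formula 1.22 · MSS / (RTT · sqrt(L)) for L.

```python
def window_segments(rate_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


W = window_segments(10e9, 0.100, 1500)     # about 83,333 segments
L = required_loss_rate(10e9, 0.100, 1500)  # about 2e-10
print(f"W = {W:.0f} segments, L = {L:.2e}")
```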


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput plotted against Connection 1 throughput, each axis marked at R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
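
The convergence argument can be illustrated with a toy simulation. The sketch below is not a model of real TCP: the function name `aimd_two_flows`, the starting rates, and the capacity are assumptions, and both sessions are assumed to have the same RTT and to see every loss at the same time. Additive increase leaves the gap between the two rates unchanged, while each halving cuts the gap in half, so the rates drift toward the equal-share line.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int):
    """Run two synchronized AIMD flows over a shared bottleneck.

    Each round (one RTT): if the combined rate exceeds capacity, both flows
    halve their rate (multiplicative decrease); otherwise both add 1
    (additive increase).
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2
        else:
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from fair: connection 1 at 80, connection 2 at 10, capacity R = 100.
r1, r2 = aimd_two_flows(80.0, 10.0, capacity=100.0, rounds=1000)
print(abs(r1 - r2))  # gap between the two rates after 1000 rounds
```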


Fairness (more)

Fairness and UDP

Multimedia apps often do not use TCP:
- they do not want their rate throttled by congestion control

Instead they use UDP:
- pump audio/video at a constant rate, tolerate packet loss

Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections

Nothing prevents an app from opening parallel connections between 2 hosts. Web browsers do this.

Example: link of rate R supporting 9 connections
- new app asks for 1 TCP connection, gets rate R/10
- new app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections, i.e. 11R/20)
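The arithmetic behind the parallel-connection example can be checked with a short sketch. It assumes that TCP splits the link rate equally among all connections sharing the bottleneck, which is the idealized fairness model used on these slides; the function name `app_share` is made up for illustration.

```python
from fractions import Fraction


def app_share(existing_connections: int, new_connections: int) -> Fraction:
    """Fraction of the link rate R obtained by an app that opens
    `new_connections` parallel TCP connections on a link that already
    carries `existing_connections`, assuming an equal per-connection share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

Under this assumption, an app's share depends only on how many of the connections it owns, which is why opening more parallel connections pays off.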

TCP Latency Modeling

Notation, assumptions:
- Assume one link between client and server, of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume a fixed congestion window of W segments
- Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
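The two cases can be folded into one small function. This is a sketch under the slide's assumptions (one link, no loss); it additionally assumes that K, the number of windows that cover the object, is O/(WS) rounded up. The function name and the numbers in the usage lines are illustrative only.

```python
import math


def fixed_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency in seconds to fetch an O-bit object with a fixed window of
    W segments of S bits each, over a link of rate R bits/s."""
    K = math.ceil(O / (W * S))       # assumption: windows needed to cover the object
    base = 2 * RTT + O / R           # connection setup + request, then transmission
    stall = S / R + RTT - W * S / R  # server idle time after each window (case 2)
    if stall <= 0:
        return base                  # case 1: ACKs return before the window is exhausted
    return base + (K - 1) * stall    # case 2: server stalls after every window but the last


# Illustrative values: 80,000-bit object, 8,000-bit segments, 1 Mbps link, RTT = 100 ms.
print(f"{fixed_window_latency(80_000, 8_000, 1_000_000, 0.1, W=2):.3f}")   # 0.648
print(f"{fixed_window_latency(80_000, 8_000, 1_000_000, 0.1, W=20):.3f}")  # 0.280
```

With W = 2 the server stalls four times waiting for ACKs; with W = 20 the whole object fits in one window and only the 2RTT + O/R terms remain.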

                                                                                                                                                                                                                              Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R

                                                                                                                                                                                                                              Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server waits for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

                                                                                                                                                                                                                              TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object

                                                                                                                                                                                                                              TCP Latency Modeling Slow Start (2)

[Figure: timing diagram with time at client and time at server. After one RTT to initiate the TCP connection, the client requests the object. The server then sends a first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R), until transmission is complete and the object is delivered.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles: P = min{K-1, Q} times

                                                                                                                                                                                                                              TCP Latency Modeling (3)

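$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$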

[Figure: slow-start timing diagram (time at client, time at server) showing connection initiation, object request, and the first through fourth windows of S/R, 2S/R, 4S/R and 8S/R.]

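TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.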
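The slow-start latency model can be put together in a few lines. The sketch below is not part of the slides: it assumes the closed form Q = floor(log2(1 + RTT/(S/R))) + 1, obtained by the same kind of argument as for K (Q is the largest k for which S/R + RTT - 2^(k-1) S/R is non-negative). The function names and numeric values are illustrative. A second function sums the per-window idle times directly, as a cross-check of the closed-form expression.

```python
import math


def slow_start_latency(O: float, S: float, R: float, RTT: float):
    """Return (latency_seconds, K, Q, P) using the closed-form expression."""
    K = math.ceil(math.log2(O / S + 1))               # windows that cover the object
    Q = math.floor(math.log2(1 + RTT / (S / R))) + 1  # assumed closed form for Q
    P = min(Q, K - 1)                                 # times the server actually idles
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


def latency_by_summing_idle_times(O: float, S: float, R: float, RTT: float) -> float:
    """Same latency, computed by adding the idle time after windows 1..K-1."""
    K = math.ceil(math.log2(O / S + 1))
    idle = sum(max(S / R + RTT - 2 ** (k - 1) * S / R, 0.0) for k in range(1, K))
    return 2 * RTT + O / R + idle


# Illustrative values: O/S = 15 segments of 8,000 bits, 100 kbps link, RTT = 100 ms.
latency, K, Q, P = slow_start_latency(O=120_000, S=8_000, R=100_000, RTT=0.1)
print(K, Q, P)            # 4 2 2
print(f"{latency:.2f}")   # 1.52
print(f"{latency_by_summing_idle_times(120_000, 8_000, 100_000, 0.1):.2f}")  # 1.52
```

These values reproduce the K = 4, Q = 2, P = 2 example. The server idles after the first and second windows only, because from the third window on, transmitting a window takes longer than S/R + RTT.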

HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
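The three response-time formulas above can be evaluated directly. Here O/R is the transmission time of one object over a link of rate R, and M/X is the number of rounds of parallel connections. The following Python sketch is not part of the original slides. The function names are hypothetical, and the sum of idle times is a caller-supplied value (`idle`, defaulting to 0), because the slow-start idle times depend on the window dynamics and are not computed here.

```python
def nonpersistent(O, R, RTT, M, idle=0.0):
    """Non-persistent HTTP: M+1 TCP connections in series.

    O in bits, R in bits/sec, RTT and idle in seconds.
    """
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(O, R, RTT, M, idle=0.0):
    """Persistent HTTP: 2 RTT for the base file, 1 RTT for the M images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_nonpersistent(O, R, RTT, M, X, idle=0.0):
    """Non-persistent HTTP with X parallel connections (M/X must be an integer)."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


if __name__ == "__main__":
    # Illustrative values (assumptions): 40,000-bit objects, 1 Mbps link,
    # RTT = 0.1 s, M = 10 images, X = 5 parallel connections, no idle time.
    O, R, RTT, M, X = 40_000, 1e6, 0.1, 10, 5
    print(nonpersistent(O, R, RTT, M))              # about 2.64 s
    print(persistent(O, R, RTT, M))                 # about 0.74 s
    print(parallel_nonpersistent(O, R, RTT, M, X))  # about 1.04 s
```

With idle times left at 0, the three results differ only in the RTT term, since the transmission term (M+1)O/R is the same in all three formulas.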


HTTP Response time (in seconds): RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time of non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps; vertical axis from 0 to 20 seconds.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds): RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time of non-persistent, persistent, and parallel non-persistent HTTP at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps; vertical axis from 0 to 70 seconds.]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay-bandwidth networks.
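A rough way to check both observations is to compare the three HTTP variants across the plotted link rates for RTT = 100 msec and RTT = 1 sec. The Python sketch below is not part of the original slides. It takes O, M, X, and the link rates from the plots, assumes 1 Kbyte = 1000 bytes, uses hypothetical function names, and ignores the sum of idle times. Because slow-start idle times are omitted, its numbers come out lower than the plotted curves, especially for RTT = 1 sec.

```python
O = 5 * 1000 * 8   # 5 Kbytes in bits, assuming 1 Kbyte = 1000 bytes
M, X = 10, 5       # M images, X parallel connections
RATES = {"28 Kbps": 28e3, "100 Kbps": 100e3, "1 Mbps": 1e6, "10 Mbps": 10e6}


def rtt_terms(RTT):
    """RTT-dependent part of each response-time formula (idle times ignored)."""
    return {
        "non-persistent": (M + 1) * 2 * RTT,
        "persistent": 3 * RTT,
        "parallel non-persistent": (M // X + 1) * 2 * RTT,
    }


def response_times(R, RTT):
    """Response time of each variant: common transmission term plus RTT term."""
    transmission = (M + 1) * O / R
    return {name: transmission + t for name, t in rtt_terms(RTT).items()}


def persistent_saving(R, RTT):
    """Fraction of the parallel non-persistent time saved by persistent HTTP."""
    t = response_times(R, RTT)
    return 1 - t["persistent"] / t["parallel non-persistent"]


if __name__ == "__main__":
    for RTT in (0.1, 1.0):
        for label, R in RATES.items():
            saving = persistent_saving(R, RTT)
            print(f"RTT={RTT:>4} s  {label:>8}: persistent saves {saving:.1%}")
```

Under these assumptions the saving is under 2% at 28 Kbps with RTT = 100 msec, where the transmission term dominates, and close to 50% at 10 Mbps with RTT = 1 sec, where the RTT terms dominate.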


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
                                                                                                                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                              • TCP Congestion Control
                                                                                                                                                                                                                              • TCP AIMD
                                                                                                                                                                                                                              • TCP Slow Start
                                                                                                                                                                                                                              • TCP Slow Start (more)
                                                                                                                                                                                                                              • Summary TCP Congestion Control
                                                                                                                                                                                                                              • The Big Picture
                                                                                                                                                                                                                              • TCP sender congestion control
                                                                                                                                                                                                                              • TCP throughput
                                                                                                                                                                                                                              • TCP Futures
                                                                                                                                                                                                                              • TCP Fairness
                                                                                                                                                                                                                              • Why is TCP fair
                                                                                                                                                                                                                              • Fairness (more)
                                                                                                                                                                                                                              • TCP Latency Modeling
                                                                                                                                                                                                                              • Fixed Congestion Window (W)
                                                                                                                                                                                                                              • Fixed congestion window (1)
                                                                                                                                                                                                                              • Fixed congestion window (2)
                                                                                                                                                                                                                              • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                              • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                              • TCP Latency Modeling (3)
                                                                                                                                                                                                                              • TCP Latency Modeling (4)
                                                                                                                                                                                                                              • HTTP Modeling
                                                                                                                                                                                                                              • Chapter 3 Summary


TCP sender congestion control

Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP sender action: CongWin = CongWin + MSS. If (CongWin > Threshold), set state to "Congestion Avoidance".
Commentary: Results in a doubling of CongWin every RTT.

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP sender action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT.

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP sender action: Threshold = CongWin/2, CongWin = Threshold. Set state to "Congestion Avoidance".
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP sender action: Threshold = CongWin/2, CongWin = 1 MSS. Set state to "Slow Start".
Commentary: Enter slow start.

Event: Duplicate ACK
State: SS or CA
TCP sender action: Increment duplicate ACK count for segment being acked.
Commentary: CongWin and Threshold not changed.
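The event table above can be sketched as a small state machine. This is a minimal illustration, not a real TCP implementation: the class name `TcpSender`, its method names, the MSS value of 1460 bytes, and the choice to reset the duplicate ACK counter on a new ACK are all assumptions made for this example.

```python
from dataclasses import dataclass

MSS = 1460  # bytes; illustrative value, not taken from the slides


@dataclass
class TcpSender:
    """Hypothetical sender tracking CongWin, Threshold, and state."""
    cong_win: float = MSS
    threshold: float = 64 * 1024
    state: str = "SS"   # "SS" = slow start, "CA" = congestion avoidance
    dup_acks: int = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0  # assumption: the counter restarts on a new ACK
        if self.state == "SS":
            self.cong_win += MSS            # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # additive increase: about 1 MSS per RTT
            self.cong_win += MSS * (MSS / self.cong_win)

    def on_duplicate_ack(self):
        """Duplicate ACK: only count it; the third one signals a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            # multiplicative decrease, never below 1 MSS
            self.threshold = max(self.cong_win / 2, MSS)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):
        """Timeout: halve the threshold and restart from 1 MSS in slow start."""
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

For example, starting with `TcpSender(cong_win=MSS, threshold=4 * MSS)`, four new ACKs take CongWin to 5 MSS and switch the sender to congestion avoidance; three duplicate ACKs then halve the window without leaving congestion avoidance.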


TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/RTT.
Just after loss, the window drops to W/2 and throughput to W/(2 RTT).
Average throughput: 0.75 W/RTT
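The 0.75 factor follows from averaging the sawtooth between W/2 and W. A minimal sketch, assuming the window grows by exactly one segment per RTT and that W is even (the function name is hypothetical):

```python
def average_sawtooth_throughput(w_max, rtt):
    """Average throughput (segments/second) over one sawtooth cycle.

    Assumes the window climbs from w_max/2 to w_max by one segment per RTT,
    then halves; w_max is assumed even so that w_max/2 is a whole segment.
    """
    windows = list(range(w_max // 2, w_max + 1))   # one window value per RTT
    average_window = sum(windows) / len(windows)   # equals 0.75 * w_max
    return average_window / rtt
```

With `w_max = 100` segments and `rtt = 0.1` seconds, the average window is 75 segments, which matches 0.75 W/RTT.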


TCP Futures

Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83,333 in-flight segments.
Throughput in terms of loss rate L:

\[ \text{Throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}} \]

Solving for the example gives L = 2 \cdot 10^{-10}. Wow!
New versions of TCP for high-speed needed!
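The two numbers in this example can be checked directly. The sketch below assumes the throughput relation 1.22 MSS / (RTT sqrt(L)) and simply inverts it for L; the function names are made up for illustration.

```python
def window_for_throughput(throughput_bps, rtt_s, segment_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def loss_rate_for_throughput(throughput_bps, rtt_s, mss_bytes):
    """Loss rate L obtained by solving throughput = 1.22*MSS/(RTT*sqrt(L))."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


w = window_for_throughput(10e9, 0.100, 1500)      # about 83,333 segments
l = loss_rate_for_throughput(10e9, 0.100, 1500)   # about 2.1e-10
```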


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
Additive increase gives a slope of 1 as throughput increases.
Multiplicative decrease decreases throughput proportionally.

[Figure: Connection 2 throughput (vertical axis, up to R) plotted against Connection 1 throughput (horizontal axis, up to R), with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
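The convergence toward the equal-share line can be illustrated with a toy simulation. This is a sketch under simplifying assumptions that are not in the slides: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Simulate two AIMD flows sharing one bottleneck link.

    Each round: if the combined rate exceeds capacity, both flows halve
    (multiplicative decrease); otherwise both add 1 (additive increase).
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2
        else:
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


# Start far from the fair share: flow 2 has eight times the rate of flow 1.
x1, x2 = aimd_two_flows(10, 80, capacity=100, rounds=200)
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the gap shrinks toward zero and the rates approach an equal share.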


Fairness (more)

Fairness and UDP
Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control.
Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss.
Current research area: how to keep UDP from congesting the Internet.

Fairness and parallel TCP connections
Nothing prevents an app from opening parallel connections between 2 hosts. Web browsers do this.
Example: link of rate R supporting 9 connections
- new app asks for 1 TCP, gets rate R/10
- new app asks for 11 TCPs, gets about R/2 (exactly 11R/20)
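The arithmetic behind the parallel-connection example, as a short sketch. It assumes every TCP connection on the link gets an equal share of R, and the function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app opening new connections,
    assuming each connection on the link receives an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


one = app_share(9, 1)       # Fraction(1, 10), i.e. R/10
eleven = app_share(9, 11)   # Fraction(11, 20), slightly more than R/2
```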


TCP Latency Modeling

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start

Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)

Window size:
- First assume fixed congestion window, W segments
- Then dynamic window, modeling slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
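Both cases can be folded into one function by treating the bracketed term as a per-window stall time that is zero in the first case. The sketch below assumes K is the number of windows that cover the object, computed here as the ceiling of O/(WS); that formula for K is an assumption of this example, as are the function name and the sample values.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency for an object of O bits with a fixed window of W segments.

    S is the segment size in bits, R the link rate in bits/second, and
    RTT the round-trip time in seconds.
    """
    K = math.ceil(O / (W * S))                   # assumed: windows covering the object
    stall = max(0.0, S / R + RTT - W * S / R)    # idle time after each window
    return 2 * RTT + O / R + (K - 1) * stall


# Illustrative values: S/R = 0.1 s, RTT = 0.5 s, object of 8 segments.
small_window = fixed_window_latency(O=8000, S=1000, R=10000, RTT=0.5, W=2)  # case 2
large_window = fixed_window_latency(O=8000, S=1000, R=10000, RTT=0.5, W=8)  # case 1
```

With W = 2 the server stalls 0.4 s after each of the first three windows, so the latency is 2(0.5) + 0.8 + 3(0.4) = 3.0 s; with W = 8 there is no stall and the latency is 1.8 s.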


Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case:
WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

\[ \text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R} \]

where P is the number of times TCP idles at the server:

\[ P = \min\{Q,\; K - 1\} \]

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


                                                                                                                                                                                                                                TCP Latency Modeling Slow Start (2)

[Figure: client-server timing diagram. Events in order: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. Axes: time at client, time at server.]

Example
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times
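The example can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, and the values S = 1000 bits, R = 1000 bps and RTT = 2 s are assumptions chosen only so that O/S = 15 yields K = 4 and Q = 2 as in the example. It counts a stall after window k whenever S/R + RTT - 2^(k-1) S/R is non-negative, and compares the closed-form latency with a direct sum of the idle times.

```python
def windows_to_cover(O, S):
    """K: smallest k with (2^0 + ... + 2^(k-1)) * S >= O, i.e. (2^k - 1) * S >= O."""
    k = 0
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def infinite_object_idles(S, R, RTT):
    """Q: number of windows followed by a stall if the object never ended.

    Assumed convention: a stall is counted when the idle-time expression is >= 0.
    """
    k = 1
    while S / R + RTT - 2 ** (k - 1) * S / R >= 0:
        k += 1
    return k - 1


def slow_start_latency(O, S, R, RTT):
    """Closed form: 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R."""
    K = windows_to_cover(O, S)
    Q = infinite_object_idles(S, R, RTT)
    P = max(0, min(Q, K - 1))
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


def latency_by_summing(O, S, R, RTT):
    """Same delay, adding up the stall after each of the first K-1 windows."""
    K = windows_to_cover(O, S)
    idle = sum(max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K))
    return 2 * RTT + O / R + idle


# Assumed illustrative values (not from the slides): S/R = 1 s, RTT = 2 s.
S, R, RTT = 1000, 1000, 2.0
O = 15 * S  # O/S = 15 segments, as in the example
print(slow_start_latency(O, S, R, RTT))  # (K, Q, P, latency in seconds)
print(latency_by_summing(O, S, R, RTT))
```

With these assumed values the server stalls after the first and second windows only, so both functions should report the same delay.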


                                                                                                                                                                                                                                TCP Latency Modeling (3)

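Time from when server starts to send segment until server receives acknowledgement: $\frac{S}{R} + RTT$

Time to transmit the kth window: $2^{k-1}\frac{S}{R}$

Idle time after the kth window: $\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$

$$\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$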
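The step from the sum of idle times to the closed form on this slide relies on the geometric series. As a short sketch of that step, using the slide's own symbols and assuming each of the first P idle times is non-negative so that the $[\cdot]^{+}$ can be dropped:

```latex
\begin{align*}
\sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]
  &= P\left[RTT + \frac{S}{R}\right] - \frac{S}{R}\sum_{k=1}^{P} 2^{k-1} \\
  &= P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R},
  \qquad\text{since } \sum_{k=1}^{P} 2^{k-1} = 2^{P} - 1 .
\end{align*}
```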

[Figure: client-server timing diagram showing connection setup, object request, and windows of S/R, 2S/R, 4S/R and 8S/R separated by server idle periods. Axes: time at client, time at server.]


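TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar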
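A sketch of that similar calculation, under one assumed convention: the server is counted as stalling after window k whenever the idle-time expression S/R + RTT - 2^(k-1) S/R is non-negative. A stall of length zero adds nothing to the delay, so this boundary choice does not change the latency.

```latex
\begin{align*}
Q &= \max\left\{k : RTT + \frac{S}{R} - 2^{k-1}\frac{S}{R} \ge 0\right\} \\
  &= \max\left\{k : 2^{k-1} \le 1 + \frac{RTT}{S/R}\right\} \\
  &= \max\left\{k : k \le \log_2\!\left(1 + \frac{RTT}{S/R}\right) + 1\right\} \\
  &= \left\lfloor \log_2\!\left(1 + \frac{RTT}{S/R}\right) \right\rfloor + 1
\end{align*}
```

For instance, with an assumed RTT = 2 S/R this gives Q = floor(log2 3) + 1 = 2.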


HTTP Modeling

Assume Web page consists of
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections
- Suppose M/X integer
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
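The three response-time formulas can be compared side by side. The sketch below is illustrative only: the function name is hypothetical, the slow-start "sum of idle times" term is deliberately left out, and the sample values (1 Kbyte taken as 1000 bytes, R = 1 Mbps) are assumptions. RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5 are the parameters this chapter uses for its response-time charts.

```python
def http_response_times(O, R, RTT, M, X):
    """Response time in seconds for each HTTP variant, excluding idle times.

    O: object size in bits, R: link rate in bits/s, RTT: round-trip time in s,
    M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }


# Assumed conversion: 5 Kbytes = 5 * 1000 * 8 bits; assumed link rate: 1 Mbps.
times = http_response_times(O=5 * 1000 * 8, R=1_000_000, RTT=0.1, M=10, X=5)
for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
```

Because the idle-time term is omitted, the printed values are lower bounds on what the full model predicts; they only show how the RTT multipliers separate the three variants.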


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time

Persistent connections only give minor improvement over parallel connections


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis 0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays

Persistent connections now give important improvement, particularly in high delay-bandwidth networks

Chapter 3 Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next: leaving the network "edge" (application, transport layers) and moving into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary

                                                                                                                                                                                                                                  TCP throughput

What's the average throughput of TCP as a function of window size and RTT?

• Ignore slow start
• Let W be the window size when loss occurs
• When window is W, throughput is W/RTT
• Just after loss, window drops to W/2, throughput to W/(2·RTT)
• Average throughput: 0.75 W/RTT
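The 0.75 W/RTT figure can be checked with a small sketch. It assumes the idealized sawtooth described above, in which the rate ramps linearly from W/(2·RTT) to W/RTT between losses; the function name is hypothetical and used only for illustration.

```python
def average_sawtooth_throughput(w_max, rtt):
    """Average rate (segments/s), assuming a linear ramp from w_max/2 to w_max."""
    low = (w_max / 2) / rtt    # throughput just after a loss
    high = w_max / rtt         # throughput just before the next loss
    return (low + high) / 2    # the mean of a linear ramp is its midpoint


if __name__ == "__main__":
    # Illustrative values: W = 20 segments, RTT = 100 ms
    print(average_sawtooth_throughput(w_max=20, rtt=0.1))
```

The midpoint of W/2 and W is 0.75 W, which is where the factor in the slide comes from.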


                                                                                                                                                                                                                                  TCP Futures

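• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

• Achieving 10 Gbps therefore requires L = 2·10^-10. Wow!
• New versions of TCP needed for high speed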
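The two numbers in this example can be reproduced with a short sketch. It assumes the loss-rate relation throughput = 1.22·MSS/(RTT·√L) quoted on the slide; the function names are hypothetical.

```python
def required_window(rate_bps, rtt_s, mss_bytes):
    """In-flight segments needed to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to solve for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * rate_bps)) ** 2


if __name__ == "__main__":
    print(required_window(10e9, 0.1, 1500))      # about 83,333 segments
    print(required_loss_rate(10e9, 0.1, 1500))   # about 2e-10
```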


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
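The convergence argument can be illustrated with a minimal simulation. This is a sketch under simplifying assumptions that are not in the slides: both sessions have the same RTT, each adds one unit per round, and both see a loss (and halve) in the same round whenever their combined rate exceeds the capacity. The function name is hypothetical.

```python
def simulate_aimd(x1, x2, capacity, rounds):
    """Two flows on one link: +1 per round each, both halve when overloaded."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # loss: multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1    # congestion avoidance: additive increase
    return x1, x2


if __name__ == "__main__":
    # Start far from the fair share: 80 versus 10 on a link of capacity 100
    print(simulate_aimd(80, 10, capacity=100, rounds=500))
```

Additive increase leaves the gap between the two rates unchanged, while every halving cuts it in half, so the rates drift toward the equal-share line.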


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
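The arithmetic behind the parallel-connection example can be sketched as follows, assuming every TCP connection on the link receives an equal share of R. The function name is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R an app obtains with new_connections,
    assuming all connections on the link get an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


if __name__ == "__main__":
    print(app_share(9, 1))    # one new connection among 10
    print(app_share(9, 11))   # eleven new connections among 20
```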


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
   where K is the number of windows of W segments needed to cover the object
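The two cases can be folded into one function. This is a minimal sketch, assuming K is the object size divided by the window size WS, rounded up; the function name and the sample values are hypothetical.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (s) for an object of O bits, MSS of S bits, link rate R (bit/s),
    and a fixed window of W segments, with no loss."""
    K = math.ceil(O / (W * S))            # windows needed to cover the object
    stall = S / R + RTT - W * S / R       # idle time per window when positive
    if stall <= 0:
        return 2 * RTT + O / R            # case 1: ACKs return in time
    return 2 * RTT + O / R + (K - 1) * stall   # case 2: server stalls


if __name__ == "__main__":
    print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=1, W=4))  # case 1
    print(fixed_window_latency(O=8000, S=1000, R=1000, RTT=5, W=2))  # case 2
```

In case 2 the server stalls after every window except the last, which is why the stall term is multiplied by K-1.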


                                                                                                                                                                                                                                  Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


                                                                                                                                                                                                                                  Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Server waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                                                                                                                                                  TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object
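The quantities K, Q, and P can be computed directly from their definitions. The sketch below is one way to do that, assuming the k-th slow-start window holds 2^(k-1) segments and that the server idles after window k whenever S/R + RTT exceeds that window's transmission time; the function name and sample values are hypothetical.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for the slow-start model with no loss."""
    segments = math.ceil(O / S)
    # K: smallest k such that windows 1..k (2^k - 1 segments) cover the object
    K = 1
    while 2 ** K - 1 < segments:
        K += 1
    # Q: count the windows after which the server would idle
    # if the object were infinitely large
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


if __name__ == "__main__":
    # O/S = 15 segments, S/R = 1 s, RTT = 2 s (illustrative values)
    print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2))
```

With these sample values the sketch yields K = 4, Q = 2, and P = 2, so the server idles twice before it can send continuously.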


                                                                                                                                                                                                                                  TCP Latency Modeling Slow Start (2)

[Figure: timing diagram with time at client and time at server. Events: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered. One RTT is marked.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times


                                                                                                                                                                                                                                  TCP Latency Modeling (3)

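$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}$$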

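[Figure: timing diagram of TCP slow start, with "time at client" and "time at server" as the two time lines. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]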
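The per-window quantities can be checked numerically. The following is a minimal Python sketch, assuming a fixed segment size S (bits), a link rate R (bits/sec), and that the server idles after every window except the last one. The function names and the example numbers are illustrative and do not come from the slides.

```python
def window_times(S, R, rtt, k):
    """Return (transmit_time, idle_time) in seconds for the k-th window, k >= 1."""
    transmit = 2 ** (k - 1) * S / R           # the k-th window holds 2^(k-1) segments
    idle = max(0.0, S / R + rtt - transmit)   # [x]^+ is read here as max(x, 0)
    return transmit, idle


def slow_start_latency(O, S, R, rtt):
    """2*RTT + O/R + the idle time after each window except the last."""
    # Count the windows (1, 2, 4, ... segments) needed to cover O bits.
    K, covered = 0, 0
    while covered < O:
        covered += 2 ** K * S
        K += 1
    idle_total = sum(window_times(S, R, rtt, k)[1] for k in range(1, K))
    return 2 * rtt + O / R + idle_total


if __name__ == "__main__":
    # Assumed example: 1000-byte segments, 1 Mbps link, 100 ms RTT, 15 segments.
    print(slow_start_latency(O=120_000, S=8_000, R=1e6, rtt=0.1))
```

With these assumed numbers the object needs four windows, the server idles for 0.1 s, 0.092 s, and 0.076 s after the first three, and the total latency comes to roughly 0.588 s.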


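TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.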
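As a sanity check on the result K = ⌈log2(O/S + 1)⌉, the sketch below computes K twice: once by adding up window sizes 1, 2, 4, ... segments until the object is covered, and once with the closed form. It is a minimal Python illustration; the function names are made up for this example, and O and S are assumed to be given in the same unit (bits).

```python
import math


def windows_by_counting(O, S):
    """Smallest k such that (2^0 + 2^1 + ... + 2^(k-1)) * S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S   # next window carries 2^k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)); floating point, so intended for moderate O/S."""
    return math.ceil(math.log2(O / S + 1))


if __name__ == "__main__":
    for segments in (1, 7, 15, 16, 100):
        O, S = segments * 1000, 1000
        print(segments, windows_by_counting(O, S), windows_closed_form(O, S))
```

An object of 15 segments fits exactly in four windows (1 + 2 + 4 + 8), while 16 segments need a fifth window; both functions agree on these cases.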


HTTP Modeling

Assume a Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
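The three response-time formulas can be compared side by side. The following is a minimal Python sketch, not a full model: the function name is hypothetical, the slow-start idle time is passed in as a single number (default 0) instead of being computed per connection, and 1 Kbyte is assumed to be 1000 bytes in the example.

```python
def http_response_times(O, R, rtt, M, X, idle=0.0):
    """Response times in seconds for the three HTTP models.

    O: object size in bits, R: link rate in bits/sec, rtt: round-trip time in sec,
    M: number of images, X: number of parallel connections,
    idle: sum of idle times (an assumption here: the same value for all three models).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R   # base page plus M images, each O bits
    return {
        "non_persistent": transmit + (M + 1) * 2 * rtt + idle,
        "persistent": transmit + 3 * rtt + idle,
        "parallel_non_persistent": transmit + (M // X + 1) * 2 * rtt + idle,
    }


if __name__ == "__main__":
    # Assumed example: O = 5 Kbytes = 40,000 bits, R = 1 Mbps, RTT = 100 msec.
    for name, seconds in http_response_times(40_000, 1e6, 0.1, M=10, X=5).items():
        print(f"{name}: {seconds:.2f} s")
```

With idle time ignored, these assumed inputs give about 2.64 s for non-persistent, 0.74 s for persistent, and 1.04 s for parallel non-persistent HTTP: the transmission term is identical in all three, so the difference comes entirely from the RTT term.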


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (axis from 0 to 20) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (axis from 0 to 70) for non-persistent, persistent, and parallel non-persistent HTTP at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.


Chapter 3 Summary

• principles behind transport layer services: multiplexing/demultiplexing, reliable data transfer, flow control, congestion control
• instantiation and implementation in the Internet: UDP, TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

                                                                                                                                                                                                                                  • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • Transport services and protocols
                                                                                                                                                                                                                                  • Transport vs network layer
                                                                                                                                                                                                                                  • Transport-layer protocols
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • Multiplexingdemultiplexing
                                                                                                                                                                                                                                  • Multiplexingdemultiplexing
                                                                                                                                                                                                                                  • How demultiplexing works
                                                                                                                                                                                                                                  • Connectionless demultiplexing
                                                                                                                                                                                                                                  • Connectionless demux (cont)
                                                                                                                                                                                                                                  • Connection-oriented demux
                                                                                                                                                                                                                                  • Connection-oriented demux (cont)
                                                                                                                                                                                                                                  • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                                                                                  • UDP more
                                                                                                                                                                                                                                  • UDP checksum
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • Principles of Reliable data transfer
                                                                                                                                                                                                                                  • Reliable data transfer getting started
                                                                                                                                                                                                                                  • Reliable data transfer getting started
                                                                                                                                                                                                                                  • Incremental Improvements
                                                                                                                                                                                                                                  • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                                                                                  • Rdt20 channel with bit errors
                                                                                                                                                                                                                                  • rdt20 FSM specification
                                                                                                                                                                                                                                  • rdt20 operation with no errors
                                                                                                                                                                                                                                  • rdt20 error scenario
                                                                                                                                                                                                                                  • rdt20 has a fatal flaw
                                                                                                                                                                                                                                  • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                                                                  • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                                                                  • rdt21 discussion
                                                                                                                                                                                                                                  • rdt22 a NAK-free protocol
                                                                                                                                                                                                                                  • rdt22 sender receiver fragments
                                                                                                                                                                                                                                  • rdt30 channels with errors and loss
                                                                                                                                                                                                                                  • rdt30 sender
                                                                                                                                                                                                                                  • rdt30 in action
                                                                                                                                                                                                                                  • rdt30 in action
                                                                                                                                                                                                                                  • Performance of rdt30
                                                                                                                                                                                                                                  • rdt30 stop-and-wait operation
                                                                                                                                                                                                                                  • Pipelined protocols
                                                                                                                                                                                                                                  • Pipelined protocols
                                                                                                                                                                                                                                  • Pipelining increased utilization
                                                                                                                                                                                                                                  • Go-Back-N
                                                                                                                                                                                                                                  • GBN Sender
                                                                                                                                                                                                                                  • GBN sender extended FSM
                                                                                                                                                                                                                                  • GBN receiver extended FSM
                                                                                                                                                                                                                                  • More on receiver
                                                                                                                                                                                                                                  • GBN in action
                                                                                                                                                                                                                                  • Selective Repeat
                                                                                                                                                                                                                                  • Selective repeat: sender, receiver windows
                                                                                                                                                                                                                                  • Selective repeat
                                                                                                                                                                                                                                  • Selective repeat in action
                                                                                                                                                                                                                                  • Selective repeat dilemma
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                                                                  • More TCP Details
                                                                                                                                                                                                                                  • Even More TCP Details
                                                                                                                                                                                                                                  • TCP segment structure
                                                                                                                                                                                                                                  • TCP seq. #'s and ACKs
                                                                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                  • Example RTT estimation
                                                                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • TCP reliable data transfer
                                                                                                                                                                                                                                  • TCP sender events
                                                                                                                                                                                                                                  • TCP sender (simplified)
                                                                                                                                                                                                                                  • TCP retransmission scenarios
                                                                                                                                                                                                                                  • TCP retransmission scenarios (more)
                                                                                                                                                                                                                                  • TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                                                                                  • More on Sender Policies
                                                                                                                                                                                                                                  • Fast Retransmit
                                                                                                                                                                                                                                  • Fast retransmit algorithm
                                                                                                                                                                                                                                  • TCP GBN or Selective Repeat
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • TCP Flow Control
                                                                                                                                                                                                                                  • TCP Flow Control
                                                                                                                                                                                                                                  • TCP segment structure
                                                                                                                                                                                                                                  • TCP Flow control how it works
                                                                                                                                                                                                                                  • Technical Issue
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • TCP Connection Management
                                                                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                                                                  • A few special cases
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • Principles of Congestion Control
                                                                                                                                                                                                                                  • Causes/costs of congestion: scenario 1
                                                                                                                                                                                                                                  • Causes/costs of congestion: scenario 2
                                                                                                                                                                                                                                  • Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                                  • Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                                  • Approaches towards congestion control
                                                                                                                                                                                                                                  • Case study ATM ABR congestion control
                                                                                                                                                                                                                                  • Case study ATM ABR congestion control
                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                  • TCP Congestion Control
                                                                                                                                                                                                                                  • TCP AIMD
                                                                                                                                                                                                                                  • TCP Slow Start
                                                                                                                                                                                                                                  • TCP Slow Start (more)
                                                                                                                                                                                                                                  • Summary TCP Congestion Control
                                                                                                                                                                                                                                  • The Big Picture
                                                                                                                                                                                                                                  • TCP sender congestion control
                                                                                                                                                                                                                                  • TCP throughput
                                                                                                                                                                                                                                  • TCP Futures
                                                                                                                                                                                                                                  • TCP Fairness
                                                                                                                                                                                                                                  • Why is TCP fair?
                                                                                                                                                                                                                                  • Fairness (more)
                                                                                                                                                                                                                                  • TCP Latency Modeling
                                                                                                                                                                                                                                  • Fixed Congestion Window (W)
                                                                                                                                                                                                                                  • Fixed congestion window (1)
                                                                                                                                                                                                                                  • Fixed congestion window (2)
                                                                                                                                                                                                                                  • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                  • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                  • TCP Latency Modeling (3)
                                                                                                                                                                                                                                  • TCP Latency Modeling (4)
                                                                                                                                                                                                                                  • HTTP Modeling
                                                                                                                                                                                                                                  • Chapter 3 Summary


                                                                                                                                                                                                                                    TCP Futures

Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput

Requires window size W = 83,333 in-flight segments

Throughput in terms of loss rate L:

$$\text{throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

For the example this gives $L = 2 \cdot 10^{-10}$. Wow!

New versions of TCP for high-speed needed
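The two numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes the loss-rate relation throughput = 1.22 · MSS / (RTT · √L) quoted on the slide and simply inverts it; the function names are illustrative and not part of the course material.

```python
def required_window(throughput_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Invert throughput = 1.22 * MSS / (RTT * sqrt(L)) to solve for L."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


# Slide example: 1500-byte segments, 100 ms RTT, 10 Gbps target.
W = required_window(10e9, 0.100, 1500)
L = required_loss_rate(10e9, 0.100, 1500)
print(f"W = {W:.0f} segments, L = {L:.1e}")  # W = 83333 segments, L = 2.1e-10
```

A loss rate near 2 in 10 billion segments is far below what real paths deliver, which is the point of the slide.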


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:

Additive increase gives a slope of 1 as throughput increases

Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
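The convergence argument can be tried numerically. This is a minimal sketch under simplifying assumptions that are not stated on the slide: both sessions see loss at the same instant (whenever their combined rate exceeds the capacity), rates grow by one unit per step, and the starting rates are arbitrary. The function name `simulate_aimd` is hypothetical.

```python
def simulate_aimd(x1, x2, capacity, steps):
    """Two flows sharing one bottleneck under synchronized AIMD.

    Assumption: both flows detect loss together, whenever their
    combined rate exceeds the link capacity.
    """
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1   # additive increase (slope 1)
    return x1, x2


# Start far from the fair share: 10 versus 70 on a link of capacity 100.
x1, x2 = simulate_aimd(10.0, 70.0, capacity=100.0, steps=500)
print(f"gap between the two rates after 500 steps: {abs(x1 - x2):.6f}")
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the pair drifts toward the equal bandwidth share line.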


Fairness (more)

Fairness and UDP

Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control

Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss

Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections

Nothing prevents an app from opening parallel connections between 2 hosts

Web browsers do this

Example: link of rate R supporting 9 connections

new app asks for 1 TCP, gets rate R/10

new app asks for 11 TCPs, gets more than R/2 (11 of the 20 connections)
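The parallel-connection example is a one-line calculation once it is assumed, as the fairness goal does, that every TCP connection on the link receives an equal share. The helper `app_share` below is a hypothetical name used only for this sketch.

```python
from fractions import Fraction


def app_share(existing, opened):
    """Fraction of link rate R won by an app that opens `opened` connections
    while `existing` connections from other apps are already active,
    assuming every TCP connection gets an equal share of the link."""
    return Fraction(opened, existing + opened)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20
```

Per-connection fairness therefore does not imply per-application fairness: the share grows with the number of connections an app opens.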


TCP Latency Modeling

Notation, assumptions:

Assume one link between client and server, of rate R

S: MSS (bits)

O: object size (bits)

No retransmissions (no loss, no corruption)

Window size:

First assume a fixed congestion window of W segments

Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:

TCP connection establishment

data transmission delay

slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
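The two cases can be checked numerically. The sketch below is a minimal illustration and not part of the original slides. It assumes O is the object size in bits, S the segment size in bits, R the link rate in bits per second, W the window size in segments, and that K = ceil(O / (W·S)) is the number of windows that cover the object. The function name and the sample values are hypothetical.

```python
import math


def fixed_window_latency(O, S, R, W, RTT):
    """Latency in seconds for an O-bit object sent with a fixed window of W segments.

    Assumed units: O and S in bits, R in bits/second, RTT in seconds.
    """
    base = 2 * RTT + O / R  # connection setup + request, plus transmission time
    if W * S / R >= RTT + S / R:
        # Case 1: the first ACK is back before the window is exhausted, no stalls.
        return base
    # Case 2: the server stalls after each of the first K-1 windows.
    K = math.ceil(O / (W * S))  # assumed definition of K for a fixed window
    stall = S / R + RTT - W * S / R
    return base + (K - 1) * stall


if __name__ == "__main__":
    # Hypothetical values: 8000-bit object, 1000-bit segments, 1000 bits/second link.
    print(fixed_window_latency(O=8000, S=1000, R=1000, W=4, RTT=1))  # case 1
    print(fixed_window_latency(O=8000, S=1000, R=1000, W=2, RTT=3))  # case 2
```

When WS/R equals RTT + S/R exactly, the stall term is zero, so both branches give the same value.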


                                                                                                                                                                                                                                    Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


                                                                                                                                                                                                                                    Fixed congestion window (2)

Second case: WS/R < RTT + S/R: server must wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object
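A direct transcription of this formula may help when plugging in numbers. This is a minimal sketch, not part of the original slides. It takes K and Q as inputs rather than deriving them, and assumes O and S are in bits, R in bits per second, and RTT in seconds. The function name and the sample values are hypothetical.

```python
def slow_start_latency(O, S, R, RTT, K, Q):
    """Closed-form slow-start latency (no threshold, no loss events).

    K: number of windows that cover the object (supplied by the caller).
    Q: number of times the server would idle for an infinitely large object.
    """
    P = min(Q, K - 1)  # number of times the server actually idles
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    # Hypothetical values chosen so that O/S = 15, K = 4 and Q = 2:
    # S/R = 1 second and RTT = 2 seconds.
    print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2, K=4, Q=2))
```

With these sample values the three contributions are 2RTT = 4 s, O/R = 15 s, and 3 s of idle time.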


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time axes at client and at server, showing RTT, initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.


                                                                                                                                                                                                                                    TCP Latency Modeling (3)

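$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgment

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$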

[Figure: slow-start timing diagram with time axes at client and at server, showing RTT, initiate TCP connection, request object, windows of S/R, 2S/R, 4S/R and 8S/R, complete transmission, object delivered]
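The derivation can be checked by summing the per-window idle times directly and comparing the total with the closed form. The sketch below is illustrative only and not part of the original slides. It assumes the server can idle only after windows 1 through K-1, and it finds Q by counting the windows whose idle time is strictly positive, which is one reading of the definition of Q. All names and sample values are hypothetical.

```python
import math


def idle_time(k, S, R, RTT):
    """Idle time after the k-th window (k = 1, 2, ...), clipped at zero."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def latency_by_summation(O, S, R, RTT, K):
    """2RTT + O/R plus the idle times after windows 1..K-1."""
    total_idle = sum(idle_time(k, S, R, RTT) for k in range(1, K))
    return 2 * RTT + O / R + total_idle


def latency_closed_form(O, S, R, RTT, K):
    """Closed form using P = min(Q, K-1)."""
    # Count how many windows would be followed by idling for an infinite object.
    Q = 0
    while idle_time(Q + 1, S, R, RTT) > 0:
        Q += 1
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    # Hypothetical values: O/S = 15 segments, S/R = 1 second, K = 4 windows.
    for rtt in (0.5, 2.0, 10.0):
        a = latency_by_summation(15000, 1000, 1000, rtt, K=4)
        b = latency_closed_form(15000, 1000, 1000, rtt, K=4)
        print(rtt, a, b, math.isclose(a, b))
```

The loop that finds Q terminates because the transmission time of the kth window doubles with every k and eventually exceeds S/R + RTT.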


TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

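$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$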

Calculation of Q, the number of idles for an infinite-size object, is similar.
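As a quick check on the calculation of K, the closed form can be compared with a direct count of doubling windows. This is a minimal sketch, not part of the original slides. It assumes O and S are positive integers in the same unit (bits), and the function names are hypothetical.

```python
import math


def num_windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


def num_windows_iterative(O, S):
    """Smallest k such that windows of 1, 2, 4, ..., 2^(k-1) segments cover O bits."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S  # the next window carries 2^k segments
        k += 1
    return k


if __name__ == "__main__":
    # Hypothetical segment size of 1000 bits; object sizes from 1 to 40 segments.
    for segments in (1, 2, 3, 7, 8, 15, 16, 40):
        O = segments * 1000
        print(segments, num_windows_closed_form(O, 1000), num_windows_iterative(O, 1000))
```

The iterative version uses only integer arithmetic, so it is a safer reference when O/S + 1 is very close to a power of two and floating-point rounding in the logarithm could matter.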


HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X is an integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis 0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent, and parallel non-persistent HTTP]

                                                                                                                                                                                                                                    For low bandwidth connection amp response time dominated by transmission timePersistent connections only give minor improvement over parallelconnections


HTTP response time (in seconds): RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (axis 0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent connections at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in high delay × bandwidth networks.
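The trend in the two charts can be roughly reproduced with a simplified latency model. The sketch below is an illustration only: it assumes the base page and each of the M images are all O bytes, that O = 5 Kbytes means 5000 bytes, and it ignores the slow-start idle time, so its numbers are lower than the charted values. The function name response_times and its parameter names are hypothetical.

```python
def response_times(rate_bps, rtt_s, obj_bytes=5_000, m=10, x=5):
    """Return (non_persistent, persistent, parallel) response times in seconds.

    Simplified model (an assumption, not the full analysis behind the charts):
    slow-start idle time is ignored; only transmission time and the
    round trips for connection setup and requests are counted.
    """
    # Time to push the base page plus M objects onto the link.
    transmit = (m + 1) * obj_bytes * 8 / rate_bps
    # Non-persistent: M+1 connections in series, 2 RTT each (setup + request).
    non_persistent = transmit + (m + 1) * 2 * rtt_s
    # Persistent with pipelining: 2 RTT for the base page, 1 RTT for all images.
    persistent = transmit + 3 * rtt_s
    # Parallel non-persistent: 1 connection for the base page,
    # then M/X rounds of X parallel connections, 2 RTT per round.
    parallel = transmit + (m / x + 1) * 2 * rtt_s
    return non_persistent, persistent, parallel


if __name__ == "__main__":
    for rtt in (0.1, 1.0):
        print(f"RTT = {rtt} s")
        for rate in (28e3, 100e3, 1e6, 10e6):
            non_p, pers, par = response_times(rate, rtt)
            print(f"  {rate / 1e3:>7.0f} Kbps: non-persistent {non_p:6.2f}, "
                  f"persistent {pers:6.2f}, parallel {par:6.2f}")
```

With RTT = 100 msec at 28 Kbps, the transmission term (about 15.7 seconds) dwarfs the round-trip terms, so the three schemes differ little. With RTT = 1 sec, the round-trip terms dominate at high bandwidth and the persistent connection comes out well ahead.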


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next: leaving the network "edge" (application, transport layers) and moving into the network "core"

• Chapter 3: Transport Layer, last revised 160305
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• TCP Connection Management (cont)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
                                                                                                                                                                                                                                    • TCP throughput
                                                                                                                                                                                                                                    • TCP Futures
                                                                                                                                                                                                                                    • TCP Fairness
                                                                                                                                                                                                                                    • Why is TCP fair
                                                                                                                                                                                                                                    • Fairness (more)
                                                                                                                                                                                                                                    • TCP Latency Modeling
                                                                                                                                                                                                                                    • Fixed Congestion Window (W)
                                                                                                                                                                                                                                    • Fixed congestion window (1)
                                                                                                                                                                                                                                    • Fixed congestion window (2)
                                                                                                                                                                                                                                    • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                    • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                    • TCP Latency Modeling (3)
                                                                                                                                                                                                                                    • TCP Latency Modeling (4)
                                                                                                                                                                                                                                    • HTTP Modeling
                                                                                                                                                                                                                                    • Chapter 3 Summary


TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

[Figure: TCP connection 1 and TCP connection 2 pass through the same bottleneck router of capacity R]


Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1 as throughput increases
• Multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with the "equal bandwidth share" line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
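The convergence argument above can be checked with a small simulation. This is only a sketch under simplifying assumptions that are not in the slides: both sessions see every loss at the same moment (synchronized loss), rates are updated once per round, and the function name `aimd_two_flows` and its parameters are made up for illustration.

```python
def aimd_two_flows(x1, x2, capacity, rounds, increase=1.0, decrease=0.5):
    """Simulate two AIMD flows sharing one bottleneck link.

    Assumption (not from the slides): both flows detect loss in the
    same round whenever their combined rate exceeds the capacity.
    """
    history = [(x1, x2)]
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows cut their rate multiplicatively.
            x1 *= decrease
            x2 *= decrease
        else:
            # Congestion avoidance: both flows add the same increment,
            # so the point moves along a line of slope 1.
            x1 += increase
            x2 += increase
        history.append((x1, x2))
    return history


history = aimd_two_flows(x1=5.0, x2=80.0, capacity=100.0, rounds=500)
gap_start = abs(history[0][0] - history[0][1])
gap_end = abs(history[-1][0] - history[-1][1])
print(f"initial gap: {gap_start}, final gap: {gap_end:.6f}")
```

Additive increase leaves the difference between the two rates unchanged, while every multiplicative decrease halves it, so the gap shrinks toward zero and the rates approach the equal bandwidth share line.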


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
- new app asks for 1 TCP connection, gets rate R/10
- new app asks for 11 TCP connections, gets about R/2 (11 of the 20 connections)
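The arithmetic in the parallel-connection example can be written out as follows. It is a sketch that assumes TCP splits the link perfectly evenly across connections; the helper name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by the new app.

    Assumption: every TCP connection gets an equal share of the link.
    """
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

With 11 parallel connections the new app holds 11 of 20 equal shares, slightly more than R/2.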


TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases:

1. $WS/R > RTT + S/R$: ACK for first segment in window returns before window's worth of data sent
Latency $= 2\,RTT + O/R$

2. $WS/R < RTT + S/R$: ACK for first segment in window returns after window's worth of data sent
Latency $= 2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$
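The two cases can be folded into one small function. This is a sketch, not part of the slides: the function name and the sample numbers are made up, and it assumes K is the number of windows needed to cover the object, computed as the ceiling of O/(W·S).

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    """
    # Assumption: K = number of windows that cover the object.
    K = math.ceil(O / (W * S))
    # Time the server waits after each window before the first ACK returns.
    stall = S / R + RTT - W * S / R
    base = 2 * RTT + O / R
    if stall <= 0:
        # Case 1: the ACK arrives before the window is exhausted.
        return base
    # Case 2: the server stalls after each of the first K-1 windows.
    return base + (K - 1) * stall


# Hypothetical numbers: 100 kbit object, 10 kbit segments, 1 Mbit/s, 100 ms RTT.
small_window = fixed_window_latency(O=100_000, S=10_000, R=1_000_000, RTT=0.1, W=4)
large_window = fixed_window_latency(O=100_000, S=10_000, R=1_000_000, RTT=0.1, W=20)
print(f"W=4: {small_window:.2f} s, W=20: {large_window:.2f} s")
```

With W = 4 the server stalls twice (K = 3), adding 2 × 0.07 s to the 0.30 s baseline; with W = 20 the window is large enough that no stall occurs.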


Fixed congestion window (1)

First case: $WS/R > RTT + S/R$. ACK for first segment in window returns before window's worth of data sent.

latency $= 2\,RTT + O/R$


Fixed congestion window (2)

Second case: $WS/R < RTT + S/R$. Wait for ACK after sending window's worth of data.

latency $= 2\,RTT + O/R + (K-1)\,[S/R + RTT - WS/R]$


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

[Figure: timing diagram with time at client and time at server: initiate TCP connection, request object (each one RTT), first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered]
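The example can be reproduced numerically. The sketch below is an illustration under assumptions not stated on the slide: the function name is invented, K and Q are found by simple loops over the window sizes 1, 2, 4, ... segments, and the values of S, R and RTT are chosen only so that Q comes out as 2.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start latency model.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s). No threshold and no loss are assumed.
    """
    segments = math.ceil(O / S)

    # K: number of windows (1, 2, 4, ... segments) that cover the object.
    K = 0
    covered = 0
    while covered < segments:
        covered += 2 ** K
        K += 1

    # Q: number of windows after which the server would idle if the
    # object were infinite, i.e. while S/R + RTT exceeds the time to
    # transmit the window.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency


# Hypothetical numbers: S/R = 10 ms, RTT = 20 ms, object of 15 segments.
K, Q, P, latency = slow_start_latency(O=15 * 8000, S=8000, R=800_000, RTT=0.02)
print(K, Q, P, round(latency, 4))
```

With these numbers the server idles for 20 ms after the first window and 10 ms after the second, which is the 30 ms added on top of 2 RTT + O/R.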


                                                                                                                                                                                                                                      TCP Latency Modeling (3)

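$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$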

                                                                                                                                                                                                                                      =

                                                                                                                                                                                                                                      sum

                                                                                                                                                                                                                                      sum

                                                                                                                                                                                                                                      th window after the timeidle 2 1 kRSRTT

                                                                                                                                                                                                                                      RS k =⎥⎦

                                                                                                                                                                                                                                      ⎤⎢⎣⎡ minus+

                                                                                                                                                                                                                                      +minus

                                                                                                                                                                                                                                      window kth the transmit totime2 1 =minus

                                                                                                                                                                                                                                      RSk

                                                                                                                                                                                                                                      RTT

                                                                                                                                                                                                                                      initiate TCPconnection

                                                                                                                                                                                                                                      requestobject

                                                                                                                                                                                                                                      first window= SR

                                                                                                                                                                                                                                      second window= 2SR

                                                                                                                                                                                                                                      third window= 4SR

                                                                                                                                                                                                                                      fourth window= 8SR

                                                                                                                                                                                                                                      completetransmissionobject

                                                                                                                                                                                                                                      delivered

                                                                                                                                                                                                                                      time atclient

                                                                                                                                                                                                                                      time atserver
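As a quick numerical check of the latency expression, the sketch below evaluates the delay both by summing the idle periods one by one and by the closed form, and confirms the two agree. The function names and the example values (O = 40,000 bits, R = 100 kbps, S = 4,288 bits, RTT = 100 ms, P = 2) are illustrative assumptions, not values from the slides; P is taken as given here.

```python
def delay_by_sum(O, R, S, RTT, P):
    """Latency as 2*RTT + O/R plus the P idle periods, summed one by one.

    O: object size (bits), R: link rate (bits/s), S: segment size (bits),
    RTT: round-trip time (s), P: number of idle periods (assumed given).
    """
    idle = sum(S / R + RTT - 2 ** (k - 1) * S / R for k in range(1, P + 1))
    return 2 * RTT + O / R + idle


def delay_closed_form(O, R, S, RTT, P):
    """Same latency, using sum_{k=1..P} 2^(k-1) = 2^P - 1."""
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Hypothetical example values, chosen only for illustration.
a = delay_by_sum(40_000, 100_000, 4_288, 0.1, 2)
b = delay_closed_form(40_000, 100_000, 4_288, 0.1, 2)
print(a, b)  # the two values agree up to floating-point rounding
```

Keeping both forms side by side makes it easy to see that the closed form is just the geometric sum collapsed.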


TCP Latency Modeling (4)
Recall K = number of windows that cover object

How do we calculate K?

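$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$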

Calculation of Q, the number of idles for an infinite-size object, is similar.
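The definition of K can be checked directly: a minimal sketch, assuming O and S are given in the same units, that finds K by accumulating doubling windows and compares it with the ceiling-of-logarithm form. The function names and example sizes are hypothetical.

```python
import math


def windows_by_search(O, S):
    """Smallest k such that S * (2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    k = 0
    segments_sent = 0   # total segments covered by the first k windows
    window = 1          # size of the next window, in segments
    while segments_sent * S < O:
        segments_sent += window
        window *= 2
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)); relies on floating-point log2."""
    return math.ceil(math.log2(O / S + 1))


# Hypothetical example: a 40,000-bit object and 4,288-bit segments.
print(windows_by_search(40_000, 4_288), windows_closed_form(40_000, 4_288))
```

The search version uses only integer arithmetic on the window sizes, so it is the safer choice if rounding in the logarithm is a concern for very large objects.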


HTTP Modeling
Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
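The three response-time expressions can be compared side by side with a short sketch. The function names are hypothetical, the idle-time sum is passed in as a parameter (defaulting to zero) rather than modeled, and the example values (O = 40,000 bits, R = 1 Mbps, RTT = 100 ms, M = 10, X = 5) are illustrative assumptions.

```python
def nonpersistent(M, O, R, RTT, idle=0.0):
    """M+1 connections in series: each costs 2 RTT plus O/R of transmission."""
    return (M + 1) * O / R + (M + 1) * 2 * RTT + idle


def persistent(M, O, R, RTT, idle=0.0):
    """2 RTT for the base file, then 1 RTT to request and receive all images."""
    return (M + 1) * O / R + 3 * RTT + idle


def parallel_nonpersistent(M, X, O, R, RTT, idle=0.0):
    """1 connection for the base file, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("this model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT + idle


# Hypothetical example values; idle times are left at zero.
M, X, O, R, RTT = 10, 5, 40_000, 1_000_000, 0.1
print(nonpersistent(M, O, R, RTT))
print(persistent(M, O, R, RTT))
print(parallel_nonpersistent(M, X, O, R, RTT))
```

All three share the same transmission term (M+1)O/R, so the differences come entirely from the RTT terms and the idle times.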


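HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (vertical axis, 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent HTTP.]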

For low bandwidth, connection and response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


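HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (vertical axis, 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps, comparing non-persistent, persistent, and parallel non-persistent HTTP.]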

For larger RTT, response time is dominated by TCP establishment and slow start delays.
Persistent connections now give an important improvement, particularly in high delay-bandwidth networks.


Chapter 3 Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control
instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3: Transport Layer, last revised 160305
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
                                                                                                                                                                                                                                      • rdt30 sender
                                                                                                                                                                                                                                      • rdt30 in action
                                                                                                                                                                                                                                      • rdt30 in action
                                                                                                                                                                                                                                      • Performance of rdt30
                                                                                                                                                                                                                                      • rdt30 stop-and-wait operation
                                                                                                                                                                                                                                      • Pipelined protocols
                                                                                                                                                                                                                                      • Pipelined protocols
                                                                                                                                                                                                                                      • Pipelining increased utilization
                                                                                                                                                                                                                                      • Go-Back-N
                                                                                                                                                                                                                                      • GBN Sender
                                                                                                                                                                                                                                      • GBN sender extended FSM
                                                                                                                                                                                                                                      • GBN receiver extended FSM
                                                                                                                                                                                                                                      • More on receiver
                                                                                                                                                                                                                                      • GBN inaction
                                                                                                                                                                                                                                      • Selective Repeat
                                                                                                                                                                                                                                      • Selective repeat sender receiver windows
                                                                                                                                                                                                                                      • Selective repeat
                                                                                                                                                                                                                                      • Selective repeat in action
                                                                                                                                                                                                                                      • Selective repeat dilemma
                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                      • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                                                                      • More TCP Details
                                                                                                                                                                                                                                      • Even More TCP Details
                                                                                                                                                                                                                                      • TCP segment structure
                                                                                                                                                                                                                                      • TCP seq rsquos and ACKs
                                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                      • Example RTT estimation
                                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                      • TCP reliable data transfer
                                                                                                                                                                                                                                      • TCP sender events
                                                                                                                                                                                                                                      • TCP sender(simplified)
                                                                                                                                                                                                                                      • TCP retransmission scenarios
                                                                                                                                                                                                                                      • TCP retransmission scenarios (more)
                                                                                                                                                                                                                                      • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                                                      • More on Sender Policies
                                                                                                                                                                                                                                      • Fast Retransmit
                                                                                                                                                                                                                                      • Fast retransmit algorithm
                                                                                                                                                                                                                                      • TCP GBN or Selective Repeat
                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                      • TCP Flow Control
                                                                                                                                                                                                                                      • TCP Flow Control
                                                                                                                                                                                                                                      • TCP segment structure
                                                                                                                                                                                                                                      • TCP Flow control how it works
                                                                                                                                                                                                                                      • Technical Issue
                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                      • TCP Connection Management
                                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                                      • A few special cases
                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                      • Principles of Congestion Control
                                                                                                                                                                                                                                      • Causescosts of congestion scenario 1
                                                                                                                                                                                                                                      • Causescosts of congestion scenario 2
                                                                                                                                                                                                                                      • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                      • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                      • Approaches towards congestion control
                                                                                                                                                                                                                                      • Case study ATM ABR congestion control
                                                                                                                                                                                                                                      • Case study ATM ABR congestion control
                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                      • TCP Congestion Control
                                                                                                                                                                                                                                      • TCP AIMD
                                                                                                                                                                                                                                      • TCP Slow Start
                                                                                                                                                                                                                                      • TCP Slow Start (more)
                                                                                                                                                                                                                                      • Summary TCP Congestion Control
                                                                                                                                                                                                                                      • The Big Picture
                                                                                                                                                                                                                                      • TCP sender congestion control
                                                                                                                                                                                                                                      • TCP throughput
                                                                                                                                                                                                                                      • TCP Futures
                                                                                                                                                                                                                                      • TCP Fairness
                                                                                                                                                                                                                                      • Why is TCP fair
                                                                                                                                                                                                                                      • Fairness (more)
                                                                                                                                                                                                                                      • TCP Latency Modeling
                                                                                                                                                                                                                                      • Fixed Congestion Window (W)
                                                                                                                                                                                                                                      • Fixed congestion window (1)
                                                                                                                                                                                                                                      • Fixed congestion window (2)
                                                                                                                                                                                                                                      • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                      • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                      • TCP Latency Modeling (3)
                                                                                                                                                                                                                                      • TCP Latency Modeling (4)
                                                                                                                                                                                                                                      • HTTP Modeling
                                                                                                                                                                                                                                      • Chapter 3 Summary


Why is TCP fair?

Two competing sessions:
• Additive increase gives a slope of 1 as throughput increases
• Multiplicative decrease reduces throughput proportionally

[Figure: Connection 2 throughput (vertical axis, 0 to R) plotted against Connection 1 throughput (horizontal axis, 0 to R), with the "equal bandwidth share" line. The trajectory alternates between two labeled steps: "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]
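The fairness argument above can be checked with a small simulation. This is only a sketch under simplifying assumptions that the slide does not spell out: both connections see the same RTT, both detect loss in the same round whenever their combined rate exceeds the link rate R, and rates are measured in arbitrary units. The function name `aimd_two_flows` and its parameters are made up for illustration.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Simulate two AIMD flows sharing one link; return the final rates.

    Assumption (hypothetical model): losses are synchronized, so both
    flows halve their rate in any round where x1 + x2 exceeds capacity.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease, both rates shrink by a factor of 2.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase of 1 unit per round.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


if __name__ == "__main__":
    # Start far from the equal-share line: flow 2 has 8x the rate of flow 1.
    final1, final2 = aimd_two_flows(10.0, 80.0, capacity=100.0, rounds=1000)
    print(f"flow 1: {final1:.3f}, flow 2: {final2:.3f}")
```

Additive increase leaves the gap between the two rates unchanged, while each multiplicative decrease halves it, so in this model the two rates drift toward the equal bandwidth share line regardless of the starting point.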


Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP connection, gets rate R/10
  • new app asks for 11 TCP connections, gets roughly R/2 (11R/20)
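The arithmetic behind the parallel-connection example can be written out as follows. This is a rough sketch assuming every TCP connection on the link ends up with an equal share of R; the helper name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app that opens
    `new_connections` TCP connections on a link that already carries
    `existing_connections`, assuming all connections share R equally."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


if __name__ == "__main__":
    print(app_share(9, 1))   # share of R with one new connection
    print(app_share(9, 11))  # share of R with eleven new connections
```

With 9 existing connections, one new connection gets 1/10 of R, while 11 new connections get 11/20 of R, which is the "R/2" figure quoted on the slide, rounded.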


TCP Latency Modeling

Notation, assumptions:
• Assume one link between client and server, of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size:
• First assume a fixed congestion window of W segments
• Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W): Two cases

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
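The two cases can be folded into one function by clamping the per-window wait at zero. The sketch below assumes that K is the number of windows needed to cover the object, K = ceil(O / (W S)); the slide does not define K at this point, so treat that as an assumption. The function name and the sample values are hypothetical.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an O-bit object with a fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    """
    # Assumption: K = number of windows that cover the object.
    K = math.ceil(O / (W * S))
    # Wait for the first ACK after each window; zero when WS/R > RTT + S/R.
    stall = max(0.0, S / R + RTT - W * S / R)
    return 2 * RTT + O / R + (K - 1) * stall

# Hypothetical values: S/R = 1 s, RTT = 3 s, object of 8 segments.
print(fixed_window_latency(8000, 1000, 1000, 3, W=5))  # case 1: 14.0
print(fixed_window_latency(8000, 1000, 1000, 3, W=2))  # case 2: 20.0
```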


Fixed congestion window (1)

First case: WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent

latency = 2RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending window's worth of data

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram with time at client and time at server. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection estab and request
- O/R to transmit object
- time server idles due to slow start

Server idles: P = min{K-1, Q} times
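A small sketch of how K, Q, P and the latency could be computed for this example. The slide gives only O/S = 15, K = 4 and Q = 2; the concrete values S = 1000 bits, R = 1000 bits/s and RTT = 2.5 s are hypothetical, chosen so that Q comes out as 2. Here K and Q are found by direct search from their definitions, and the function name is an assumption.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) under slow start with no loss and no threshold."""
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of windows k with positive idle time S/R + RTT - 2^(k-1) S/R,
    # i.e. the number of stalls if the object were infinitely large.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# Hypothetical values: O/S = 15 segments, S/R = 1 s, RTT = 2.5 s.
print(slow_start_latency(15000, 1000, 1000, 2.5))  # (4, 2, 2, 24.0)
```

With these values the 24 s total breaks down as 5 s for the two RTTs, 15 s to transmit the object, and 4 s of server idle time (2.5 s after the first window and 1.5 s after the second).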


TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: slow-start timing diagram (time at client, time at server) with windows of S/R, 2S/R, 4S/R and 8S/R, repeated from Slow Start (2).]
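The derivation can be checked numerically by adding up the delay window by window rather than using the closed form. The sketch below is such a walk-through under the same assumptions (one link, no loss, window doubling every round); the function name and the sample values are hypothetical.

```python
def simulate_slow_start(O, S, R, RTT):
    """Accumulate the delay window by window, without the closed-form expression."""
    total = 2 * RTT          # connection establishment plus request
    remaining = O            # bits still to be sent
    k = 1
    while remaining > 0:
        window_bits = min(2 ** (k - 1) * S, remaining)
        total += window_bits / R             # time to transmit the kth window
        remaining -= window_bits
        if remaining > 0:
            # Idle time after the kth window, clamped at zero: [.]^+
            total += max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)
        k += 1
    return total

# Hypothetical values: 15 segments, S/R = 1 s, RTT = 2.5 s.
print(simulate_slow_start(15000, 1000, 1000, 2.5))  # 24.0
```

For these values P = 2, so the closed form gives 15 + 5 + 2(3.5) - 3(1) = 24 s, matching the window-by-window total.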


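TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.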
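As a quick check of the last step, the sketch below computes K both by searching for the smallest k and by the logarithm formula. Both function names are hypothetical; for very large O/S the search is the safer choice, since the logarithm is evaluated in floating point.

```python
import math

def windows_by_search(O, S):
    """Smallest k with S + 2S + ... + 2^(k-1) S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S
        k += 1
    return k

def windows_by_formula(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))

# O/S = 15 segments needs windows of 1 + 2 + 4 + 8 segments, so K = 4.
print(windows_by_search(15, 1), windows_by_formula(15, 1))  # 4 4
```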


HTTP Modeling
Assume Web page consists of:
1 base HTML page (of size O bits)
M images (each of size O bits)

Non-persistent HTTP:
M+1 TCP connections in series
Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
2 RTT to request and receive base HTML file
1 RTT to request and receive M images
Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
Suppose M/X integer
1 TCP connection for base file
M/X sets of parallel connections for images
Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
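The three response-time formulas can be compared numerically. The following is a minimal sketch, not part of the original slides: the function name `http_response_times` is hypothetical, the "sum of idle times" term is assumed to be zero (so the results are lower bounds), and 1 Kbyte is taken as 8000 bits. The sample values (RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5) are those used in the response-time plots, evaluated here at R = 1 Mbps.

```python
def http_response_times(O_bits, R_bps, rtt_s, M, X):
    """Idle-free response-time estimates (seconds) for the three HTTP schemes.

    Assumption: the "sum of idle times" term is taken as zero,
    so each value is a lower bound on the modeled response time.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O_bits / R_bps  # (M+1)O/R, common to all three schemes
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt_s,
        "persistent": transmit + 3 * rtt_s,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * rtt_s,
    }


# Sample values: O = 5 Kbytes (assumed 8000 bits per Kbyte), R = 1 Mbps,
# RTT = 100 msec, M = 10 images, X = 5 parallel connections.
times = http_response_times(O_bits=5 * 8000, R_bps=1_000_000, rtt_s=0.1, M=10, X=5)
for scheme, seconds in times.items():
    print(f"{scheme}: {seconds:.2f} s")
```

With these inputs the estimates come out at roughly 2.64 s (non-persistent), 0.74 s (persistent) and 1.04 s (parallel non-persistent). The transmission term is the same in all three; only the number of RTTs differs.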


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis from 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis from 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.
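The shift from transmission-dominated to RTT-dominated response time can be illustrated with the non-persistent formula. This is a rough sketch under stated assumptions: the helper `rtt_fraction` is hypothetical, idle times (and therefore slow start delays) are left out, and 1 Kbyte is taken as 8000 bits. It computes what share of (M+1)O/R + (M+1)2RTT is due to the RTT term.

```python
def rtt_fraction(rate_bps, rtt_s, object_bits=5 * 8000, num_images=10):
    """Share of the idle-free non-persistent response time spent on RTTs.

    Assumption: idle times are ignored, so slow start delays are not counted.
    """
    transmit = (num_images + 1) * object_bits / rate_bps  # (M+1)O/R
    round_trips = (num_images + 1) * 2 * rtt_s            # (M+1)2RTT
    return round_trips / (transmit + round_trips)


# Compare the two RTT values from the plots at the slowest and fastest link rates.
for rtt_s in (0.1, 1.0):
    for rate_bps in (28_000, 10_000_000):
        share = rtt_fraction(rate_bps, rtt_s)
        print(f"RTT = {rtt_s} s, R = {rate_bps} bps: RTT share = {share:.2f}")
```

At 28 Kbps the RTT term is roughly 12% of the total when RTT = 100 msec and roughly 58% when RTT = 1 sec. At 10 Mbps it exceeds 95% in both cases, which is where reducing the number of round trips pays off most.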


Chapter 3 Summary
principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                                        • Approaches towards congestion control
                                                                                                                                                                                                                                        • Case study ATM ABR congestion control
                                                                                                                                                                                                                                        • Case study ATM ABR congestion control
                                                                                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                                                                                        • TCP Congestion Control
                                                                                                                                                                                                                                        • TCP AIMD
                                                                                                                                                                                                                                        • TCP Slow Start
                                                                                                                                                                                                                                        • TCP Slow Start (more)
                                                                                                                                                                                                                                        • Summary TCP Congestion Control
                                                                                                                                                                                                                                        • The Big Picture
                                                                                                                                                                                                                                        • TCP sender congestion control
                                                                                                                                                                                                                                        • TCP throughput
                                                                                                                                                                                                                                        • TCP Futures
                                                                                                                                                                                                                                        • TCP Fairness
                                                                                                                                                                                                                                        • Why is TCP fair
                                                                                                                                                                                                                                        • Fairness (more)
                                                                                                                                                                                                                                        • TCP Latency Modeling
                                                                                                                                                                                                                                        • Fixed Congestion Window (W)
                                                                                                                                                                                                                                        • Fixed congestion window (1)
                                                                                                                                                                                                                                        • Fixed congestion window (2)
                                                                                                                                                                                                                                        • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                        • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                        • TCP Latency Modeling (3)
                                                                                                                                                                                                                                        • TCP Latency Modeling (4)
                                                                                                                                                                                                                                        • HTTP Modeling
                                                                                                                                                                                                                                        • Chapter 3 Summary

                                                                                                                                                                                                                                          3 Transport Layer 117Comp 361 Spring 2005

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
• Instead use UDP
  - pump audio/video at constant rate, tolerate packet loss
• Current research area: how to keep UDP from congesting the Internet

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets 11R/20, roughly R/2
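The parallel-connection example can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the bottleneck link divides its rate equally among all TCP connections (the idealized fairness model of these slides); the function name `app_share` is hypothetical.

```python
from fractions import Fraction

def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app that opens
    `new_connections` parallel TCP connections on a link already
    carrying `existing_connections`, assuming every connection
    receives an equal share of R."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)

# Link already supports 9 connections, as in the slide's example.
one = app_share(9, 1)      # 1 of 10 connections
eleven = app_share(9, 11)  # 11 of 20 connections
print(one, eleven)
```

Under this assumption the app with 11 connections obtains 11/20 of R, which the slide rounds to R/2.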


TCP Latency Modeling

Notation, assumptions
• Assume one link between client and server, of rate R
• S: MSS (bits)
• O: object size (bits)
• No retransmissions (no loss, no corruption)

Window size
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by
• TCP connection establishment
• data transmission delay
• slow start


Fixed Congestion Window (W)

Two cases (K is the number of windows that cover the object):

1. $WS/R > \mathrm{RTT} + S/R$: ACK for first segment in window returns before window's worth of data sent
   Latency = $2\,\mathrm{RTT} + O/R$

2. $WS/R < \mathrm{RTT} + S/R$: ACK for first segment in window returns after window's worth of data sent
   Latency = $2\,\mathrm{RTT} + O/R + (K-1)\,[S/R + \mathrm{RTT} - WS/R]$


Fixed congestion window (1)

First case: $WS/R > \mathrm{RTT} + S/R$
ACK for first segment in window returns before window's worth of data sent

Latency = $2\,\mathrm{RTT} + O/R$


Fixed congestion window (2)

Second case: $WS/R < \mathrm{RTT} + S/R$
Server must wait for an ACK after sending a window's worth of data

Latency = $2\,\mathrm{RTT} + O/R + (K-1)\,[S/R + \mathrm{RTT} - WS/R]$
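The two fixed-window cases can be folded into one small function. The sketch below is illustrative only: the name `fixed_window_latency` and the numeric values are assumptions, and K is computed as the ceiling of O/(W·S), which is the usual reading of "number of windows that cover the object" but is not spelled out on these slides.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency (seconds) to fetch an object of O bits over a link of
    rate R bits/s, with MSS of S bits, round-trip time RTT seconds and
    a fixed congestion window of W segments."""
    K = math.ceil(O / (W * S))          # assumed: windows covering the object
    base = 2 * RTT + O / R              # connection setup + request + transmission
    stall = S / R + RTT - W * S / R     # per-window wait for the first ACK
    if stall <= 0:
        return base                     # case 1: ACK arrives before window is exhausted
    return base + (K - 1) * stall       # case 2: server stalls after each of K-1 windows

# Hypothetical numbers: S/R = 1 s, RTT = 2 s, object of 15 segments.
small_window = fixed_window_latency(O=15000, S=1000, R=1000, RTT=2.0, W=2)
large_window = fixed_window_latency(O=15000, S=1000, R=1000, RTT=2.0, W=4)
print(small_window, large_window)
```

With W = 2 the server stalls for 1 s after each of the first 7 windows; with W = 4 the window outlasts the ACK round trip and no stalls occur.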


TCP Latency Modeling: Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,\mathrm{RTT} + \frac{O}{R} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^{P} - 1)\,\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q,\ K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: timing diagram, time at client vs. time at server. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered.]

Example
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
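The example's values of K, Q and P can be reproduced numerically. The following is a rough sketch under stated assumptions: the slide gives only O/S = 15, K = 4 and Q = 2, so the concrete S, R and RTT below are hypothetical values chosen to match, and the loops that derive K and Q are one straightforward reading of their definitions rather than the slides' own procedure.

```python
import math

def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start model with no
    threshold and no loss: windows of 1, 2, 4, ... segments."""
    segments = math.ceil(O / S)

    # K: number of windows (1, 2, 4, ...) needed to cover the object.
    K, covered = 0, 0
    while covered < segments:
        covered += 2 ** K
        K += 1

    # Q: number of windows after which the server would idle if the
    # object were infinite, i.e. S/R + RTT - 2^(k-1) * S/R > 0.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# Assumed values: S/R = 1 s and RTT = 2 s give Q = 2 as on the slide.
print(slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0))
```

With these assumed numbers the three delay components are 4 s for the two round trips, 15 s of transmission time and 3 s of server idle time.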


                                                                                                                                                                                                                                          TCP Latency Modeling (3)

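$\frac{S}{R} + \mathrm{RTT}$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\,\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\,\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + \sum_{k=1}^{P} \left[\frac{S}{R} + \mathrm{RTT} - 2^{k-1}\,\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,\mathrm{RTT} + P\left[\mathrm{RTT} + \frac{S}{R}\right] - (2^{P} - 1)\,\frac{S}{R}
\end{aligned}
$$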

                                                                                                                                                                                                                                          RTT

                                                                                                                                                                                                                                          initiate TCPconnection

                                                                                                                                                                                                                                          requestobject

                                                                                                                                                                                                                                          first window= SR

                                                                                                                                                                                                                                          second window= 2SR

                                                                                                                                                                                                                                          third window= 4SR

                                                                                                                                                                                                                                          fourth window= 8SR

                                                                                                                                                                                                                                          completetransmissionobject

                                                                                                                                                                                                                                          delivered

                                                                                                                                                                                                                                          time atclient

                                                                                                                                                                                                                                          time atserver
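The idle-time and delay expressions for slow start can be checked numerically. The following is a minimal sketch in Python, not part of the original slides. The function names and the `num_windows` parameter are hypothetical, and it assumes the server idles after every window except the last one, with each idle time clipped at zero.

```python
def window_idle_time(k, S, R, RTT):
    """Idle time after the k-th window (k = 1, 2, ...).

    Implements [S/R + RTT - 2^(k-1) * S/R]^+ with S in bits,
    R in bits per second and RTT in seconds.
    """
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def slow_start_delay(O, S, R, RTT, num_windows):
    """Delay = O/R + 2*RTT + sum of idle times.

    Assumption: there is no idle period after the final window,
    so idle times are summed over windows 1 .. num_windows - 1.
    """
    idle = sum(window_idle_time(k, S, R, RTT) for k in range(1, num_windows))
    return O / R + 2 * RTT + idle


if __name__ == "__main__":
    # Illustrative values only: S/R = 1 s, RTT = 2 s, object of 15 segments.
    print(slow_start_delay(O=15 * 1024, S=1024, R=1024, RTT=2.0, num_windows=4))
```

Because the idle time is clipped at zero, summing over all windows but the last gives the same result as the closed form that counts only the P windows in which the server actually stalls.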

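TCP Latency Modeling (4)
Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.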
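The derivation of K can be sanity-checked by comparing a direct search for the smallest k against the closed form. This is a small Python sketch under stated assumptions: both function names are hypothetical, and O and S are given in the same unit (for example bits).

```python
import math


def num_windows(O, S):
    """Smallest k such that 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S  # the next window carries 2^k segments
        k += 1
    return k


def num_windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)).

    Floating-point rounding could matter when O/S + 1 is very close
    to a power of two, so the search above is the safer reference.
    """
    return math.ceil(math.log2(O / S + 1))


if __name__ == "__main__":
    for segments in (1, 7, 8, 15, 16):
        print(segments, num_windows(segments, 1), num_windows_closed_form(segments, 1))
```

An object of 7 segments fits exactly in three windows (1 + 2 + 4), while 8 segments need a fourth.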

HTTP Modeling
Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
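The three response-time expressions can be compared side by side. The sketch below is in Python and is not from the original slides. The function name is hypothetical, and it deliberately leaves out the "sum of idle times" term, so it returns only the transmission and round-trip parts of each formula.

```python
def http_response_times(O, R, RTT, M, X):
    """Response time of each HTTP variant, excluding slow-start idle times.

    O: object size in bits, R: link rate in bits per second,
    RTT: round-trip time in seconds, M: number of images,
    X: number of parallel connections (M/X must be an integer).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # base page plus M images
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        "persistent": transmit + 3 * RTT,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }


if __name__ == "__main__":
    # Illustrative values: O = 5 Kbytes taken as 40,000 bits (an assumption
    # about the unit), RTT = 100 msec, M = 10 images, X = 5 connections.
    for rate in (28e3, 100e3, 1e6, 10e6):
        print(rate, http_response_times(O=40_000, R=rate, RTT=0.1, M=10, X=5))
```

Since the transmission term (M+1)O/R is the same in all three cases, the variants differ only in how many round trips they pay for.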

HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3 Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

• Chapter 3 Transport Layer last revised 16/03/05
• Chapter 3 outline
• Transport services and protocols
• Transport vs network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
                                                                                                                                                                                                                                          • Pipelining increased utilization
                                                                                                                                                                                                                                          • Go-Back-N
                                                                                                                                                                                                                                          • GBN Sender
                                                                                                                                                                                                                                          • GBN sender extended FSM
                                                                                                                                                                                                                                          • GBN receiver extended FSM
                                                                                                                                                                                                                                          • More on receiver
                                                                                                                                                                                                                                          • GBN inaction
                                                                                                                                                                                                                                          • Selective Repeat
                                                                                                                                                                                                                                          • Selective repeat sender receiver windows
                                                                                                                                                                                                                                          • Selective repeat
                                                                                                                                                                                                                                          • Selective repeat in action
                                                                                                                                                                                                                                          • Selective repeat dilemma
                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                          • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                                                                          • More TCP Details
                                                                                                                                                                                                                                          • Even More TCP Details
                                                                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                                                                          • TCP seq rsquos and ACKs
                                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                          • Example RTT estimation
                                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                          • TCP reliable data transfer
                                                                                                                                                                                                                                          • TCP sender events
                                                                                                                                                                                                                                          • TCP sender(simplified)
                                                                                                                                                                                                                                          • TCP retransmission scenarios
                                                                                                                                                                                                                                          • TCP retransmission scenarios (more)
                                                                                                                                                                                                                                          • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                                                          • More on Sender Policies
                                                                                                                                                                                                                                          • Fast Retransmit
                                                                                                                                                                                                                                          • Fast retransmit algorithm
                                                                                                                                                                                                                                          • TCP GBN or Selective Repeat
                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                                                                          • TCP Flow control how it works
                                                                                                                                                                                                                                          • Technical Issue
                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                          • TCP Connection Management
                                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                                          • A few special cases
                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                          • Principles of Congestion Control
                                                                                                                                                                                                                                          • Causescosts of congestion scenario 1
                                                                                                                                                                                                                                          • Causescosts of congestion scenario 2
                                                                                                                                                                                                                                          • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                          • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                          • Approaches towards congestion control
                                                                                                                                                                                                                                          • Case study ATM ABR congestion control
                                                                                                                                                                                                                                          • Case study ATM ABR congestion control
                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                          • TCP Congestion Control
                                                                                                                                                                                                                                          • TCP AIMD
                                                                                                                                                                                                                                          • TCP Slow Start
                                                                                                                                                                                                                                          • TCP Slow Start (more)
                                                                                                                                                                                                                                          • Summary TCP Congestion Control
                                                                                                                                                                                                                                          • The Big Picture
                                                                                                                                                                                                                                          • TCP sender congestion control
                                                                                                                                                                                                                                          • TCP throughput
                                                                                                                                                                                                                                          • TCP Futures
                                                                                                                                                                                                                                          • TCP Fairness
                                                                                                                                                                                                                                          • Why is TCP fair
                                                                                                                                                                                                                                          • Fairness (more)
                                                                                                                                                                                                                                          • TCP Latency Modeling
                                                                                                                                                                                                                                          • Fixed Congestion Window (W)
                                                                                                                                                                                                                                          • Fixed congestion window (1)
                                                                                                                                                                                                                                          • Fixed congestion window (2)
                                                                                                                                                                                                                                          • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                          • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                          • TCP Latency Modeling (3)
                                                                                                                                                                                                                                          • TCP Latency Modeling (4)
                                                                                                                                                                                                                                          • HTTP Modeling
                                                                                                                                                                                                                                          • Chapter 3 Summary


TCP Latency Modeling: Notation, assumptions

Assume one link between client and server, of rate R
S: MSS (bits)
O: object size (bits)
No retransmissions (no loss, no corruption)

Window size:
First assume a fixed congestion window of W segments
Then a dynamic window, modeling slow start

Q: How long does it take to completely receive an object from a Web server after sending a request? This is known as the latency of the (request for the) object.

Ignoring congestion, delay is influenced by:
TCP connection establishment
data transmission delay
slow start


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window's worth of data has been sent.
Latency = 2 RTT + O/R

2. WS/R < RTT + S/R: the ACK for the first segment in the window returns after a window's worth of data has been sent.
Latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
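The two cases can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `fixed_window_latency` is hypothetical, and it assumes K = ceil(O / (W·S)), i.e. the number of windows that cover the object, since K is not defined on this slide.

```python
import math


def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds for an object of O bits, fixed window of W segments.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    """
    # Assumed definition: K is the number of windows that cover the object.
    K = math.ceil(O / (W * S))
    # Idle time at the server after each window (only relevant if positive).
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        # Case 1: the first ACK returns before the window is exhausted,
        # so the server sends continuously.
        return 2 * RTT + O / R
    # Case 2: the server stalls after each window except the last.
    return 2 * RTT + O / R + (K - 1) * stall


if __name__ == "__main__":
    # Illustrative numbers only: 100 kbit object, 10 kbit segments, 1 Mbit/s link.
    print(fixed_window_latency(100_000, 10_000, 1e6, 0.1, W=2))
    print(fixed_window_latency(100_000, 10_000, 1e6, 0.1, W=20))
```

With a small window the stall term dominates; once W is large enough that WS/R exceeds RTT + S/R, the latency collapses to 2 RTT + O/R.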


Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

latency = 2 RTT + O/R


Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server must wait for an ACK after sending a window's worth of data.

latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
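The stall term in the second case follows from comparing two instants. The sketch below is a reconstruction under stated assumptions: K is taken to be the number of windows that cover the object (the slide does not define it here), and the 2 RTT term is read as one RTT for connection setup plus one for the request.

```latex
% The server starts sending a window at time t.
% It finishes the window at t + WS/R.
% The ACK for the first segment arrives at t + S/R + RTT.
\begin{align*}
\text{stall per window}
  &= \left(t + \frac{S}{R} + RTT\right) - \left(t + \frac{WS}{R}\right)
   = \frac{S}{R} + RTT - \frac{WS}{R} \\[4pt]
% The server stalls after every window except the last one.
\text{latency}
  &= \underbrace{2\,RTT}_{\text{setup + request}}
   + \underbrace{\frac{O}{R}}_{\text{transmission}}
   + (K-1)\left[\frac{S}{R} + RTT - \frac{WS}{R}\right]
\end{align*}
```

When WS/R > RTT + S/R the stall would be negative, meaning there is no idle time, which gives the first case.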


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\; K - 1\}$$

- where Q is the number of times the server idles if the object were of infinite size

- and K is the number of windows that cover the object
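The formula can be evaluated directly once K and Q are computed. The following is a minimal sketch, not taken from the slides: the function name `slow_start_latency` is hypothetical, and it assumes that windows double starting from one segment (so k windows carry 2^k - 1 segments) and that the stall after the k-th window is S/R + RTT - 2^(k-1)·S/R. Q is counted by a loop over windows with a strictly positive stall; a zero-length stall does not change the latency.

```python
import math


def slow_start_latency(O, S, R, RTT):
    """Return (latency_seconds, K, Q, P) for an object of O bits under slow start.

    O: object size (bits), S: MSS (bits), R: link rate (bits/s), RTT: seconds.
    Assumes no threshold and no loss, as on the slide.
    """
    segments = math.ceil(O / S)

    # K: smallest number of windows whose sizes 1, 2, 4, ... cover the object,
    # i.e. the smallest k with 2^k - 1 >= O/S (assumed definition).
    K = 1
    while 2 ** K - 1 < segments:
        K += 1

    # Q: number of windows after which the server would idle if the object
    # were infinite. The stall after window k (k = Q + 1 in this loop) is
    # S/R + RTT - 2^(k-1) * S/R.
    Q = 0
    while S / R + RTT - 2 ** Q * S / R > 0:
        Q += 1

    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P


if __name__ == "__main__":
    # Illustrative numbers only: O/S = 15 segments, S/R = 10 ms, RTT = 20 ms.
    print(slow_start_latency(150_000, 10_000, 1e6, 0.02))
```

With O/S = 15 segments and S/R < RTT < 3 S/R, this sketch gives K = 4, Q = 2 and therefore P = 2.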


                                                                                                                                                                                                                                            TCP Latency Modeling Slow Start (2)

                                                                                                                                                                                                                                            RTT

                                                                                                                                                                                                                                            initiate TCPconnection

                                                                                                                                                                                                                                            requestobject

                                                                                                                                                                                                                                            first window= SR

                                                                                                                                                                                                                                            second window= 2SR

                                                                                                                                                                                                                                            third window= 4SR

                                                                                                                                                                                                                                            fourth window= 8SR

                                                                                                                                                                                                                                            completetransmissionobject

                                                                                                                                                                                                                                            delivered

                                                                                                                                                                                                                                            time atclient

                                                                                                                                                                                                                                            time atserver

                                                                                                                                                                                                                                            Examplebull OS = 15 segmentsbull K = 4 windowsbull Q = 2bull P = minK-1Q = 2

                                                                                                                                                                                                                                            Server idles P=2 times

                                                                                                                                                                                                                                            Delay componentsbull 2 RTT for connection estab and requestbull OR to transmit objectbull time server idles due to slow start

                                                                                                                                                                                                                                            Server idles P = minK-1Q times
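The example can be checked numerically against the closed-form latency expression. The following is a minimal sketch, not part of the original slides: the function name is hypothetical, time is measured in units of S/R, and RTT = 1.5 S/R is an assumed value chosen only because it is consistent with Q = 2.

```python
def slow_start_latency(o_over_r, s_over_r, rtt, k_windows, q_idles):
    """Closed form: 2*RTT + O/R + P*(RTT + S/R) - (2**P - 1)*S/R,
    with P = min(K - 1, Q)."""
    p = min(k_windows - 1, q_idles)
    return 2 * rtt + o_over_r + p * (rtt + s_over_r) - (2 ** p - 1) * s_over_r


# Values from the example: O/S = 15 segments, K = 4 windows, Q = 2.
s_over_r = 1.0             # assumed time unit: one segment transmission time
rtt = 1.5                  # assumed RTT, in units of S/R (not given on the slide)
o_over_r = 15 * s_over_r   # 15 segments, so O/R = 15 * S/R

latency = slow_start_latency(o_over_r, s_over_r, rtt, k_windows=4, q_idles=2)
print(f"P = {min(4 - 1, 2)}, latency = {latency} (in units of S/R)")
```

Under these assumptions the two idle periods add 2 S/R on top of the 2 RTT + O/R that would be needed without slow start.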

TCP Latency Modeling (3)

$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: the same client-server slow-start timing diagram as on the previous slide (windows of S/R, 2S/R, 4S/R and 8S/R; time at client; time at server).]
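The per-window view of the delay can also be evaluated directly, without the closed form. The sketch below uses hypothetical helper names and assumes the idle time after each window is clamped at zero, with no idle period after the last of the K windows. It sums the idle time window by window.

```python
def idle_after_window(k, s_over_r, rtt):
    """Idle time after the k-th window (k = 1, 2, ...): [S/R + RTT - 2^(k-1) S/R]^+."""
    return max(0.0, s_over_r + rtt - 2 ** (k - 1) * s_over_r)


def latency_by_summation(o_over_r, s_over_r, rtt, k_windows):
    """delay = O/R + 2*RTT + sum of idle times after windows 1 .. K-1."""
    # The server cannot idle after the final window, so stop at K - 1.
    idle_total = sum(idle_after_window(k, s_over_r, rtt)
                     for k in range(1, k_windows))
    return o_over_r + 2 * rtt + idle_total


# Assumed inputs: S/R = 1 time unit, RTT = 1.5 units, O/R = 15 units, K = 4.
print(latency_by_summation(15.0, 1.0, 1.5, 4))
```

Because windows whose idle time would be negative contribute zero, only the first min{K-1, Q} windows add to the sum, which is why the closed form uses P rather than K-1.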

TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

The calculation of Q, the number of idles for an infinite-size object, is similar.
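Both counts can be found by direct search, which avoids floating-point logarithms. This is a sketch with hypothetical function names. It assumes sizes are given in bits, and that a window counts toward Q only when its idle time is strictly positive; the slides do not say how a zero idle time is treated.

```python
import math


def windows_to_cover(object_bits, segment_bits):
    """K: smallest k such that 2^0 + 2^1 + ... + 2^(k-1) >= O/S."""
    segments = math.ceil(object_bits / segment_bits)
    k, sent = 0, 0
    while sent < segments:
        sent += 2 ** k   # the (k+1)-th window carries 2^k segments
        k += 1
    return k


def idle_count_infinite(s_over_r, rtt):
    """Q: number of windows followed by a positive idle time if the
    object were infinitely large (requires s_over_r > 0)."""
    q = 0
    # The window with index q + 1 takes 2^q * S/R to transmit.
    while s_over_r + rtt - 2 ** q * s_over_r > 0:
        q += 1
    return q


# O/S = 15 segments (536-byte segments are an assumed size), RTT = 1.5 S/R.
k = windows_to_cover(15 * 536 * 8, 536 * 8)
q = idle_count_infinite(1.0, 1.5)
print(k, q, min(k - 1, q))
```

For whole numbers of segments the search result for K agrees with the ceiling-of-logarithm expression, since 2^k - 1 >= O/S is exactly the loop's stopping condition.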

HTTP Modeling

Assume a Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive the base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X is an integer
- 1 TCP connection for the base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
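The three response-time expressions differ only in the RTT term, which a short calculation makes visible. The sketch below uses hypothetical function names and leaves the slow-start idle time as a caller-supplied parameter (default 0). The sample values assume R = 1 Mbps and 1 Kbyte = 1000 bytes.

```python
def nonpersistent(m, o_over_r, rtt, idle_total=0.0):
    """(M+1) O/R + (M+1) 2 RTT + sum of idle times."""
    return (m + 1) * o_over_r + (m + 1) * 2 * rtt + idle_total


def persistent(m, o_over_r, rtt, idle_total=0.0):
    """(M+1) O/R + 3 RTT + sum of idle times."""
    return (m + 1) * o_over_r + 3 * rtt + idle_total


def parallel_nonpersistent(m, x, o_over_r, rtt, idle_total=0.0):
    """(M+1) O/R + (M/X + 1) 2 RTT + sum of idle times; M/X must be an integer."""
    if m % x != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (m + 1) * o_over_r + (m // x + 1) * 2 * rtt + idle_total


# Assumed sample values: O = 5 Kbytes = 40,000 bits, R = 1 Mbps, RTT = 100 msec.
o_over_r = 40_000 / 1_000_000   # seconds to transmit one object
rtt = 0.1                       # seconds
for name, t in [
    ("non-persistent", nonpersistent(10, o_over_r, rtt)),
    ("persistent", persistent(10, o_over_r, rtt)),
    ("parallel non-persistent", parallel_nonpersistent(10, 5, o_over_r, rtt)),
]:
    print(f"{name}: {t:.2f} s (idle times ignored)")
```

The transmission term (M+1)O/R is identical in all three cases, so the schemes separate only as RTT grows relative to O/R.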

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (vertical axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.

                                                                                                                                                                                                                                            3 Transport Layer 128Comp 361 Spring 2005

                                                                                                                                                                                                                                            HTTP Response time (in seconds)

                                                                                                                                                                                                                                            0

                                                                                                                                                                                                                                            10

                                                                                                                                                                                                                                            20

                                                                                                                                                                                                                                            30

                                                                                                                                                                                                                                            40

                                                                                                                                                                                                                                            50

                                                                                                                                                                                                                                            60

                                                                                                                                                                                                                                            70

                                                                                                                                                                                                                                            28Kbps

                                                                                                                                                                                                                                            100Kbps

                                                                                                                                                                                                                                            1 Mbps 10Mbps

                                                                                                                                                                                                                                            non-persistent

                                                                                                                                                                                                                                            persistent

                                                                                                                                                                                                                                            parallel non-persistent

                                                                                                                                                                                                                                            RTT =1 sec O = 5 Kbytes M=10 and X=5

For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay•bandwidth product.
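The comparison above can be reproduced roughly with a few lines of Python. This is a minimal sketch, not the slides' own calculation: it assumes a page of one base file plus M images, each of O bytes, assumes 2 RTT per new TCP connection (setup plus request/response), and ignores slow-start idle time, so every figure is a lower bound. The function name `http_response_times` and the reading of "5 Kbytes" as 5000 bytes are illustrative assumptions.

```python
def http_response_times(rate_bps, rtt_s, obj_bytes, m_images, x_parallel):
    """Return (non_persistent, persistent, parallel_non_persistent) in seconds.

    Assumed simplified model: slow-start idle time is ignored, so each
    value is a lower bound on the real response time.
    """
    obj_bits = 8 * obj_bytes
    # Time to push the base page plus M images onto the link.
    transmit = (m_images + 1) * obj_bits / rate_bps
    # Non-persistent: M+1 connections in series, 2 RTT each.
    non_persistent = transmit + (m_images + 1) * 2 * rtt_s
    # Persistent: 2 RTT for the base file, 1 more RTT for all images.
    persistent = transmit + 3 * rtt_s
    # Parallel non-persistent: 1 connection for the base file, then
    # M/X rounds of X parallel connections (M/X assumed to be an integer).
    parallel = transmit + (m_images / x_parallel + 1) * 2 * rtt_s
    return non_persistent, persistent, parallel


if __name__ == "__main__":
    # Parameters from the figure: RTT = 1 sec, O = 5 Kbytes, M = 10, X = 5.
    for rate in (28e3, 100e3, 1e6, 10e6):
        np_, p, par = http_response_times(rate, 1.0, 5000, 10, 5)
        print(f"{rate / 1e3:>7.0f} Kbps  non-persistent {np_:6.2f} s  "
              f"persistent {p:6.2f} s  parallel {par:6.2f} s")
```

Under these assumptions the RTT terms (22, 3, and 6 seconds respectively) swamp the transmission time at high link rates, which is why the persistent connection wins most clearly there.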


Chapter 3: Summary

principles behind transport layer services:
    multiplexing, demultiplexing
    reliable data transfer
    flow control
    congestion control

instantiation and implementation in the Internet:
    UDP
    TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
• TCP Latency Modeling (3)
• TCP Latency Modeling (4)
• HTTP Modeling
• Chapter 3 Summary


Fixed Congestion Window (W)

Two cases:

1. WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
   Latency = 2RTT + O/R

2. WS/R < RTT + S/R: ACK for first segment in window returns after window's worth of data sent
   Latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]


                                                                                                                                                                                                                                              Fixed congestion window (1)

First case: WS/R > RTT + S/R. ACK for first segment in window returns before window's worth of data sent.

latency = 2RTT + O/R


                                                                                                                                                                                                                                              Fixed congestion window (2)

Second case: WS/R < RTT + S/R. Server waits for ACK after sending window's worth of data.

latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
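
The two fixed-window cases can be folded into one small function. This is a minimal sketch, assuming O is the object size in bits, S the segment size in bits, R the link rate in bits per second, and W the window size in segments; the function name and the rounding of K up to a whole number of windows are assumptions made for illustration.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds to fetch an O-bit object with a fixed window of W segments."""
    K = math.ceil(O / (W * S))        # assumed: windows needed to cover the object
    stall = S / R + RTT - W * S / R   # wait per window; positive only in the second case
    base = 2 * RTT + O / R            # connection setup + request, then transmission
    if stall <= 0:
        return base                   # first case: ACKs return before the window is exhausted
    return base + (K - 1) * stall     # second case: server stalls after each window but the last

if __name__ == "__main__":
    # Hypothetical numbers: 8000-bit object, 1000-bit segments, 1 Mbps link, 100 ms RTT.
    print(fixed_window_latency(8000, 1000, 1_000_000, 0.1, W=2))
    print(fixed_window_latency(8000, 1000, 1_000_000, 0.1, W=200))
```

With W = 2 the server stalls after each of the first three windows; with W = 200 the whole object fits in one window and only the 2RTT + O/R term remains.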


                                                                                                                                                                                                                                              TCP Latency Modeling Slow Start (1)

Now suppose window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is

$$\text{Latency} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server: $P = \min\{Q, K-1\}$

- where Q is the number of times the server idles if the object were of infinite size
- and K is the number of windows that cover the object
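
The slow-start latency expression on this slide translates directly into code. The sketch below is illustrative only: the function name is hypothetical, and K and Q are passed in as given values rather than computed.

```python
def slow_start_latency(O, S, R, RTT, K, Q):
    """Delay for one object under slow start (no threshold, no loss events).

    K: number of windows that cover the object.
    Q: number of times the server would idle for an infinite-size object.
    """
    P = min(Q, K - 1)                 # number of times TCP actually idles at the server
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R

if __name__ == "__main__":
    # Hypothetical units: S/R = 1 time unit, RTT = 2 time units, object of 15 segments.
    print(slow_start_latency(O=15, S=1, R=1, RTT=2, K=4, Q=2))
```

With these assumed values, P = 2 and the four terms are 4 + 15 + 6 - 3.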


                                                                                                                                                                                                                                              TCP Latency Modeling Slow Start (2)

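[Figure: slow-start timing diagram. The client initiates the TCP connection (one RTT) and requests the object (one RTT); the server then sends the first window (S/R), second window (2S/R), third window (4S/R), and fourth window (8S/R), after which transmission is complete and the object is delivered. Axes: time at client, time at server.]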

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
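
The example can be replayed window by window. The sketch below is a hedged illustration: the slide does not give RTT or S/R, so the values used here (S/R = 1 time unit, RTT = 2 time units) are assumptions chosen so that Q = 2, and the function name is hypothetical.

```python
def slow_start_timeline(num_segments, seg_time, rtt):
    """Return (window_sizes, idle_times) for a loss-free slow-start transfer.

    seg_time is S/R, the time to transmit one segment.
    An idle time is recorded after every window except the last.
    """
    windows, idles = [], []
    sent, k = 0, 1
    while sent < num_segments:
        size = min(2 ** (k - 1), num_segments - sent)   # kth window holds up to 2^(k-1) segments
        windows.append(size)
        sent += size
        if sent < num_segments:
            # The server idles only if the ACK returns after the window has been sent.
            idles.append(max(0.0, seg_time + rtt - 2 ** (k - 1) * seg_time))
        k += 1
    return windows, idles

if __name__ == "__main__":
    windows, idles = slow_start_timeline(15, seg_time=1, rtt=2)
    print(windows, idles)
    print("server idles", sum(1 for t in idles if t > 0), "times")
```

Under these assumptions the object needs four windows and the server idles after the first two, matching K = 4 and P = 2.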


                                                                                                                                                                                                                                              TCP Latency Modeling (3)

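$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$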
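
The final step of this derivation uses the geometric series $\sum_{k=1}^{P} 2^{k-1} = 2^{P} - 1$. Spelled out as a worked step (assuming, as the definition of P implies, that the idle time is positive for each of the first P windows, so the $[\cdot]^{+}$ can be dropped):

```latex
\sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]
  = P\left[RTT + \frac{S}{R}\right] - \frac{S}{R}\sum_{k=1}^{P} 2^{k-1}
  = P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
```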



                                                                                                                                                                                                                                              TCP Latency Modeling (4)Recall K = number of windows that cover object

                                                                                                                                                                                                                                              How do we calculate K

                                                                                                                                                                                                                                              ⎥⎥⎤

                                                                                                                                                                                                                                              ⎢⎢⎡ +=

                                                                                                                                                                                                                                              +ge=

                                                                                                                                                                                                                                              geminus=

                                                                                                                                                                                                                                              ge+++=

                                                                                                                                                                                                                                              ge+++=minus

                                                                                                                                                                                                                                              minus

                                                                                                                                                                                                                                              )1(log

                                                                                                                                                                                                                                              )1(logmin

                                                                                                                                                                                                                                              12min

                                                                                                                                                                                                                                              222min222min

                                                                                                                                                                                                                                              2

                                                                                                                                                                                                                                              2

                                                                                                                                                                                                                                              110

                                                                                                                                                                                                                                              110

                                                                                                                                                                                                                                              SO

                                                                                                                                                                                                                                              SOkk

                                                                                                                                                                                                                                              SOk

                                                                                                                                                                                                                                              SOkOSSSkK

                                                                                                                                                                                                                                              k

                                                                                                                                                                                                                                              k

                                                                                                                                                                                                                                              k

                                                                                                                                                                                                                                              L

                                                                                                                                                                                                                                              L

Calculation of Q, number of idles for infinite-size object, is similar
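A minimal sketch of the window count K, assuming the slow-start delay model these slides appear to follow: O is the object size in bits, S is the segment size in bits, and the k-th window carries 2^(k-1) segments. Under those assumptions K is the smallest k whose windows together cover the object, which has the closed form ceil(log2(O/S + 1)). The function names are hypothetical and chosen only for illustration.

```python
import math


def windows_by_summation(O_bits, S_bits):
    """Smallest k such that S*(2**0 + 2**1 + ... + 2**(k-1)) >= O."""
    k = 0
    sent = 0
    while sent < O_bits:
        sent += (2 ** k) * S_bits  # window k+1 carries 2**k segments
        k += 1
    return k


def windows_closed_form(O_bits, S_bits):
    """Closed form of the same quantity: ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O_bits / S_bits + 1))


# Example values (assumed, not from the slides): 5000-byte object, 536-byte segments
O_bits = 5000 * 8
S_bits = 536 * 8
print(windows_by_summation(O_bits, S_bits), windows_closed_form(O_bits, S_bits))
```

The summation version follows the definition directly, so it is a handy check on the closed form for small inputs.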

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
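The three response-time formulas can be compared numerically. The sketch below is illustrative only: the function name is hypothetical, R is taken to be the link rate in bits per second, and the "sum of idle times" term is passed in as a single value that defaults to 0. That default is a simplifying assumption; in the full model the idle term depends on slow start and can differ between the three cases.

```python
def http_response_times(M, O_bits, R_bps, rtt_s, X, idle_s=0.0):
    """Return response times in seconds for the three HTTP models.

    idle_s stands in for the "sum of idle times" term; 0 is an assumption
    made for this sketch, not a value given on the slide.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O_bits / R_bps  # time to transmit all M+1 objects
    return {
        "non-persistent": transmit + (M + 1) * 2 * rtt_s + idle_s,
        "persistent": transmit + 3 * rtt_s + idle_s,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * rtt_s + idle_s,
    }


# Example: M = 10 images, O = 5 Kbytes (taken here as 5 * 1000 * 8 bits),
# X = 5 parallel connections, RTT = 100 ms, over several link rates.
for rate in (28e3, 100e3, 1e6, 10e6):
    times = http_response_times(M=10, O_bits=40_000, R_bps=rate, rtt_s=0.1, X=5)
    print(f"{rate:>12,.0f} bps", {name: round(t, 2) for name, t in times.items()})
```

With the idle term set to zero, the three cases differ only in the RTT term, so the gap between them is independent of the link rate while the shared transmission term shrinks as R grows.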

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.

HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay•bandwidth networks.

Chapter 3 Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont)
                                                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                                                              • A few special cases
                                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                                              • Principles of Congestion Control
                                                                                                                                                                                                                                              • Causescosts of congestion scenario 1
                                                                                                                                                                                                                                              • Causescosts of congestion scenario 2
                                                                                                                                                                                                                                              • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                              • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                              • Approaches towards congestion control
                                                                                                                                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                                              • TCP Congestion Control
                                                                                                                                                                                                                                              • TCP AIMD
                                                                                                                                                                                                                                              • TCP Slow Start
                                                                                                                                                                                                                                              • TCP Slow Start (more)
                                                                                                                                                                                                                                              • Summary TCP Congestion Control
                                                                                                                                                                                                                                              • The Big Picture
                                                                                                                                                                                                                                              • TCP sender congestion control
                                                                                                                                                                                                                                              • TCP throughput
                                                                                                                                                                                                                                              • TCP Futures
                                                                                                                                                                                                                                              • TCP Fairness
                                                                                                                                                                                                                                              • Why is TCP fair
                                                                                                                                                                                                                                              • Fairness (more)
                                                                                                                                                                                                                                              • TCP Latency Modeling
                                                                                                                                                                                                                                              • Fixed Congestion Window (W)
                                                                                                                                                                                                                                              • Fixed congestion window (1)
                                                                                                                                                                                                                                              • Fixed congestion window (2)
                                                                                                                                                                                                                                              • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                              • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                              • TCP Latency Modeling (3)
                                                                                                                                                                                                                                              • TCP Latency Modeling (4)
                                                                                                                                                                                                                                              • HTTP Modeling
                                                                                                                                                                                                                                              • Chapter 3 Summary


                                                                                                                                                                                                                                                Fixed congestion window (1)

First case: WS/R > RTT + S/R. The ACK for the first segment in the window returns before a window's worth of data has been sent.

$$\text{latency} = 2\,RTT + \frac{O}{R}$$


                                                                                                                                                                                                                                                Fixed congestion window (2)

Second case: WS/R < RTT + S/R. The server must wait for an ACK after sending a window's worth of data.

$$\text{latency} = 2\,RTT + \frac{O}{R} + (K-1)\left[\frac{S}{R} + RTT - \frac{WS}{R}\right]$$
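The two fixed-window cases can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name and the sample numbers are made up for illustration, and it assumes K = ceil(O / (W·S)), which is one reading of "the number of windows that cover the object" for a fixed window of W segments.

```python
import math

def fixed_window_latency(O, S, R, W, RTT):
    """Latency for an object of O bits sent with a fixed window of W segments.

    O: object size (bits), S: segment size (bits), R: link rate (bits/s),
    W: window size (segments), RTT: round-trip time (s).
    """
    # Assumption: K is the number of W-segment windows needed to cover the object.
    K = math.ceil(O / (W * S))
    # Time the server waits after each window before the first ACK arrives.
    stall = S / R + RTT - W * S / R
    base = 2 * RTT + O / R
    if stall <= 0:
        # First case: WS/R >= RTT + S/R, so the server never idles.
        return base
    # Second case: the server idles once after each of the first K-1 windows.
    return base + (K - 1) * stall

# Illustrative numbers (assumed): S/R = 1 s, RTT = 2 s, object of 8 segments.
print(fixed_window_latency(8000, 1000, 1000, 4, 2))  # first case  -> 12.0
print(fixed_window_latency(8000, 1000, 1000, 2, 2))  # second case -> 15.0
```

With W = 4 the window takes 4 s to send, longer than RTT + S/R = 3 s, so only 2RTT + O/R remains. With W = 2 the server stalls 1 s after each of the first three windows.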


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

Will show that the delay for one object is

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size

- and K is the number of windows that cover the object


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timeline. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.
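The example can be reproduced with a short script. This is a sketch under stated assumptions rather than anything from the slides: the function names are hypothetical, K and Q are found by direct counting instead of a closed-form expression, and R = 1000 bits/s with RTT = 2 s are chosen only so that Q comes out as 2, as in the example.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for slow start with no threshold and no loss.

    O and S are integer sizes in bits, R is the rate in bits/s, RTT in seconds.
    """
    segments = -(-O // S)  # ceil(O / S) using integer arithmetic
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K, covered = 0, 0
    while covered < segments:
        covered += 2 ** K
        K += 1
    # Q: number of windows after which the server would still idle
    # if the object were infinitely large (idle time strictly positive).
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    P = min(K - 1, Q)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

def slow_start_latency_by_sum(O, S, R, RTT, K):
    """Same latency, computed by adding the idle time after windows 1..K-1."""
    stalls = (max(0.0, S / R + RTT - (2 ** (k - 1)) * S / R) for k in range(1, K))
    return 2 * RTT + O / R + sum(stalls)

# O/S = 15 segments as in the example; R and RTT are assumed values.
print(slow_start_latency(15000, 1000, 1000, 2))            # -> (4, 2, 2, 22.0)
print(slow_start_latency_by_sum(15000, 1000, 1000, 2, 4))  # -> 22.0
```

Both routes give the same latency, which is a useful sanity check on the closed form: the idle times after the first two windows are 2 s and 1 s, and the server does not idle after the third.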


                                                                                                                                                                                                                                                TCP Latency Modeling (3)

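$$\frac{S}{R} + RTT = \text{time from when the server starts to send a segment until the server receives an acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$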
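The last step of the derivation relies on a geometric sum. As a worked sketch (the intermediate line is supplied here and is not on the slide), the sum of the idle times over the first P windows expands as:

```latex
\begin{align*}
\sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]
  &= P\left[\frac{S}{R} + RTT\right] - \frac{S}{R}\sum_{k=1}^{P} 2^{k-1} \\
  &= P\left[RTT + \frac{S}{R}\right] - \left(2^{P} - 1\right)\frac{S}{R}
\end{align*}
```

The second line uses $\sum_{k=1}^{P} 2^{k-1} = 2^{P} - 1$. The $[\cdot]^{+}$ can be dropped inside the sum because the idle time is non-negative for each of the first P windows.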

                                                                                                                                                                                                                                                RTT

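[Figure: client/server timing diagram. The client initiates the TCP connection and requests the object; the server sends the first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R); transmission completes and the object is delivered. Axes: time at client, time at server.]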
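The per-window quantities above can be checked numerically. The following is a minimal sketch, assuming S is the segment size in bits, R the link rate in bits per second and RTT is given in seconds; the function names are illustrative and do not come from the slides.

```python
def window_transmit_time(k, S, R):
    """Time to transmit the k-th slow-start window: 2^(k-1) * S/R seconds."""
    return (2 ** (k - 1)) * S / R


def idle_time_after_window(k, S, R, rtt):
    """Server idle time after the k-th window: [S/R + RTT - 2^(k-1) * S/R]^+."""
    return max(0.0, S / R + rtt - window_transmit_time(k, S, R))


if __name__ == "__main__":
    # Assumed example values: 536-byte segments, 100 kbps link, 100 ms RTT.
    S, R, rtt = 536 * 8, 100_000, 0.1
    for k in range(1, 6):
        print(k, window_transmit_time(k, S, R), idle_time_after_window(k, S, R, rtt))
```

The idle time shrinks as the window doubles and is clamped at zero once a window takes longer to transmit than S/R + RTT.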


TCP Latency Modeling (4)
Recall K = number of windows that cover the object

How do we calculate K?

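$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$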

Calculation of Q, the number of idle periods for an infinite-size object, is similar.
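The derivation of K can be cross-checked by computing it two ways: by summing window sizes directly and by the closed form. The sketch below is illustrative only. O and S are in bits, and the function names are hypothetical. The treatment of Q rests on an assumption not stated on this slide: that Q counts the windows k for which S/R + RTT - 2^(k-1) S/R is non-negative.

```python
import math


def windows_by_summation(O, S):
    """Smallest k such that S * (2^0 + 2^1 + ... + 2^(k-1)) >= O."""
    k, covered = 0, 0
    while covered < O:
        covered += (2 ** k) * S  # the (k+1)-th window carries 2^k segments
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)). Uses floating point, so exact boundaries need care."""
    return math.ceil(math.log2(O / S + 1))


def idle_periods_infinite_object(S, R, rtt):
    """Assumed definition of Q: number of windows k with S/R + RTT - 2^(k-1)*S/R >= 0."""
    k = 0
    while S / R + rtt - (2 ** k) * S / R >= 0:
        k += 1
    return k


if __name__ == "__main__":
    S = 536 * 8  # assumed segment size in bits
    for segments in (1, 3, 15, 16, 100):
        O = segments * S
        print(segments, windows_by_summation(O, S), windows_closed_form(O, S))
```

The summation version avoids floating-point rounding, which makes it a convenient reference when testing the closed form.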


HTTP Modeling
Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
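The three response-time formulas translate directly into code. This is a minimal sketch with hypothetical function names. O is in bits, R in bits per second and RTT in seconds. The sum of idle times is passed in as a parameter, defaulting to zero, because the slide does not expand it here.

```python
def nonpersistent_response_time(O, R, rtt, M, idle=0.0):
    """(M+1)O/R + (M+1)*2RTT + sum of idle times."""
    return (M + 1) * O / R + (M + 1) * 2 * rtt + idle


def persistent_response_time(O, R, rtt, M, idle=0.0):
    """(M+1)O/R + 3RTT + sum of idle times."""
    return (M + 1) * O / R + 3 * rtt + idle


def parallel_nonpersistent_response_time(O, R, rtt, M, X, idle=0.0):
    """(M+1)O/R + (M/X + 1)*2RTT + sum of idle times, assuming M/X is an integer."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * rtt + idle


if __name__ == "__main__":
    # Assumed example values, not taken from the slide.
    O, R, rtt, M, X = 40_000, 1_000_000, 0.1, 10, 5
    print(nonpersistent_response_time(O, R, rtt, M))
    print(persistent_response_time(O, R, rtt, M))
    print(parallel_nonpersistent_response_time(O, R, rtt, M, X))
```

All three share the transmission term (M+1)O/R, so they differ only in how many round trips they pay for.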


HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time are dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time is dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.
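To see why the balance shifts with RTT, the round-trip overhead of each scheme can be set against the shared transmission term (M+1)O/R. The sketch below is a rough illustration with hypothetical names. It assumes 5 Kbytes = 40,000 bits and omits idle times, so under the response-time formulas its totals are lower bounds rather than the plotted values.

```python
def rtt_overhead(rtt, M, X):
    """Round-trip terms of the three response-time formulas, in seconds."""
    return {
        "non-persistent": (M + 1) * 2 * rtt,
        "persistent": 3 * rtt,
        "parallel non-persistent": (M // X + 1) * 2 * rtt,
    }


def transmission_time(O, R, M):
    """Shared transmission term (M+1)O/R, in seconds."""
    return (M + 1) * O / R


if __name__ == "__main__":
    O, M, X = 40_000, 10, 5  # assumed: 5 Kbytes taken as 40,000 bits
    rates = {"28 Kbps": 28_000, "100 Kbps": 100_000,
             "1 Mbps": 1_000_000, "10 Mbps": 10_000_000}
    for rtt in (0.1, 1.0):
        for label, R in rates.items():
            tx = transmission_time(O, R, M)
            overhead = rtt_overhead(rtt, M, X)
            print(f"RTT={rtt}s {label}: transmission={tx:.3f}s", overhead)
```

At low link rates the transmission term is the largest component for every scheme; as the rate or the RTT grows, the round-trip terms take over and the gap between the schemes widens.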


Chapter 3 Summary
principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                                                                                                • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • Transport services and protocols
                                                                                                                                                                                                                                                • Transport vs network layer
                                                                                                                                                                                                                                                • Transport-layer protocols
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                                                                                                                                • Multiplexingdemultiplexing
                                                                                                                                                                                                                                                • How demultiplexing works
                                                                                                                                                                                                                                                • Connectionless demultiplexing
                                                                                                                                                                                                                                                • Connectionless demux (cont)
                                                                                                                                                                                                                                                • Connection-oriented demux
                                                                                                                                                                                                                                                • Connection-oriented demux (cont)
                                                                                                                                                                                                                                                • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • UDP User Datagram Protocol [RFC 768]
                                                                                                                                                                                                                                                • UDP more
                                                                                                                                                                                                                                                • UDP checksum
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • Principles of Reliable data transfer
                                                                                                                                                                                                                                                • Reliable data transfer getting started
                                                                                                                                                                                                                                                • Reliable data transfer getting started
                                                                                                                                                                                                                                                • Incremental Improvements
                                                                                                                                                                                                                                                • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                                                                                                • Rdt20 channel with bit errors
                                                                                                                                                                                                                                                • rdt20 FSM specification
                                                                                                                                                                                                                                                • rdt20 operation with no errors
                                                                                                                                                                                                                                                • rdt20 error scenario
                                                                                                                                                                                                                                                • rdt20 has a fatal flaw
                                                                                                                                                                                                                                                • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                                                                                • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                                                                                • rdt21 discussion
                                                                                                                                                                                                                                                • rdt22 a NAK-free protocol
                                                                                                                                                                                                                                                • rdt22 sender receiver fragments
                                                                                                                                                                                                                                                • rdt30 channels with errors and loss
                                                                                                                                                                                                                                                • rdt30 sender
                                                                                                                                                                                                                                                • rdt30 in action
                                                                                                                                                                                                                                                • rdt30 in action
                                                                                                                                                                                                                                                • Performance of rdt30
                                                                                                                                                                                                                                                • rdt30 stop-and-wait operation
                                                                                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                                                                                • Pipelining increased utilization
                                                                                                                                                                                                                                                • Go-Back-N
                                                                                                                                                                                                                                                • GBN Sender
                                                                                                                                                                                                                                                • GBN sender extended FSM
                                                                                                                                                                                                                                                • GBN receiver extended FSM
                                                                                                                                                                                                                                                • More on receiver
• GBN in action
                                                                                                                                                                                                                                                • Selective Repeat
                                                                                                                                                                                                                                                • Selective repeat sender receiver windows
                                                                                                                                                                                                                                                • Selective repeat
                                                                                                                                                                                                                                                • Selective repeat in action
                                                                                                                                                                                                                                                • Selective repeat dilemma
                                                                                                                                                                                                                                                • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                                                                                • More TCP Details
                                                                                                                                                                                                                                                • Even More TCP Details
                                                                                                                                                                                                                                                • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                • Example RTT estimation
                                                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • TCP reliable data transfer
                                                                                                                                                                                                                                                • TCP sender events
                                                                                                                                                                                                                                                • TCP sender(simplified)
                                                                                                                                                                                                                                                • TCP retransmission scenarios
                                                                                                                                                                                                                                                • TCP retransmission scenarios (more)
                                                                                                                                                                                                                                                • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                                                                • More on Sender Policies
                                                                                                                                                                                                                                                • Fast Retransmit
                                                                                                                                                                                                                                                • Fast retransmit algorithm
                                                                                                                                                                                                                                                • TCP GBN or Selective Repeat
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                                                                                • TCP Flow control how it works
                                                                                                                                                                                                                                                • Technical Issue
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • TCP Connection Management
                                                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                                                • A few special cases
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                                                • Approaches towards congestion control
                                                                                                                                                                                                                                                • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                • TCP Congestion Control
                                                                                                                                                                                                                                                • TCP AIMD
                                                                                                                                                                                                                                                • TCP Slow Start
                                                                                                                                                                                                                                                • TCP Slow Start (more)
                                                                                                                                                                                                                                                • Summary TCP Congestion Control
                                                                                                                                                                                                                                                • The Big Picture
                                                                                                                                                                                                                                                • TCP sender congestion control
                                                                                                                                                                                                                                                • TCP throughput
                                                                                                                                                                                                                                                • TCP Futures
                                                                                                                                                                                                                                                • TCP Fairness
                                                                                                                                                                                                                                                • Why is TCP fair
                                                                                                                                                                                                                                                • Fairness (more)
                                                                                                                                                                                                                                                • TCP Latency Modeling
                                                                                                                                                                                                                                                • Fixed Congestion Window (W)
                                                                                                                                                                                                                                                • Fixed congestion window (1)
                                                                                                                                                                                                                                                • Fixed congestion window (2)
                                                                                                                                                                                                                                                • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                                • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                                • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                • HTTP Modeling
                                                                                                                                                                                                                                                • Chapter 3 Summary

                                                                                                                                                                                                                                                  Fixed congestion window (2)

Second case: WS/R < RTT + S/R: wait for ACK after sending a window's worth of data

$$\text{latency} = 2\,RTT + \frac{O}{R} + (K-1)\left[\frac{S}{R} + RTT - \frac{WS}{R}\right]$$
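
The second-case formula can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name and arguments are hypothetical, K is assumed to be the number of windows needed to cover the object, i.e. ceil(O / (W*S)), and the no-stall result 2RTT + O/R is assumed for the first case.

```python
import math

def fixed_window_latency(O, S, R, RTT, W):
    """Latency in seconds to fetch one object with a fixed window.

    O: object size (bits), S: segment size (bits), R: link rate (bits/s),
    RTT: round-trip time (s), W: window size (segments).
    """
    # Assumption: K is the number of windows that cover the object.
    K = math.ceil(O / (W * S))
    # Time the server waits for an ACK after sending a full window.
    stall = S / R + RTT - W * S / R
    base = 2 * RTT + O / R
    if stall <= 0:
        # Assumed first case: the ACK arrives before the window is used up.
        return base
    # Second case: the server stalls after each of the first K-1 windows.
    return base + (K - 1) * stall

# Illustrative values: S/R = 1 s, RTT = 2 s, W = 2, object of 8 segments.
print(fixed_window_latency(8000, 1000, 1000, 2, 2))
```

With these illustrative values WS/R = 2 s is less than RTT + S/R = 3 s, so the second case applies: K = 4 and the server stalls three times for 1 s each.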

                                                                                                                                                                                                                                                  TCP Latency Modeling Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events)

                                                                                                                                                                                                                                                  Will show that the delay for one object is

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\; K-1\}$$

- where Q is the number of times the server would idle if the object were of infinite size

- and K is the number of windows that cover the object
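
As a rough illustration of how K, Q and P feed into the slow-start latency, here is a minimal sketch. The function name is hypothetical. Two assumptions are not stated on this slide: windows grow as 1, 2, 4, ... segments, and the server idles after the k-th window whenever S/R + RTT - 2^(k-1) S/R is non-negative.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for one object of O bits.

    S: segment size (bits), R: link rate (bits/s), RTT: round-trip time (s).
    """
    # K: smallest number of windows 1, 2, 4, ... that cover the object.
    K, sent = 0, 0
    while sent * S < O:
        sent += 2 ** K
        K += 1
    # Q: number of windows after which the server would idle
    # if the object were of infinite size (assumed stall condition).
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R >= 0:
        Q += 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# Illustrative values: 15 segments, S/R = 1 s, RTT = 2 s.
print(slow_start_latency(15000, 1000, 1000, 2))
```

The loops avoid logarithms so that K and Q stay exact integers for small inputs.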

                                                                                                                                                                                                                                                  TCP Latency Modeling Slow Start (2)

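[Figure: slow-start timing diagram. The client initiates the TCP connection (one RTT) and requests the object; the server sends a first window (S/R), second window (2S/R), third window (4S/R) and fourth window (8S/R); transmission completes and the object is delivered. Axes: time at client, time at server.]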

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times
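
The example can be replayed window by window. This is a minimal sketch under stated assumptions, not the slides' own code: the function name is hypothetical, the window doubles after every window, and the next window starts either when the current one finishes transmitting or when the ACK for its first segment arrives (S/R + RTT after the window started), whichever is later. The values S/R = 1 s and RTT = 2 s are illustrative, chosen so that Q = 2.

```python
def simulate_slow_start(n_segments, S, R, RTT):
    """Return (windows, idle_periods, latency) for an object of n_segments."""
    t = 2 * RTT                      # connection establishment + request
    windows, idle_periods = [], 0
    remaining, w = n_segments, 1
    while remaining > 0:
        size = min(w, remaining)
        windows.append(size)
        start = t
        t += size * S / R            # time to transmit this window
        remaining -= size
        if remaining > 0:
            # ACK for the first segment of this window reaches the server.
            ack_time = start + S / R + RTT
            if ack_time > t:         # server has nothing it may send: idle
                idle_periods += 1
                t = ack_time
        w *= 2                       # slow start doubles the window
    return windows, idle_periods, t

print(simulate_slow_start(15, 1000, 1000, 2))
```

For 15 segments the simulated windows are 1, 2, 4 and 8 segments (K = 4) and the server idles twice, matching P = 2 in the example.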

                                                                                                                                                                                                                                                  TCP Latency Modeling (3)

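S/R + RTT = time from when the server starts to send a segment until the server receives an acknowledgement

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}
$$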

                                                                                                                                                                                                                                                  1

                                                                                                                                                                                                                                                  1

                                                                                                                                                                                                                                                  1

                                                                                                                                                                                                                                                  minusminus+++=

                                                                                                                                                                                                                                                  minus+++=

                                                                                                                                                                                                                                                  ++=

                                                                                                                                                                                                                                                  minus

                                                                                                                                                                                                                                                  =

                                                                                                                                                                                                                                                  =

                                                                                                                                                                                                                                                  sum

                                                                                                                                                                                                                                                  sum

                                                                                                                                                                                                                                                  th window after the timeidle 2 1 kRSRTT

                                                                                                                                                                                                                                                  RS k =⎥⎦

                                                                                                                                                                                                                                                  ⎤⎢⎣⎡ minus+

                                                                                                                                                                                                                                                  +minus

                                                                                                                                                                                                                                                  window kth the transmit totime2 1 =minus

                                                                                                                                                                                                                                                  RSk

                                                                                                                                                                                                                                                  RTT

                                                                                                                                                                                                                                                  initiate TCPconnection

                                                                                                                                                                                                                                                  requestobject

                                                                                                                                                                                                                                                  first window= SR

                                                                                                                                                                                                                                                  second window= 2SR

                                                                                                                                                                                                                                                  third window= 4SR

                                                                                                                                                                                                                                                  fourth window= 8SR

                                                                                                                                                                                                                                                  completetransmissionobject

                                                                                                                                                                                                                                                  delivered

                                                                                                                                                                                                                                                  time atclient

                                                                                                                                                                                                                                                  time atserver
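As a rough illustration of the delay expression on this slide, the Python sketch below adds up the idle times window by window instead of using the closed form. The function names and the sample numbers are hypothetical and not taken from the slides; the sketch assumes the server can only stall after windows 1 to K-1, where K is the number of windows that cover the object, and it clamps negative idle times to zero.

```python
def windows_needed(O, S):
    """Count slow-start windows (1, 2, 4, ... segments) needed to cover O bits."""
    k, covered = 0, 0
    while covered < O:
        covered += 2 ** k * S   # window number k+1 carries 2^k segments of S bits
        k += 1
    return k


def slow_start_latency(O, S, R, RTT):
    """Latency in seconds: O/R + 2*RTT + sum of idle times.

    O: object size (bits), S: segment size (bits),
    R: link rate (bits/sec), RTT: round-trip time (sec).
    """
    K = windows_needed(O, S)
    idle = 0.0
    for k in range(1, K):       # no stall is counted after the last window
        stall = S / R + RTT - 2 ** (k - 1) * S / R
        idle += max(0.0, stall)  # the [x]^+ clamp
    return O / R + 2 * RTT + idle


# Hypothetical numbers: 15 segments of 1000 bits, 1000 bits/sec link, RTT = 2 sec.
print(slow_start_latency(15000, 1000, 1000, 2))
```

With these numbers the server stalls after the first and second windows only, so the directly summed idle time matches the closed form evaluated with P = 2.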


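TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.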
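The derivation of K can be checked numerically. The sketch below compares the closed form with a direct search for the smallest k satisfying (2^k - 1)S >= O. Both function names and the sample sizes are hypothetical; the closed form relies on floating-point log2, so for very large or non-integer O/S the integer search is the safer reference.

```python
import math


def k_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), as in the last line of the derivation."""
    return math.ceil(math.log2(O / S + 1))


def k_by_search(O, S):
    """Smallest k with (2^k - 1) * S >= O, using integer arithmetic only."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


# Hypothetical object sizes, expressed as a number of 1000-bit segments.
for segments in (1, 3, 4, 15, 16):
    O, S = segments * 1000, 1000
    print(segments, k_closed_form(O, S), k_by_search(O, S))
```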


HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
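The three response-time expressions can be compared side by side with a small Python sketch. The function name, the dictionary keys and the sample numbers are hypothetical. The slides leave the "sum of idle times" unexpanded, so the sketch takes it as a single optional argument (default 0); in general that sum may differ between the three cases.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response time (sec) for the three HTTP variants in the model.

    O: object size (bits), R: link rate (bits/sec), RTT: round-trip time (sec),
    M: number of images, X: number of parallel connections,
    idle: assumed sum of slow-start idle times (sec), supplied by the caller.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }


# Hypothetical numbers chosen for easy arithmetic, idle times ignored.
for name, t in http_response_times(O=8000, R=1000, RTT=0.5, M=10, X=5).items():
    print(f"{name}: {t} sec")
```

The transmission term (M+1)O/R is the same in all three cases, so the variants differ only in how many round trips they pay for.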


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 20 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps for non-persistent, persistent and parallel non-persistent HTTP.]

For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in high delay·bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                                                                                                  • Chapter 3 Transport Layer last revised 160305
                                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                                  • Transport services and protocols
                                                                                                                                                                                                                                                  • Transport vs network layer
                                                                                                                                                                                                                                                  • Transport-layer protocols
                                                                                                                                                                                                                                                  • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                                                                                                                                  • How demultiplexing works
                                                                                                                                                                                                                                                  • Connectionless demultiplexing
                                                                                                                                                                                                                                                  • Connectionless demux (cont)
                                                                                                                                                                                                                                                  • Connection-oriented demux
                                                                                                                                                                                                                                                  • Connection-oriented demux (cont)
                                                                                                                                                                                                                                                  • Connection-oriented demux Threaded Web Server
                                                                                                                                                                                                                                                  • Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
                                                                                                                                                                                                                                                  • UDP checksum
                                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                                  • Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
                                                                                                                                                                                                                                                  • Go-Back-N
                                                                                                                                                                                                                                                  • GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
                                                                                                                                                                                                                                                  • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                                                                                  • More TCP Details
                                                                                                                                                                                                                                                  • Even More TCP Details
                                                                                                                                                                                                                                                  • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                  • Example RTT estimation
                                                                                                                                                                                                                                                  • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                                  • TCP reliable data transfer
                                                                                                                                                                                                                                                  • TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
                                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                                  • TCP Flow Control
                                                                                                                                                                                                                                                  • TCP Flow Control
                                                                                                                                                                                                                                                  • TCP segment structure
• TCP Flow control: how it works
                                                                                                                                                                                                                                                  • Technical Issue
                                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                                  • TCP Connection Management
                                                                                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                                                                                  • TCP Connection Management (cont)
                                                                                                                                                                                                                                                  • A few special cases
                                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                                  • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
                                                                                                                                                                                                                                                  • Chapter 3 outline
                                                                                                                                                                                                                                                  • TCP Congestion Control
                                                                                                                                                                                                                                                  • TCP AIMD
                                                                                                                                                                                                                                                  • TCP Slow Start
                                                                                                                                                                                                                                                  • TCP Slow Start (more)
• Summary: TCP Congestion Control
                                                                                                                                                                                                                                                  • The Big Picture
                                                                                                                                                                                                                                                  • TCP sender congestion control
                                                                                                                                                                                                                                                  • TCP throughput
                                                                                                                                                                                                                                                  • TCP Futures
                                                                                                                                                                                                                                                  • TCP Fairness
• Why is TCP fair?
                                                                                                                                                                                                                                                  • Fairness (more)
                                                                                                                                                                                                                                                  • TCP Latency Modeling
                                                                                                                                                                                                                                                  • Fixed Congestion Window (W)
                                                                                                                                                                                                                                                  • Fixed congestion window (1)
                                                                                                                                                                                                                                                  • Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
                                                                                                                                                                                                                                                  • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                  • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                  • HTTP Modeling
                                                                                                                                                                                                                                                  • Chapter 3 Summary


TCP Latency Modeling: Slow Start (1)

Now suppose the window grows according to slow start (with no threshold and no loss events).

We will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - \left(2^{P} - 1\right)\frac{S}{R}$$

where P is the number of times TCP idles at the server:

$$P = \min\{Q,\; K - 1\}$$

- where Q is the number of times the server would idle if the object were of infinite size

- and K is the number of windows that cover the object
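The formula above can be checked numerically with a short sketch. The function name and argument order below are illustrative, not part of the slides. It assumes O is the object size in bits, S the segment size in bits, R the link rate in bits per second, and RTT the round-trip time in seconds. It also assumes the usual slow-start picture: window k carries 2^(k-1) segments, and the server idles after window k whenever S/R + RTT exceeds the time needed to transmit that window, 2^(k-1) S/R. K and Q are found by direct counting rather than by closed-form expressions.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, P, Q, K) for the slow-start latency model.

    Assumed units (not stated on this slide): O and S in bits,
    R in bits per second, RTT in seconds.
    """
    segments = O / S

    # K: number of windows (1, 2, 4, ... segments) needed to cover the
    # object, i.e. the smallest K with 2^K - 1 >= O/S.
    K = 1
    while (2 ** K - 1) < segments:
        K += 1

    # Q: number of times the server would idle for an infinitely large
    # object. After window k it idles if S/R + RTT > 2^(k-1) * S/R.
    Q = 0
    k = 1
    while S / R + RTT - (2 ** (k - 1)) * S / R > 0:
        Q += 1
        k += 1

    # P: number of idle periods that actually occur for this object.
    P = min(Q, K - 1)

    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, P, Q, K
```

For example, `slow_start_latency(15000, 1000, 1000, 2.0)` describes a 15-segment object with S/R = 1 s and RTT = 2 s. The object is covered by K = 4 windows (1 + 2 + 4 + 8 segments), the server idles after the first two windows (Q = 2, so P = 2), and the latency is 4 + 15 + 6 - 3 = 22 seconds.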


TCP Latency Modeling: Slow Start (2)

[Figure: client/server timing diagram for slow start. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]

Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2

Server idles P = 2 times

Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit object
- time server idles due to slow start

Server idles P = min{K-1, Q} times


                                                                                                                                                                                                                                                    TCP Latency Modeling (3)

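$\frac{S}{R} + RTT$ = time from when the server starts to send a segment until the server receives the acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}
$$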

[Figure: client/server timing diagram for slow start. Labels: RTT; initiate TCP connection; request object; first window = S/R; second window = 2S/R; third window = 4S/R; fourth window = 8S/R; complete transmission; object delivered. Axes: time at client, time at server.]
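The final step of the delay derivation relies on a geometric sum. As a sketch of the algebra: for k ≤ P ≤ Q every bracketed idle time is positive, so the positive-part operator can be dropped and the sum splits into a constant part and a geometric part.

```latex
\begin{aligned}
\sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]
  &= P\left[RTT + \frac{S}{R}\right] - \frac{S}{R}\sum_{k=1}^{P} 2^{k-1} \\
  &= P\left[RTT + \frac{S}{R}\right] - (2^{P}-1)\frac{S}{R}
\end{aligned}
```

The second line uses $\sum_{k=1}^{P} 2^{k-1} = 2^{P} - 1$.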


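TCP Latency Modeling (4)

Recall K = number of windows that cover the object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$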

Calculation of Q, the number of idles for an infinite-size object, is similar.
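The quantities K, Q and P and the resulting latency can be checked numerically. The following is a minimal Python sketch, not part of the original slides. The function names are hypothetical, the input values are chosen only so that O/S = 15 segments, and Q is found by counting windows with a strictly positive idle time (an assumption about how a zero-length idle is treated).

```python
def windows_to_cover(O, S):
    """K: smallest k with (2^k - 1) * S >= O, found by direct search."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def idles_if_infinite(S, R, RTT):
    """Q: number of windows after which the server would idle
    if the object were of infinite size.

    The server idles after window k while
    S/R + RTT - 2^(k-1) * S/R > 0 (strict inequality is an assumption).
    """
    q = 0
    # q counts windows already known to be followed by an idle period;
    # the next window to examine is k = q + 1, hence the factor 2^q.
    while S / R + RTT - 2 ** q * (S / R) > 0:
        q += 1
    return q


def slow_start_latency(O, S, R, RTT):
    """Return (K, Q, P, latency) for the slow-start latency model."""
    K = windows_to_cover(O, S)
    Q = idles_if_infinite(S, R, RTT)
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * (S / R)
    return K, Q, P, latency


# Hypothetical inputs: O/S = 15 segments, S/R = 1 s, RTT = 2 s.
K, Q, P, latency = slow_start_latency(O=15_000, S=1_000, R=1_000, RTT=2.0)
print(K, Q, P, latency)
```

With these inputs the sketch gives K = 4, Q = 2 and P = 2, matching the example with O/S = 15 segments, and a latency of 22 seconds (4 s of round trips, 15 s of transmission, 3 s of idle time).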


HTTP Modeling

Assume Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)

Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
- Suppose M/X is an integer
- 1 TCP connection for base file
- M/X sets of parallel connections for images
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
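The three response-time formulas can be compared directly. Below is a minimal Python sketch under stated assumptions: the function name and the `idle` parameter are hypothetical, the sum of idle times is supplied by the caller (defaulting to zero) and is applied equally to all three cases as a simplification, and the numbers are illustrative only.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Return (non_persistent, persistent, parallel) response times in seconds.

    O: object size in bits, R: link rate in bits/s, RTT: seconds,
    M: number of images, X: number of parallel connections.
    idle: sum of slow-start idle times, supplied by the caller (assumption).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R                      # (M+1) O/R
    non_persistent = transmit + (M + 1) * 2 * RTT + idle
    persistent = transmit + 3 * RTT + idle
    parallel = transmit + (M // X + 1) * 2 * RTT + idle
    return non_persistent, persistent, parallel


# Illustrative values: O = 5 Kbytes taken as 40,000 bits, R = 1 Mbps,
# RTT = 100 ms, M = 10 images, X = 5 parallel connections.
non_persistent, persistent, parallel = http_response_times(
    O=40_000, R=1_000_000, RTT=0.1, M=10, X=5
)
print(non_persistent, persistent, parallel)
```

Ignoring idle times, these inputs give roughly 2.64 s for non-persistent, 0.74 s for persistent and 1.04 s for parallel non-persistent HTTP. The only difference between the three cases is the number of round trips.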


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time in seconds (axis 0 to 20) at 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent HTTP.]

For low bandwidth, connection & response time is dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


                                                                                                                                                                                                                                                    HTTP Response time (in seconds)

                                                                                                                                                                                                                                                    0

                                                                                                                                                                                                                                                    10

                                                                                                                                                                                                                                                    20

                                                                                                                                                                                                                                                    30

[Figure: response time (vertical axis) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent connections]

RTT = 1 sec, O = 5 Kbytes, M = 10, and X = 5

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
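To make the comparison concrete, here is a minimal Python sketch of the three response-time estimates for a page with one base object and M images. It is a simplification under stated assumptions: it ignores the slow-start stall time entirely, treats 1 Kbyte as 1000 bytes, and rounds M/X up to a whole number of rounds. The function name `http_response_times` and its parameter names are hypothetical, chosen only for this illustration.

```python
import math


def http_response_times(rate_bps, rtt_s=1.0, object_bits=5 * 1000 * 8,
                        num_images=10, parallel=5):
    """Return (non_persistent, persistent, parallel_non_persistent) in seconds.

    Simplified model: slow-start stall time is NOT included (assumption).
    """
    # Time to transmit the base page plus M images: (M + 1) * O / R.
    transmit = (num_images + 1) * object_bits / rate_bps

    # Non-persistent: M + 1 connections in series, each costing 2 RTT
    # (one RTT for the handshake, one for the request/response).
    non_persistent = transmit + (num_images + 1) * 2 * rtt_s

    # Persistent with pipelining: 2 RTT for the base page, then 1 RTT
    # to request all M images over the same connection.
    persistent = transmit + 3 * rtt_s

    # Parallel non-persistent: base page first, then ceil(M / X) rounds
    # of X parallel connections, each round costing 2 RTT.
    rounds = math.ceil(num_images / parallel)
    parallel_np = transmit + (rounds + 1) * 2 * rtt_s

    return non_persistent, persistent, parallel_np


if __name__ == "__main__":
    for rate in (28e3, 100e3, 1e6, 10e6):
        np_t, p_t, par_t = http_response_times(rate)
        print(f"R = {rate:>10.0f} bps: non-persistent {np_t:6.2f} s, "
              f"persistent {p_t:6.2f} s, parallel {par_t:6.2f} s")
```

With RTT = 1 sec the RTT terms (22, 3, and 6 seconds respectively) quickly outweigh the transmission term as the link rate grows, which is why the persistent connection comes out ahead in this setting.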


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP: retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
• Chapter 3 outline
• TCP Congestion Control
• TCP AIMD
• TCP Slow Start
• TCP Slow Start (more)
• Summary: TCP Congestion Control
• The Big Picture
• TCP sender congestion control
• TCP throughput
• TCP Futures
• TCP Fairness
• Why is TCP fair?
• Fairness (more)
• TCP Latency Modeling
• Fixed Congestion Window (W)
• Fixed congestion window (1)
• Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
                                                                                                                                                                                                                                                    • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                                    • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                    • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                    • HTTP Modeling
                                                                                                                                                                                                                                                    • Chapter 3 Summary

                                                                                                                                                                                                                                                      TCP Latency Modeling Slow Start (2)

[Figure: slow-start timing diagram with time axes at the client and at the server. Labels: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2

Server idles P = 2 times.

Delay components:
• 2 RTT for connection establishment and request
• O/R to transmit object
• time server idles due to slow start

Server idles P = min{K-1, Q} times.

                                                                                                                                                                                                                                                      TCP Latency Modeling (3)

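$S/R + RTT$ = time from when the server starts to send a segment until the server receives an acknowledgement

$2^{k-1} S/R$ = time to transmit the $k$th window

$\left[ S/R + RTT - 2^{k-1} S/R \right]^{+}$ = idle time after the $k$th window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right] \\
&= \frac{O}{R} + 2\,RTT + P\left[ RTT + \frac{S}{R} \right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: slow-start timing diagram with time axes at the client and at the server. Labels: initiate TCP connection (RTT), request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]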
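
As a quick numerical check of the delay formula, the following Python sketch sums the idle times window by window and compares the result with the closed form. The function names, the choice to pass K in as an argument, and the sample values (S/R = 1 s and RTT = 2 s, picked so that O/S = 15, K = 4 and Q = 2) are illustrative assumptions and do not come from the slides.

```python
def idle_time(k, S, R, RTT):
    """Idle time after the k-th window: [S/R + RTT - 2^(k-1) S/R]^+."""
    return max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)


def delay_by_summation(O, S, R, RTT, K):
    """2 RTT + O/R + idle times after windows 1..K-1 (no idle after the last window)."""
    idle = sum(idle_time(k, S, R, RTT) for k in range(1, K))
    return 2 * RTT + O / R + idle


def delay_closed_form(O, S, R, RTT, K):
    """Closed form, with P counted as the windows 1..K-1 followed by a positive idle time."""
    P = sum(1 for k in range(1, K) if idle_time(k, S, R, RTT) > 0)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


# Assumed sample values: S/R = 1 s, RTT = 2 s, O/S = 15 segments, K = 4 windows.
S, R, RTT = 1000.0, 1000.0, 2.0
O = 15 * S
print(delay_by_summation(O, S, R, RTT, K=4))  # 22.0
print(delay_closed_form(O, S, R, RTT, K=4))   # 22.0
```

Because the idle time shrinks as k grows, counting the positive idle times among the first K-1 windows gives the same P as min{K-1, Q}. In this example the server idles for 2 s and then 1 s, so the 3 s of idle time is added to 2 RTT = 4 s and O/R = 15 s.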

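TCP Latency Modeling (4)

Recall K = number of windows that cover the object.

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, the number of idles for an infinite-size object, is similar.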
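
The closed form for K can be checked against a direct search for the smallest k whose windows cover the object. This is a minimal Python sketch: the function names and the sample object sizes are assumptions made for illustration, and the closed form relies on floating-point `log2`, which may need care for very large O/S.

```python
import math


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(O / S + 1))


def windows_by_search(O, S):
    """Smallest k with 2^0 S + 2^1 S + ... + 2^(k-1) S >= O."""
    k, covered, window = 0, 0, S
    while covered < O:
        covered += window  # send the next window
        window *= 2        # slow start doubles the window each round
        k += 1
    return k


# Assumed sample sizes, in segments of S = 1000 bits each.
for segments in (1, 2, 3, 15, 16):
    O, S = segments * 1000, 1000
    print(segments, windows_closed_form(O, S), windows_by_search(O, S))
```

For O/S = 15 both functions give K = 4, since 1 + 2 + 4 + 8 = 15 segments are covered by four windows. One more segment pushes K to 5.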

HTTP Modeling

Assume Web page consists of:
• 1 base HTML page (of size O bits)
• M images (each of size O bits)

Non-persistent HTTP:
• M+1 TCP connections in series
• Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
• 2 RTT to request and receive base HTML file
• 1 RTT to request and receive M images
• Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
• Suppose M/X integer
• 1 TCP connection for base file
• M/X sets of parallel connections for images
• Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
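
The three response-time formulas can be compared side by side with a short Python sketch. The function name, the single `idle` parameter (the sum of slow-start idle times, taken as given and set to zero here), and the sample values are assumptions for illustration. With `idle = 0` the results are lower bounds on the modeled response times.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Return (non_persistent, persistent, parallel) response times in seconds.

    O: object size in bits, R: link rate in bits/s, RTT in seconds,
    M: number of images, X: parallel connections (M/X assumed integer),
    idle: sum of slow-start idle times, supplied by the caller (0 by default).
    """
    transmit = (M + 1) * O / R
    non_persistent = transmit + (M + 1) * 2 * RTT + idle
    persistent = transmit + 3 * RTT + idle
    parallel = transmit + (M / X + 1) * 2 * RTT + idle
    return non_persistent, persistent, parallel


# Assumed sample values: O = 5 Kbytes taken as 40,000 bits, RTT = 100 ms, M = 10, X = 5.
for R in (28e3, 100e3, 1e6, 10e6):
    non_pers, pers, par = http_response_times(O=40_000, R=R, RTT=0.1, M=10, X=5)
    print(f"R = {R:>10.0f} bps: non-persistent {non_pers:.2f} s, "
          f"persistent {pers:.2f} s, parallel {par:.2f} s")
```

All three variants share the same transmission term (M+1)O/R, so they differ only in the number of round trips: 2(M+1), 3, and 2(M/X + 1) respectively.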

HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: chart of response time in seconds (vertical axis, 0 to 20) for non-persistent, persistent, and parallel non-persistent HTTP at rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps.]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections only give minor improvement over parallel connections.


HTTP response time (in seconds): RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time (y-axis, 0 to 70 seconds) for non-persistent, persistent, and parallel non-persistent connections at 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
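The trend in the two response-time plots can be roughly reproduced with a short calculation. The sketch below is a simplification, not the model behind the plots: it ignores slow-start stalls entirely, so its absolute numbers will be lower than the plotted ones. It assumes O = 5 Kbytes means 5000 bytes, that fetching a page costs the transmission time of M+1 objects plus a fixed number of round trips (2 RTT per object for non-persistent, 3 RTT in total for persistent with pipelining, 2 RTT per batch of X for parallel non-persistent), and that M/X is treated as a plain ratio. The function name response_times is hypothetical.

```python
def response_times(rate_bps, rtt_s, obj_bytes=5_000, m=10, x=5):
    """Return (non_persistent, persistent, parallel) response times in seconds.

    Simplified model: slow-start stalls are ignored (an assumption),
    so only transmission time and round-trip overhead are counted.
    """
    # Time to push the base page plus M referenced objects onto the link.
    transmit = (m + 1) * obj_bytes * 8 / rate_bps
    # Non-persistent: every object pays connection setup + request (2 RTT).
    non_persistent = transmit + (m + 1) * 2 * rtt_s
    # Persistent with pipelining: 2 RTT for the base page, 1 RTT for the rest.
    persistent = transmit + 3 * rtt_s
    # Parallel non-persistent: base page, then M/X batches of X connections.
    parallel = transmit + (m / x + 1) * 2 * rtt_s
    return non_persistent, persistent, parallel


if __name__ == "__main__":
    for rtt in (0.1, 1.0):
        print(f"RTT = {rtt} s")
        for rate in (28e3, 100e3, 1e6, 10e6):
            non_p, pers, par = response_times(rate, rtt)
            print(f"  {rate / 1e3:>7.0f} Kbps: non-persistent {non_p:6.2f}  "
                  f"persistent {pers:6.2f}  parallel {par:6.2f}")
```

Even in this reduced form, the round-trip terms are negligible next to the transmission term at 28 Kbps with a 100 msec RTT, while at 1 sec RTT and high bandwidth they account for almost all of the response time.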


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
                                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                                      • Transport services and protocols
                                                                                                                                                                                                                                                      • Transport vs network layer
                                                                                                                                                                                                                                                      • Transport-layer protocols
                                                                                                                                                                                                                                                      • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                                                                                                                                      • How demultiplexing works
                                                                                                                                                                                                                                                      • Connectionless demultiplexing
                                                                                                                                                                                                                                                      • Connectionless demux (cont)
                                                                                                                                                                                                                                                      • Connection-oriented demux
                                                                                                                                                                                                                                                      • Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
                                                                                                                                                                                                                                                      • Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
                                                                                                                                                                                                                                                      • UDP checksum
                                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                                      • Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
                                                                                                                                                                                                                                                      • Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                                                                                                                                                                                      • Pipelined protocols
                                                                                                                                                                                                                                                      • Pipelined protocols
• Pipelining: increased utilization
                                                                                                                                                                                                                                                      • Go-Back-N
                                                                                                                                                                                                                                                      • GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
                                                                                                                                                                                                                                                      • Selective repeat
                                                                                                                                                                                                                                                      • Selective repeat in action
• Selective repeat: dilemma
                                                                                                                                                                                                                                                      • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                                                                                      • More TCP Details
                                                                                                                                                                                                                                                      • Even More TCP Details
                                                                                                                                                                                                                                                      • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                      • Example RTT estimation
                                                                                                                                                                                                                                                      • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                                      • TCP reliable data transfer
                                                                                                                                                                                                                                                      • TCP sender events
• TCP sender (simplified)
                                                                                                                                                                                                                                                      • TCP retransmission scenarios
                                                                                                                                                                                                                                                      • TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                                                                                                      • More on Sender Policies
                                                                                                                                                                                                                                                      • Fast Retransmit
                                                                                                                                                                                                                                                      • Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
                                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                                      • TCP Flow Control
                                                                                                                                                                                                                                                      • TCP Flow Control
                                                                                                                                                                                                                                                      • TCP segment structure
• TCP Flow control: how it works
                                                                                                                                                                                                                                                      • Technical Issue
                                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                                      • TCP Connection Management
                                                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                                                      • TCP Connection Management (cont)
                                                                                                                                                                                                                                                      • A few special cases
                                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                                      • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
                                                                                                                                                                                                                                                      • Approaches towards congestion control
                                                                                                                                                                                                                                                      • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                      • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                      • Chapter 3 outline
                                                                                                                                                                                                                                                      • TCP Congestion Control
                                                                                                                                                                                                                                                      • TCP AIMD
                                                                                                                                                                                                                                                      • TCP Slow Start
                                                                                                                                                                                                                                                      • TCP Slow Start (more)
                                                                                                                                                                                                                                                      • Summary TCP Congestion Control
                                                                                                                                                                                                                                                      • The Big Picture
                                                                                                                                                                                                                                                      • TCP sender congestion control
                                                                                                                                                                                                                                                      • TCP throughput
                                                                                                                                                                                                                                                      • TCP Futures
                                                                                                                                                                                                                                                      • TCP Fairness
                                                                                                                                                                                                                                                      • Why is TCP fair?
                                                                                                                                                                                                                                                      • Fairness (more)
                                                                                                                                                                                                                                                      • TCP Latency Modeling
                                                                                                                                                                                                                                                      • Fixed Congestion Window (W)
                                                                                                                                                                                                                                                      • Fixed congestion window (1)
                                                                                                                                                                                                                                                      • Fixed congestion window (2)
                                                                                                                                                                                                                                                      • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                                      • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                                      • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                      • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                      • HTTP Modeling
                                                                                                                                                                                                                                                      • Chapter 3 Summary


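TCP Latency Modeling (3)

$S/R + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1} S/R$ = time to transmit the $k$th window

$\left[ S/R + RTT - 2^{k-1} S/R \right]^{+}$ = idle time after the $k$th window

$$
\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
             &= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right] \\
             &= \frac{O}{R} + 2\,RTT + P\left[ RTT + \frac{S}{R} \right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
$$

[Figure: timing diagram with "time at client" and "time at server" axes, showing: initiate TCP connection, request object, RTT, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered.]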
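The idle-time expression above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `slow_start_latency` and the example values are hypothetical, and it assumes the server can idle only after the first K-1 windows (none after the last one), summing the $[\cdot]^{+}$ term directly rather than using P.

```python
def slow_start_latency(O, S, R, RTT):
    """Delay to fetch an O-bit object under the static slow-start model.

    O: object size (bits), S: segment size (bits),
    R: link rate (bits/s), RTT: round-trip time (s).
    Returns (delay, K, total_idle).
    """
    # K = number of windows covering the object:
    # smallest k with (2^k - 1) * S >= O.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1

    # Idle time after window k is [S/R + RTT - 2^(k-1) * S/R]^+.
    # Assumption: no idling after the last window, so k runs over 1..K-1.
    total_idle = sum(
        max(0.0, S / R + RTT - 2 ** (k - 1) * S / R) for k in range(1, K)
    )

    # delay = O/R + 2 RTT + sum of idle times
    return O / R + 2 * RTT + total_idle, K, total_idle


if __name__ == "__main__":
    # Hypothetical values: 15 segments of 1000 bits, 10 kbit/s link, 250 ms RTT.
    delay, K, idle = slow_start_latency(O=15000, S=1000, R=10000, RTT=0.25)
    print(f"K = {K}, idle = {idle:.3f} s, delay = {delay:.3f} s")
```

With these values the server idles after the first two windows only, so the result should agree with the closed form for P = 2.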


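TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^{0}S + 2^{1}S + \cdots + 2^{k-1}S \ge O\} \\
  &= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^{k} - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.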
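As a quick sanity check on the derivation of K, the sketch below compares a direct search over window counts with the closed-form ceiling expression. The function names are illustrative only, and the closed form relies on floating-point `log2`, so it is a check for moderate values of O/S rather than a robust implementation.

```python
import math


def windows_by_search(O, S):
    """Smallest k such that 2^0*S + 2^1*S + ... + 2^(k-1)*S >= O."""
    k = 1
    while (2 ** k - 1) * S < O:  # the geometric sum equals (2^k - 1) * S
        k += 1
    return k


def windows_closed_form(O, S):
    """K = ceil(log2(O/S + 1)), the last line of the derivation."""
    return math.ceil(math.log2(O / S + 1))


if __name__ == "__main__":
    S = 1000  # hypothetical segment size in bits
    for O in (1000, 3000, 3001, 15000, 100000):
        print(O, windows_by_search(O, S), windows_closed_form(O, S))
```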


HTTP Modeling

Assume Web page consists of:
    1 base HTML page (of size O bits)
    M images (each of size O bits)

Non-persistent HTTP:
    M+1 TCP connections in series
    Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
    2 RTT to request and receive base HTML file
    1 RTT to request and receive M images
    Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
    Suppose M/X integer
    1 TCP connection for base file
    M/X sets of parallel connections for images
    Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
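The three response-time formulas can be compared side by side. This is a rough sketch under stated assumptions: it evaluates only the transmission and RTT terms and leaves out the "sum of idle times", which differs per scheme; the function name is hypothetical; and the example takes 1 Kbyte as 1000 bytes with a 28 Kbps link, using the RTT = 100 msec, O = 5 Kbytes, M = 10, X = 5 values quoted with the response-time chart.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, excluding slow-start idle times.

    O: object size (bits), R: link rate (bits/s), RTT: round-trip time (s),
    M: number of images, X: number of parallel connections (M/X an integer).
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")

    transmit = (M + 1) * O / R  # (M+1) objects of O bits each

    return {
        # M+1 connections in series, 2 RTT each
        "non-persistent": transmit + (M + 1) * 2 * RTT,
        # 2 RTT for the base file, 1 RTT for all M images
        "persistent": transmit + 3 * RTT,
        # 1 connection for the base file, then M/X parallel batches
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT,
    }


if __name__ == "__main__":
    times = http_response_times(O=5 * 1000 * 8, R=28_000, RTT=0.1, M=10, X=5)
    for scheme, seconds in times.items():
        print(f"{scheme}: {seconds:.2f} s")
```

At this link rate the transmission term (M+1)O/R is far larger than any of the RTT terms, so the three schemes differ only slightly.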


[Figure: HTTP response time (in seconds) at link bandwidths of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent connections. RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5.]

For low bandwidth, connection & response time is dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


[Figure: HTTP response time (in seconds) at link bandwidths of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent, and parallel non-persistent connections. RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5.]

For larger RTT, response time is dominated by TCP establishment & slow start delays. Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.
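The two regimes (transmission-dominated at low bandwidth, RTT-dominated at large RTT) can be reproduced with a small calculation. The sketch below assumes a simplified model in which a page is one base file plus M objects, each of O bytes, and it deliberately ignores the slow-start idle times, so its numbers are lower bounds and not the plotted values. The function name `http_response_times`, the reading of 1 Kbyte as 1000 bytes, and the requirement that M/X be an integer are assumptions made for illustration.

```python
def http_response_times(rate_bps, rtt_s, obj_bytes=5000, m=10, x=5):
    """Approximate HTTP response times in seconds, ignoring slow-start stalls.

    Assumed simplified model (not taken from this section):
      non-persistent:          (M+1)*O/R + (M+1)*2*RTT
      persistent:              (M+1)*O/R + 3*RTT
      parallel non-persistent: (M+1)*O/R + (M/X + 1)*2*RTT, M/X an integer
    """
    if m % x != 0:
        raise ValueError("this sketch assumes M/X is an integer")
    # Time to push the base file and the M objects onto the link.
    transmit = (m + 1) * obj_bytes * 8 / rate_bps
    return {
        "non-persistent": transmit + (m + 1) * 2 * rtt_s,
        "persistent": transmit + 3 * rtt_s,
        "parallel non-persistent": transmit + (m // x + 1) * 2 * rtt_s,
    }


# Link rates used in the two figures.
RATES_BPS = {"28 Kbps": 28e3, "100 Kbps": 100e3, "1 Mbps": 1e6, "10 Mbps": 10e6}

if __name__ == "__main__":
    for rtt in (0.1, 1.0):
        print(f"RTT = {rtt} s")
        for label, rate in RATES_BPS.items():
            times = http_response_times(rate, rtt)
            row = ", ".join(f"{name}: {t:.2f} s" for name, t in times.items())
            print(f"  {label}: {row}")
```

Under these assumptions, at 28 Kbps and RTT = 100 msec the transmission term (about 15.7 s) dwarfs the RTT terms, so the three schemes differ by less than 2 s. At 10 Mbps and RTT = 1 sec the transmission term is about 0.04 s and the RTT terms decide the outcome.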


Chapter 3: Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
                                                                                                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                                                                                                        • Transport services and protocols
                                                                                                                                                                                                                                                        • Transport vs network layer
                                                                                                                                                                                                                                                        • Transport-layer protocols
                                                                                                                                                                                                                                                        • Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
                                                                                                                                                                                                                                                        • How demultiplexing works
                                                                                                                                                                                                                                                        • Connectionless demultiplexing
                                                                                                                                                                                                                                                        • Connectionless demux (cont)
                                                                                                                                                                                                                                                        • Connection-oriented demux
                                                                                                                                                                                                                                                        • Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
                                                                                                                                                                                                                                                        • Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
                                                                                                                                                                                                                                                        • UDP checksum
                                                                                                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                                                                                                        • Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
                                                                                                                                                                                                                                                        • Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
                                                                                                                                                                                                                                                        • Pipelined protocols
                                                                                                                                                                                                                                                        • Pipelined protocols
• Pipelining: increased utilization
                                                                                                                                                                                                                                                        • Go-Back-N
                                                                                                                                                                                                                                                        • GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
                                                                                                                                                                                                                                                        • Selective Repeat
• Selective repeat: sender, receiver windows
                                                                                                                                                                                                                                                        • Selective repeat
                                                                                                                                                                                                                                                        • Selective repeat in action
                                                                                                                                                                                                                                                        • Selective repeat dilemma
                                                                                                                                                                                                                                                        • Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
                                                                                                                                                                                                                                                        • More TCP Details
                                                                                                                                                                                                                                                        • Even More TCP Details
                                                                                                                                                                                                                                                        • TCP segment structure
• TCP seq. #'s and ACKs
                                                                                                                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                        • Example RTT estimation
                                                                                                                                                                                                                                                        • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                                                                                                        • TCP reliable data transfer
                                                                                                                                                                                                                                                        • TCP sender events
• TCP sender (simplified)
                                                                                                                                                                                                                                                        • TCP retransmission scenarios
                                                                                                                                                                                                                                                        • TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
                                                                                                                                                                                                                                                        • More on Sender Policies
                                                                                                                                                                                                                                                        • Fast Retransmit
                                                                                                                                                                                                                                                        • Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
                                                                                                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                                                                                                        • TCP Flow Control
                                                                                                                                                                                                                                                        • TCP Flow Control
                                                                                                                                                                                                                                                        • TCP segment structure
• TCP Flow control: how it works
                                                                                                                                                                                                                                                        • Technical Issue
                                                                                                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                                                                                                        • TCP Connection Management
                                                                                                                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                                                                                                                        • TCP Connection Management (cont)
                                                                                                                                                                                                                                                        • A few special cases
                                                                                                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                                                                                                        • Principles of Congestion Control
• Causes/costs of congestion: scenario 1
• Causes/costs of congestion: scenario 2
• Causes/costs of congestion: scenario 3
• Causes/costs of congestion: scenario 3
• Approaches towards congestion control
• Case study: ATM ABR congestion control
• Case study: ATM ABR congestion control
                                                                                                                                                                                                                                                        • Chapter 3 outline
                                                                                                                                                                                                                                                        • TCP Congestion Control
                                                                                                                                                                                                                                                        • TCP AIMD
                                                                                                                                                                                                                                                        • TCP Slow Start
                                                                                                                                                                                                                                                        • TCP Slow Start (more)
• Summary: TCP Congestion Control
                                                                                                                                                                                                                                                        • The Big Picture
                                                                                                                                                                                                                                                        • TCP sender congestion control
                                                                                                                                                                                                                                                        • TCP throughput
                                                                                                                                                                                                                                                        • TCP Futures
                                                                                                                                                                                                                                                        • TCP Fairness
• Why is TCP fair?
                                                                                                                                                                                                                                                        • Fairness (more)
                                                                                                                                                                                                                                                        • TCP Latency Modeling
                                                                                                                                                                                                                                                        • Fixed Congestion Window (W)
                                                                                                                                                                                                                                                        • Fixed congestion window (1)
                                                                                                                                                                                                                                                        • Fixed congestion window (2)
• TCP Latency Modeling: Slow Start (1)
• TCP Latency Modeling: Slow Start (2)
                                                                                                                                                                                                                                                        • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                        • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                        • HTTP Modeling
                                                                                                                                                                                                                                                        • Chapter 3 Summary


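TCP Latency Modeling (4)

Recall K = number of windows that cover object

How do we calculate K?

$$
\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
  &= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
  &= \min\{k : 2^k - 1 \ge O/S\} \\
  &= \min\{k : k \ge \log_2(O/S + 1)\} \\
  &= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}
$$

Calculation of Q, number of idles for infinite-size object, is similar.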
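As a rough check on the definition of K as the number of slow-start windows that cover the object, the sketch below computes K two ways: by adding up windows of 1, 2, 4, ... segments until O bits are covered, and by the closed form with the ceiling of a base-2 logarithm. The function names are illustrative and not from the slides; O and S are taken to be in bits, and the window is assumed to keep doubling with no loss.

```python
import math

def windows_by_search(object_bits, segment_bits):
    """Smallest k with S*(2^0 + 2^1 + ... + 2^(k-1)) >= O, found by direct summation."""
    k = 0
    covered = 0
    window = segment_bits  # first window carries one segment
    while covered < object_bits:
        covered += window
        window *= 2        # slow start: window doubles each round (assumed, no loss)
        k += 1
    return k

def windows_closed_form(object_bits, segment_bits):
    """K = ceil(log2(O/S + 1))."""
    return math.ceil(math.log2(object_bits / segment_bits + 1))

if __name__ == "__main__":
    for o in (5, 70, 100, 1000):
        print(o, windows_by_search(o, 10), windows_closed_form(o, 10))
```

Both functions should agree whenever O and S are positive; the search version mirrors the first line of the derivation, the closed form mirrors the last.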


HTTP Modeling

Assume a Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X is an integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
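The three response-time formulas can be compared numerically with a small sketch. It makes two simplifying assumptions that are not in the slides: the "sum of idle times" term is left out (so the results are lower bounds), and 1 Kbyte is taken as 1000 bytes. R is assumed to be the link transmission rate in bits per second, as in the latency model; the function name is illustrative.

```python
def http_response_times(O, R, RTT, M, X):
    """Response times in seconds, ignoring the slow-start idle-time term (assumption).

    O: object size in bits, R: link rate in bits/sec, RTT: round-trip time in sec,
    M: number of images, X: number of parallel connections.
    """
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmission = (M + 1) * O / R  # time to push all M+1 objects onto the link
    return {
        "non_persistent": transmission + (M + 1) * 2 * RTT,
        "persistent": transmission + 3 * RTT,
        "parallel_non_persistent": transmission + (M // X + 1) * 2 * RTT,
    }

if __name__ == "__main__":
    O = 5 * 1000 * 8  # 5 Kbytes in bits, assuming 1 Kbyte = 1000 bytes
    for rate in (28e3, 100e3, 1e6, 10e6):
        print(rate, http_response_times(O, rate, RTT=0.1, M=10, X=5))
```

With RTT = 100 msec, M = 10 and X = 5, the transmission term (M+1)O/R is the same in all three cases, so the variants differ only in how many RTTs they pay: 22, 3 and 6 respectively.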


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 20) versus link bandwidth (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For low bandwidth, connection & response time dominated by transmission time.

Persistent connections only give minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 70) versus link bandwidth (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time dominated by TCP establishment & slow start delays. Persistent connections now give important improvement, particularly in high delay•bandwidth networks.


Chapter 3 Summary

principles behind transport layer services:
  multiplexing, demultiplexing
  reliable data transfer
  flow control
  congestion control

instantiation and implementation in the Internet:
  UDP
  TCP

Next:
  leaving the network "edge" (application, transport layers)
  into the network "core"

                                                                                                                                                                                                                                                          • Rdt20 channel with bit errors
                                                                                                                                                                                                                                                          • rdt20 FSM specification
                                                                                                                                                                                                                                                          • rdt20 operation with no errors
                                                                                                                                                                                                                                                          • rdt20 error scenario
                                                                                                                                                                                                                                                          • rdt20 has a fatal flaw
                                                                                                                                                                                                                                                          • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                                                                                          • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                                                                                          • rdt21 discussion
                                                                                                                                                                                                                                                          • rdt22 a NAK-free protocol
                                                                                                                                                                                                                                                          • rdt22 sender receiver fragments
                                                                                                                                                                                                                                                          • rdt30 channels with errors and loss
                                                                                                                                                                                                                                                          • rdt30 sender
                                                                                                                                                                                                                                                          • rdt30 in action
                                                                                                                                                                                                                                                          • rdt30 in action
                                                                                                                                                                                                                                                          • Performance of rdt30
                                                                                                                                                                                                                                                          • rdt30 stop-and-wait operation
                                                                                                                                                                                                                                                          • Pipelined protocols
                                                                                                                                                                                                                                                          • Pipelined protocols
                                                                                                                                                                                                                                                          • Pipelining increased utilization
                                                                                                                                                                                                                                                          • Go-Back-N
                                                                                                                                                                                                                                                          • GBN Sender
                                                                                                                                                                                                                                                          • GBN sender extended FSM
                                                                                                                                                                                                                                                          • GBN receiver extended FSM
                                                                                                                                                                                                                                                          • More on receiver
                                                                                                                                                                                                                                                          • GBN inaction
                                                                                                                                                                                                                                                          • Selective Repeat
                                                                                                                                                                                                                                                          • Selective repeat sender receiver windows
                                                                                                                                                                                                                                                          • Selective repeat
                                                                                                                                                                                                                                                          • Selective repeat in action
                                                                                                                                                                                                                                                          • Selective repeat dilemma
                                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                                          • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                                                                                          • More TCP Details
                                                                                                                                                                                                                                                          • Even More TCP Details
                                                                                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                                                                                          • TCP seq rsquos and ACKs
                                                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                          • Example RTT estimation
                                                                                                                                                                                                                                                          • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                                          • TCP reliable data transfer
                                                                                                                                                                                                                                                          • TCP sender events
                                                                                                                                                                                                                                                          • TCP sender(simplified)
                                                                                                                                                                                                                                                          • TCP retransmission scenarios
                                                                                                                                                                                                                                                          • TCP retransmission scenarios (more)
                                                                                                                                                                                                                                                          • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                                                                          • More on Sender Policies
                                                                                                                                                                                                                                                          • Fast Retransmit
                                                                                                                                                                                                                                                          • Fast retransmit algorithm
                                                                                                                                                                                                                                                          • TCP GBN or Selective Repeat
                                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                                                                                          • TCP Flow Control
                                                                                                                                                                                                                                                          • TCP segment structure
                                                                                                                                                                                                                                                          • TCP Flow control how it works
                                                                                                                                                                                                                                                          • Technical Issue
                                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                                          • TCP Connection Management
                                                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                                                          • TCP Connection Management (cont)
                                                                                                                                                                                                                                                          • A few special cases
                                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                                          • Principles of Congestion Control
                                                                                                                                                                                                                                                          • Causescosts of congestion scenario 1
                                                                                                                                                                                                                                                          • Causescosts of congestion scenario 2
                                                                                                                                                                                                                                                          • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                                          • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                                          • Approaches towards congestion control
                                                                                                                                                                                                                                                          • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                          • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                          • Chapter 3 outline
                                                                                                                                                                                                                                                          • TCP Congestion Control
                                                                                                                                                                                                                                                          • TCP AIMD
                                                                                                                                                                                                                                                          • TCP Slow Start
                                                                                                                                                                                                                                                          • TCP Slow Start (more)
                                                                                                                                                                                                                                                          • Summary TCP Congestion Control
                                                                                                                                                                                                                                                          • The Big Picture
                                                                                                                                                                                                                                                          • TCP sender congestion control
                                                                                                                                                                                                                                                          • TCP throughput
                                                                                                                                                                                                                                                          • TCP Futures
                                                                                                                                                                                                                                                          • TCP Fairness
                                                                                                                                                                                                                                                          • Why is TCP fair
                                                                                                                                                                                                                                                          • Fairness (more)
                                                                                                                                                                                                                                                          • TCP Latency Modeling
                                                                                                                                                                                                                                                          • Fixed Congestion Window (W)
                                                                                                                                                                                                                                                          • Fixed congestion window (1)
                                                                                                                                                                                                                                                          • Fixed congestion window (2)
                                                                                                                                                                                                                                                          • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                                          • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                                          • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                          • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                          • HTTP Modeling
                                                                                                                                                                                                                                                          • Chapter 3 Summary

HTTP Modeling

Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)

Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times

Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times

Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
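
The three response-time formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: R is taken to be the link transmission rate in bits per second, the "sum of idle times" term is left out (so each result is a lower bound on the modeled response time), and the function names and sample values are hypothetical, not from the slides.

```python
def non_persistent(O, R, RTT, M):
    """M+1 TCP connections in series; each object costs O/R plus 2 RTT.

    O: object size in bits, R: link rate in bits/sec (assumed meaning),
    RTT: round-trip time in seconds, M: number of images.
    Idle times are NOT included (assumption of this sketch).
    """
    return (M + 1) * O / R + (M + 1) * 2 * RTT


def persistent(O, R, RTT, M):
    """2 RTT for the base file, then 1 RTT for all M images."""
    return (M + 1) * O / R + 3 * RTT


def parallel_non_persistent(O, R, RTT, M, X):
    """1 connection for the base file, then M/X rounds of X parallel connections."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    return (M + 1) * O / R + (M // X + 1) * 2 * RTT


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to exercise the functions.
    O, R, RTT, M, X = 8_000, 1_000_000, 0.05, 4, 2
    print("non-persistent         :", non_persistent(O, R, RTT, M))
    print("persistent             :", persistent(O, R, RTT, M))
    print("parallel non-persistent:", parallel_non_persistent(O, R, RTT, M, X))
```

All three functions share the same transmission term (M+1)O/R; they differ only in how many round trips are charged, which is what the comparison between the HTTP variants comes down to in this model.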

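HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (vertical axis, 0 to 20 seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, comparing non-persistent, persistent and parallel non-persistent HTTP]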

For low bandwidth, connection & response time are dominated by transmission time.
Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)
RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: bar chart of response time (0 to 70 seconds) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]

For larger RTT, response time is dominated by TCP establishment & slow start delays.
Persistent connections now give an important improvement, particularly in high delay•bandwidth networks.


Chapter 3: Summary

principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
flow control
congestion control

instantiation and implementation in the Internet:
UDP
TCP

Next:
leaving the network "edge" (application, transport layers)
into the network "core"

• Chapter 3: Transport Layer (last revised 16/03/05)
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont.)
• Connection-oriented demux
• Connection-oriented demux (cont.)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• rdt1.0: reliable transfer over a reliable channel
• rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN: Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
• Selective Repeat
• Selective repeat: sender, receiver windows
• Selective repeat
• Selective repeat in action
• Selective repeat: dilemma
• Chapter 3 outline
• TCP: Overview (RFCs 793, 1122, 1323, 2018, 2581)
• More TCP Details
• Even More TCP Details
• TCP segment structure
• TCP seq. #'s and ACKs
• TCP Round Trip Time and Timeout
• TCP Round Trip Time and Timeout
• Example RTT estimation
• TCP Round Trip Time and Timeout
• Chapter 3 outline
• TCP reliable data transfer
• TCP sender events
• TCP sender (simplified)
• TCP retransmission scenarios
• TCP retransmission scenarios (more)
• TCP ACK generation [RFC 1122, RFC 2581]
• More on Sender Policies
• Fast Retransmit
• Fast retransmit algorithm
• TCP: GBN or Selective Repeat?
• Chapter 3 outline
• TCP Flow Control
• TCP Flow Control
• TCP segment structure
• TCP Flow control: how it works
• Technical Issue
• Chapter 3 outline
• TCP Connection Management
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• TCP Connection Management (cont.)
• A few special cases
• Chapter 3 outline
• Principles of Congestion Control
• Causes/costs of congestion: scenario 1
                                                                                                                                                                                                                                                            • Causescosts of congestion scenario 2
                                                                                                                                                                                                                                                            • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                                            • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                                            • Approaches towards congestion control
                                                                                                                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                            • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                            • Chapter 3 outline
                                                                                                                                                                                                                                                            • TCP Congestion Control
                                                                                                                                                                                                                                                            • TCP AIMD
                                                                                                                                                                                                                                                            • TCP Slow Start
                                                                                                                                                                                                                                                            • TCP Slow Start (more)
                                                                                                                                                                                                                                                            • Summary TCP Congestion Control
                                                                                                                                                                                                                                                            • The Big Picture
                                                                                                                                                                                                                                                            • TCP sender congestion control
                                                                                                                                                                                                                                                            • TCP throughput
                                                                                                                                                                                                                                                            • TCP Futures
                                                                                                                                                                                                                                                            • TCP Fairness
                                                                                                                                                                                                                                                            • Why is TCP fair
                                                                                                                                                                                                                                                            • Fairness (more)
                                                                                                                                                                                                                                                            • TCP Latency Modeling
                                                                                                                                                                                                                                                            • Fixed Congestion Window (W)
                                                                                                                                                                                                                                                            • Fixed congestion window (1)
                                                                                                                                                                                                                                                            • Fixed congestion window (2)
                                                                                                                                                                                                                                                            • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                                            • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                                            • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                            • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                            • HTTP Modeling
                                                                                                                                                                                                                                                            • Chapter 3 Summary


HTTP Response time (in seconds)

RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 20) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent connections]

For low bandwidth, connection and response time are dominated by transmission time. Persistent connections give only a minor improvement over parallel connections.


HTTP Response time (in seconds)

RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5

[Figure: response time in seconds (axis 0 to 70) at link rates of 28 Kbps, 100 Kbps, 1 Mbps and 10 Mbps, for non-persistent, persistent and parallel non-persistent connections]

For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
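The two observations above can be checked with a rough calculation. The sketch below is not taken from the slides' plots. It assumes the usual response-time expressions for this model: (M+1)O/R plus (M+1)·2RTT for non-persistent, 3RTT for persistent, and (M/X+1)·2RTT for parallel non-persistent. It also assumes M is the number of embedded objects, X the number of parallel connections, and 1 Kbyte = 1000 bytes. It deliberately leaves out the slow-start idle-time term, so the numbers are lower bounds and will not match the plotted values exactly.

```python
def http_response_times(rate_bps, rtt_s, object_bits, m_objects, x_parallel):
    """Return (non_persistent, persistent, parallel_non_persistent) in seconds.

    Simplified model: every object has the same size, and the
    "sum of idle times" caused by slow start is ignored (assumption).
    """
    # Time to push the base page plus M objects onto the link.
    transmit = (m_objects + 1) * object_bits / rate_bps
    # One connection per object: 2 RTT (setup + request) each, in series.
    non_persistent = transmit + (m_objects + 1) * 2 * rtt_s
    # One connection: 2 RTT for the base page, 1 RTT for all the objects.
    persistent = transmit + 3 * rtt_s
    # Base page, then M/X rounds of X parallel connections.
    parallel = transmit + (m_objects / x_parallel + 1) * 2 * rtt_s
    return non_persistent, persistent, parallel


O_BITS = 5 * 1000 * 8  # O = 5 Kbytes, assuming 1 Kbyte = 1000 bytes

if __name__ == "__main__":
    for rtt in (0.1, 1.0):
        print(f"RTT = {rtt} s")
        for rate in (28e3, 100e3, 1e6, 10e6):
            np_t, p_t, par_t = http_response_times(rate, rtt, O_BITS, 10, 5)
            print(f"  R = {rate / 1e3:>7.0f} Kbps: "
                  f"non-persistent {np_t:6.2f}, "
                  f"persistent {p_t:6.2f}, "
                  f"parallel {par_t:6.2f}")
```

With RTT = 100 msec and R = 28 Kbps, the transmission term (about 15.7 seconds) dwarfs the RTT terms, so the three variants are close. With RTT = 1 sec and R = 10 Mbps the transmission term is negligible and the RTT terms decide the outcome.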


Chapter 3: Summary

principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control

instantiation and implementation in the Internet:
• UDP
• TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

• Chapter 3: Transport Layer, last revised 16/03/05
• Chapter 3 outline
• Transport services and protocols
• Transport vs. network layer
• Transport-layer protocols
• Chapter 3 outline
• Multiplexing/demultiplexing
• Multiplexing/demultiplexing
• How demultiplexing works
• Connectionless demultiplexing
• Connectionless demux (cont)
• Connection-oriented demux
• Connection-oriented demux (cont)
• Connection-oriented demux: Threaded Web Server
• Chapter 3 outline
• UDP: User Datagram Protocol [RFC 768]
• UDP: more
• UDP checksum
• Chapter 3 outline
• Principles of Reliable data transfer
• Reliable data transfer: getting started
• Reliable data transfer: getting started
• Incremental Improvements
• Rdt1.0: reliable transfer over a reliable channel
• Rdt2.0: channel with bit errors
• rdt2.0: FSM specification
• rdt2.0: operation with no errors
• rdt2.0: error scenario
• rdt2.0 has a fatal flaw
• rdt2.1: sender, handles garbled ACK/NAKs
• rdt2.1: receiver, handles garbled ACK/NAKs
• rdt2.1: discussion
• rdt2.2: a NAK-free protocol
• rdt2.2: sender, receiver fragments
• rdt3.0: channels with errors and loss
• rdt3.0 sender
• rdt3.0 in action
• rdt3.0 in action
• Performance of rdt3.0
• rdt3.0: stop-and-wait operation
• Pipelined protocols
• Pipelined protocols
• Pipelining: increased utilization
• Go-Back-N
• GBN Sender
• GBN: sender extended FSM
• GBN: receiver extended FSM
• More on receiver
• GBN in action
                                                                                                                                                                                                                                                              • Selective Repeat
                                                                                                                                                                                                                                                              • Selective repeat sender receiver windows
                                                                                                                                                                                                                                                              • Selective repeat
                                                                                                                                                                                                                                                              • Selective repeat in action
                                                                                                                                                                                                                                                              • Selective repeat dilemma
                                                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                                                              • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                                                                                              • More TCP Details
                                                                                                                                                                                                                                                              • Even More TCP Details
                                                                                                                                                                                                                                                              • TCP segment structure
                                                                                                                                                                                                                                                              • TCP seq rsquos and ACKs
                                                                                                                                                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                              • Example RTT estimation
                                                                                                                                                                                                                                                              • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                                                              • TCP reliable data transfer
                                                                                                                                                                                                                                                              • TCP sender events
                                                                                                                                                                                                                                                              • TCP sender(simplified)
                                                                                                                                                                                                                                                              • TCP retransmission scenarios
                                                                                                                                                                                                                                                              • TCP retransmission scenarios (more)
                                                                                                                                                                                                                                                              • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                                                                              • More on Sender Policies
                                                                                                                                                                                                                                                              • Fast Retransmit
                                                                                                                                                                                                                                                              • Fast retransmit algorithm
                                                                                                                                                                                                                                                              • TCP GBN or Selective Repeat
                                                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                                                              • TCP Flow Control
                                                                                                                                                                                                                                                              • TCP Flow Control
                                                                                                                                                                                                                                                              • TCP segment structure
                                                                                                                                                                                                                                                              • TCP Flow control how it works
                                                                                                                                                                                                                                                              • Technical Issue
                                                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                                                              • TCP Connection Management
                                                                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                                                                              • TCP Connection Management (cont)
                                                                                                                                                                                                                                                              • A few special cases
                                                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                                                              • Principles of Congestion Control
                                                                                                                                                                                                                                                              • Causescosts of congestion scenario 1
                                                                                                                                                                                                                                                              • Causescosts of congestion scenario 2
                                                                                                                                                                                                                                                              • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                                              • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                                              • Approaches towards congestion control
                                                                                                                                                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                              • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                              • Chapter 3 outline
                                                                                                                                                                                                                                                              • TCP Congestion Control
                                                                                                                                                                                                                                                              • TCP AIMD
                                                                                                                                                                                                                                                              • TCP Slow Start
                                                                                                                                                                                                                                                              • TCP Slow Start (more)
                                                                                                                                                                                                                                                              • Summary TCP Congestion Control
                                                                                                                                                                                                                                                              • The Big Picture
                                                                                                                                                                                                                                                              • TCP sender congestion control
                                                                                                                                                                                                                                                              • TCP throughput
                                                                                                                                                                                                                                                              • TCP Futures
                                                                                                                                                                                                                                                              • TCP Fairness
                                                                                                                                                                                                                                                              • Why is TCP fair
                                                                                                                                                                                                                                                              • Fairness (more)
                                                                                                                                                                                                                                                              • TCP Latency Modeling
                                                                                                                                                                                                                                                              • Fixed Congestion Window (W)
                                                                                                                                                                                                                                                              • Fixed congestion window (1)
                                                                                                                                                                                                                                                              • Fixed congestion window (2)
                                                                                                                                                                                                                                                              • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                                              • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                                              • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                              • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                              • HTTP Modeling
                                                                                                                                                                                                                                                              • Chapter 3 Summary

[Figure: HTTP response time (in seconds) at link rates of 28 Kbps, 100 Kbps, 1 Mbps, and 10 Mbps for non-persistent, persistent, and parallel non-persistent connections. RTT = 1 sec, O = 5 Kbytes, M = 10, X = 5.]

For larger RTT, response time is dominated by TCP establishment and slow-start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
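The effect of the per-connection RTT cost can be illustrated with a small calculation. The sketch below is a simplified model and not the exact computation behind the figure: it assumes a page made of one base file plus M images, each of O bits, fetched over a link of rate R, and it deliberately ignores the slow-start idle time, so every value is a lower bound. The function name `http_response_times`, its parameter names, and the choice 1 Kbyte = 1000 bytes are assumptions made for illustration.

```python
def http_response_times(rate_bps, rtt_s, object_bits, num_images, parallel):
    """Return (non_persistent, persistent, parallel_non_persistent) in seconds.

    Simplified sketch: slow-start idle time is ignored, so each value
    is a lower bound on the modeled response time.
    """
    # Time to transmit the base page plus M images of equal size.
    transmit = (num_images + 1) * object_bits / rate_bps

    # Non-persistent: M+1 connections in series, each costing 2 RTT
    # (one for TCP setup, one for the request/response).
    non_persistent = transmit + (num_images + 1) * 2 * rtt_s

    # Persistent with pipelining (assumed): 2 RTT for the base page,
    # then 1 RTT to request and receive all M images.
    persistent = transmit + 3 * rtt_s

    # Parallel non-persistent: one connection for the base page, then
    # ceil(M / X) rounds of X parallel connections, 2 RTT per round.
    rounds = -(-num_images // parallel)  # ceiling division
    parallel_np = transmit + (rounds + 1) * 2 * rtt_s

    return non_persistent, persistent, parallel_np


if __name__ == "__main__":
    O_BITS = 5 * 1000 * 8  # 5 Kbytes, assuming 1 Kbyte = 1000 bytes
    rates = [("28 Kbps", 28e3), ("100 Kbps", 100e3),
             ("1 Mbps", 1e6), ("10 Mbps", 10e6)]
    for label, rate in rates:
        np_t, p_t, par_t = http_response_times(rate, 1.0, O_BITS, 10, 5)
        print(f"{label:>8}: non-persistent {np_t:6.2f} s, "
              f"persistent {p_t:6.2f} s, parallel {par_t:6.2f} s")
```

With RTT = 1 sec, the RTT terms (22, 3, and 6 seconds respectively) do not shrink as the link rate grows, which is why the persistent connection keeps its advantage even at 10 Mbps in this sketch.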

Chapter 3: Summary

• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet:
  • UDP
  • TCP

Next:
• leaving the network "edge" (application, transport layers)
• into the network "core"

                                                                                                                                                                                                                                                                • Principles of Reliable data transfer
                                                                                                                                                                                                                                                                • Reliable data transfer getting started
                                                                                                                                                                                                                                                                • Reliable data transfer getting started
                                                                                                                                                                                                                                                                • Incremental Improvements
                                                                                                                                                                                                                                                                • Rdt10 reliable transfer over a reliable channel
                                                                                                                                                                                                                                                                • Rdt20 channel with bit errors
                                                                                                                                                                                                                                                                • rdt20 FSM specification
                                                                                                                                                                                                                                                                • rdt20 operation with no errors
                                                                                                                                                                                                                                                                • rdt20 error scenario
                                                                                                                                                                                                                                                                • rdt20 has a fatal flaw
                                                                                                                                                                                                                                                                • rdt21 sender handles garbled ACKNAKs
                                                                                                                                                                                                                                                                • rdt21 receiver handles garbled ACKNAKs
                                                                                                                                                                                                                                                                • rdt21 discussion
                                                                                                                                                                                                                                                                • rdt22 a NAK-free protocol
                                                                                                                                                                                                                                                                • rdt22 sender receiver fragments
                                                                                                                                                                                                                                                                • rdt30 channels with errors and loss
                                                                                                                                                                                                                                                                • rdt30 sender
                                                                                                                                                                                                                                                                • rdt30 in action
                                                                                                                                                                                                                                                                • rdt30 in action
                                                                                                                                                                                                                                                                • Performance of rdt30
                                                                                                                                                                                                                                                                • rdt30 stop-and-wait operation
                                                                                                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                                                                                                • Pipelined protocols
                                                                                                                                                                                                                                                                • Pipelining increased utilization
                                                                                                                                                                                                                                                                • Go-Back-N
                                                                                                                                                                                                                                                                • GBN Sender
                                                                                                                                                                                                                                                                • GBN sender extended FSM
                                                                                                                                                                                                                                                                • GBN receiver extended FSM
                                                                                                                                                                                                                                                                • More on receiver
                                                                                                                                                                                                                                                                • GBN inaction
                                                                                                                                                                                                                                                                • Selective Repeat
                                                                                                                                                                                                                                                                • Selective repeat sender receiver windows
                                                                                                                                                                                                                                                                • Selective repeat
                                                                                                                                                                                                                                                                • Selective repeat in action
                                                                                                                                                                                                                                                                • Selective repeat dilemma
                                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                                • TCP Overview RFCs 793 1122 1323 2018 2581
                                                                                                                                                                                                                                                                • More TCP Details
                                                                                                                                                                                                                                                                • Even More TCP Details
                                                                                                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                                                                                                • TCP seq rsquos and ACKs
                                                                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                                • Example RTT estimation
                                                                                                                                                                                                                                                                • TCP Round Trip Time and Timeout
                                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                                • TCP reliable data transfer
                                                                                                                                                                                                                                                                • TCP sender events
                                                                                                                                                                                                                                                                • TCP sender(simplified)
                                                                                                                                                                                                                                                                • TCP retransmission scenarios
                                                                                                                                                                                                                                                                • TCP retransmission scenarios (more)
                                                                                                                                                                                                                                                                • TCP ACK generation [RFC 1122 RFC 2581]
                                                                                                                                                                                                                                                                • More on Sender Policies
                                                                                                                                                                                                                                                                • Fast Retransmit
                                                                                                                                                                                                                                                                • Fast retransmit algorithm
                                                                                                                                                                                                                                                                • TCP GBN or Selective Repeat
                                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                                                                                                • TCP Flow Control
                                                                                                                                                                                                                                                                • TCP segment structure
                                                                                                                                                                                                                                                                • TCP Flow control how it works
                                                                                                                                                                                                                                                                • Technical Issue
                                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                                • TCP Connection Management
                                                                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                                                                • TCP Connection Management (cont)
                                                                                                                                                                                                                                                                • A few special cases
                                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                                • Principles of Congestion Control
                                                                                                                                                                                                                                                                • Causescosts of congestion scenario 1
                                                                                                                                                                                                                                                                • Causescosts of congestion scenario 2
                                                                                                                                                                                                                                                                • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                                                • Causescosts of congestion scenario 3
                                                                                                                                                                                                                                                                • Approaches towards congestion control
                                                                                                                                                                                                                                                                • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                                • Case study ATM ABR congestion control
                                                                                                                                                                                                                                                                • Chapter 3 outline
                                                                                                                                                                                                                                                                • TCP Congestion Control
                                                                                                                                                                                                                                                                • TCP AIMD
                                                                                                                                                                                                                                                                • TCP Slow Start
                                                                                                                                                                                                                                                                • TCP Slow Start (more)
                                                                                                                                                                                                                                                                • Summary TCP Congestion Control
                                                                                                                                                                                                                                                                • The Big Picture
                                                                                                                                                                                                                                                                • TCP sender congestion control
                                                                                                                                                                                                                                                                • TCP throughput
                                                                                                                                                                                                                                                                • TCP Futures
                                                                                                                                                                                                                                                                • TCP Fairness
                                                                                                                                                                                                                                                                • Why is TCP fair
                                                                                                                                                                                                                                                                • Fairness (more)
                                                                                                                                                                                                                                                                • TCP Latency Modeling
                                                                                                                                                                                                                                                                • Fixed Congestion Window (W)
                                                                                                                                                                                                                                                                • Fixed congestion window (1)
                                                                                                                                                                                                                                                                • Fixed congestion window (2)
                                                                                                                                                                                                                                                                • TCP Latency Modeling Slow Start (1)
                                                                                                                                                                                                                                                                • TCP Latency Modeling Slow Start (2)
                                                                                                                                                                                                                                                                • TCP Latency Modeling (3)
                                                                                                                                                                                                                                                                • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                                • HTTP Modeling
                                                                                                                                                                                                                                                                • Chapter 3 Summary

Chapter 3 Summary

• principles behind transport layer services: multiplexing/demultiplexing, reliable data transfer, flow control, congestion control
• instantiation and implementation in the Internet: UDP, TCP

Next: leaving the network "edge" (application, transport layers), into the network "core"

                                                                                                                                                                                                                                                                  • TCP Latency Modeling (4)
                                                                                                                                                                                                                                                                  • HTTP Modeling
                                                                                                                                                                                                                                                                  • Chapter 3 Summary
